Budget Management Guide

Budget Management Overview

The AI Security Gateway provides comprehensive budget management features to control LLM API costs. Budget limits can be set at multiple levels (proxy, user, user group, organization) with automated enforcement, warning notifications, and detailed cost tracking.

Key Features

  • Monthly Budgets: Set monthly spending limits with automatic reset
  • Multi-Level Budgets: Configure budgets for proxies, users, user groups, and the organization
  • Warning Thresholds: Email notifications at 50%, 75%, 90%, and 100% usage
  • Automatic Enforcement: Block requests when budget is exceeded
  • Cost Attribution: Track costs per user, model, and proxy
  • Real-Time Tracking: Monitor spending in real time with WebSocket updates
  • Budget Reset: Automatic monthly reset or manual reset on demand
  • Historical Reports: View budget usage trends and patterns

Budget Hierarchy

Budgets are enforced in a hierarchical order:

  1. Organization Budget - Total spending limit for the entire organization (planned; still in development)
  2. User Group Budget - Spending limit for a specific user group (e.g., "Engineering Team")
  3. User Budget - Spending limit for an individual user (planned; still in development)
  4. Proxy Budget - Spending limit for a specific LLM proxy (planned; still in development)

Enforcement Logic: A request is allowed only if ALL applicable budgets have remaining quota. If any budget is exceeded, the request is blocked.

Example:

  • Organization budget: $10,000/month
  • Engineering team budget: $3,000/month
  • John Doe's user budget: $500/month
  • OpenAI proxy budget: $5,000/month

When John (Engineering team) makes a request through the OpenAI proxy, ALL four budgets are checked. The request is blocked if ANY budget is exceeded.
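
The all-budgets-must-pass rule can be sketched in a few lines of Python. This is a minimal illustration, not the Gateway's implementation: the `Budget` class, the `exceeded_budgets` function, and the budget names are hypothetical. The limits are taken from the example above; the usage figures are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical budget record; the field names are assumptions."""
    name: str
    limit: float  # monthly limit in USD
    used: float   # spending so far this month in USD


def exceeded_budgets(budgets: list[Budget], estimated_cost: float) -> list[str]:
    """Return the names of budgets that cannot cover the estimated cost."""
    return [b.name for b in budgets if b.used + estimated_cost > b.limit]


# Limits from the example above; usage values are made up for illustration.
johns_budgets = [
    Budget("organization", 10_000.00, 6_200.00),
    Budget("user_group:engineering", 3_000.00, 2_100.00),
    Budget("user:john.doe", 500.00, 498.23),
    Budget("proxy:openai", 5_000.00, 3_900.00),
]

blockers = exceeded_budgets(johns_budgets, estimated_cost=2.45)
allowed = not blockers  # the request passes only if no budget blocks it
print(allowed, blockers)
```

Returning the list of blocking budgets, rather than a bare yes/no, makes it easier to report which level caused the block.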


Budget Enforcement

How Budgets Are Enforced

  1. Pre-Request Check: Before proxying an LLM request, the Gateway checks all applicable budgets
  2. Budget Hierarchy: Organization → User Group → User → Proxy budgets are all checked
  3. Cost Estimation: The request's cost is estimated from model pricing (prompt tokens × input rate + completion tokens × output rate)
  4. Blocking Decision: If ANY budget would be exceeded, the request is blocked
  5. Usage Recording: Successful requests update all applicable budget usage counters

Cost Calculation

The Gateway uses model-specific pricing to estimate costs:

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4-turbo | $10.00 | $30.00 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-3-5-haiku | $0.80 | $4.00 |

Custom Pricing: Add custom model pricing via Settings → System → Model Pricing.
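
As a worked example of the formula (prompt tokens × input rate + completion tokens × output rate), the sketch below applies the listed prices, which are quoted per 1M tokens. The `PRICING` dictionary and `estimate_cost` helper are hypothetical names used for illustration; how the Gateway stores pricing internally is not described here.

```python
# Prices in USD per 1M tokens: (input rate, output rate), from the pricing table.
PRICING = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4-turbo": (10.00, 30.00),
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-5-haiku": (0.80, 4.00),
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate a request's cost in USD (hypothetical helper)."""
    if model not in PRICING:
        # Assumption: fail loudly here rather than silently pricing at $0.
        raise KeyError(f"no pricing configured for model {model!r}")
    input_rate, output_rate = PRICING[model]
    # Rates are per 1M tokens, so divide the weighted sum by 1,000,000.
    return (prompt_tokens * input_rate + completion_tokens * output_rate) / 1_000_000


cost = estimate_cost("gpt-4o", prompt_tokens=1_200, completion_tokens=400)
print(f"${cost:.4f}")  # 1,200 × $2.50 + 400 × $10.00 per 1M tokens = $0.0070
```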

Blocked Request Response

When a budget is exceeded, the Gateway returns:

```json
{
  "error": {
    "message": "Budget limit exceeded",
    "type": "budget_exceeded",
    "code": 429,
    "details": {
      "budget_type": "user",
      "budget_limit": 500.00,
      "current_usage": 498.23,
      "estimated_cost": 2.45,
      "reset_date": "2026-02-01T00:00:00Z"
    }
  }
}
```
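
A client can use the `details` object to explain the block to a user. The sketch below parses the sample body shown above and reports how far the estimated cost overshoots the remaining budget. The `describe_block` function is a hypothetical helper, and it assumes every blocked response carries the fields shown in the sample.

```python
import json
from datetime import datetime

# Sample body copied from the blocked-request response shown above.
SAMPLE_BODY = """
{
  "error": {
    "message": "Budget limit exceeded",
    "type": "budget_exceeded",
    "code": 429,
    "details": {
      "budget_type": "user",
      "budget_limit": 500.00,
      "current_usage": 498.23,
      "estimated_cost": 2.45,
      "reset_date": "2026-02-01T00:00:00Z"
    }
  }
}
"""


def describe_block(raw_body: str) -> str:
    """Summarize a budget_exceeded error for display (hypothetical helper)."""
    error = json.loads(raw_body)["error"]
    if error.get("type") != "budget_exceeded":
        raise ValueError("not a budget_exceeded error")
    details = error["details"]
    remaining = details["budget_limit"] - details["current_usage"]
    shortfall = details["estimated_cost"] - remaining
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    reset = datetime.fromisoformat(details["reset_date"].replace("Z", "+00:00"))
    return (
        f"{details['budget_type']} budget short by ${shortfall:.2f}; "
        f"resets {reset:%Y-%m-%d %H:%M} UTC"
    )


print(describe_block(SAMPLE_BODY))
```

In the sample, $1.77 of the user budget remains while the request is estimated at $2.45, so the request is blocked even though the budget is not yet fully spent.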

Get Budget Status via API

```bash
# Get organization budget status
curl -X GET http://localhost:8080/api/v1/budgets/organization/status \
  -H "Authorization: Bearer YOUR_TOKEN"

# Get user budget status
curl -X GET http://localhost:8080/api/v1/budgets/users/1/status \
  -H "Authorization: Bearer YOUR_TOKEN"

# Get usage breakdown (URL quoted so the shell does not interpret "?")
curl -X GET "http://localhost:8080/api/v1/budgets/organization/breakdown?period=this_month" \
  -H "Authorization: Bearer YOUR_TOKEN"
```

Budget Alerts

Warning Thresholds

Budget warnings are sent when usage reaches configured thresholds:

  • 50% Warning: Early notification to monitor usage
  • 75% Warning: Increased awareness of budget consumption
  • 90% Warning: Critical warning - budget nearly exhausted
  • 100% Alert: Budget exceeded - requests are now blocked
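
One way to decide which warnings to send is to compare usage before and after each recorded request and notify only for thresholds crossed in between, so each threshold fires once per period. The sketch below illustrates that idea; `newly_crossed` is a hypothetical function, and the Gateway's actual notification logic may differ.

```python
THRESHOLDS = (0.50, 0.75, 0.90, 1.00)  # the four warning levels listed above


def newly_crossed(previous_usage: float, current_usage: float, limit: float) -> list[float]:
    """Return the thresholds crossed between two usage readings."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    previous_ratio = previous_usage / limit
    current_ratio = current_usage / limit
    # A threshold counts as crossed when usage was below it and is now at or above it.
    return [t for t in THRESHOLDS if previous_ratio < t <= current_ratio]


# Usage moves from $7,400.00 to $7,523.45 against a $10,000 limit.
print(newly_crossed(7_400.00, 7_523.45, 10_000.00))
```

A single large request can cross several thresholds at once, in which case the function returns all of them.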

Email Notifications (WIP)

Warning emails include:

  • Budget Type: Which budget reached threshold (org, user group, user, proxy)
  • Current Usage: Total spending this month
  • Remaining Budget: Amount remaining
  • Threshold: Which threshold was reached (50%, 75%, 90%, 100%)
  • Top Consumers: Users/models consuming the most budget
  • Recommendations: Suggested actions to manage costs
  • Reset Date: When budget will automatically reset

Example Email:

Subject: ⚠️ Budget Alert: 75% of Organization Budget Consumed

Your organization has reached 75% of the monthly budget limit.

Budget Details:
- Monthly Limit: $10,000.00
- Current Usage: $7,523.45
- Remaining: $2,476.55
- Reset Date: 2026-02-01

Top Consumers:
1. gpt-4o: $3,245.12 (43%)
2. claude-3-5-sonnet: $2,987.54 (40%)
3. gpt-4-turbo: $1,290.79 (17%)

Recommendations:
- Review high-cost model usage
- Consider switching to more cost-effective models
- Set user-level budgets to distribute costs

View Details: https://gateway.example.com/budgets

Configuring Notifications

  1. Navigate to Operations → Budget Management
  2. Select budget (organization, user group, user, or proxy)
  3. Click Edit Budget
  4. Update Notification Emails field
  5. Toggle warning thresholds (50%, 75%, 90%, 100%)
  6. Click Save Budget

Budget Reset (WIP)

Automatic Monthly Reset

Budgets automatically reset on the first day of each month at 00:00 UTC.

Reset Behavior:

  • Usage Counter: Reset to $0.00
  • Budget Limit: Unchanged (carries over)
  • Warning Thresholds: Re-enabled
  • Blocked Requests: Automatically unblocked
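
The reset instant (first day of the next month, 00:00 UTC) can be computed as follows. This is a sketch of the date arithmetic only; `next_reset` is a hypothetical helper, not a Gateway function.

```python
from datetime import datetime, timedelta, timezone


def next_reset(now: datetime) -> datetime:
    """Return the first day of the following month at 00:00 UTC."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    now = now.astimezone(timezone.utc)  # the reset is defined in UTC
    if now.month == 12:
        year, month = now.year + 1, 1
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


mid_january = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
print(next_reset(mid_january).isoformat())

# 23:30 on Jan 31 at UTC-5 is already Feb 1 in UTC, so the next reset is Mar 1.
late_local = datetime(2026, 1, 31, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
print(next_reset(late_local).isoformat())
```

Converting to UTC before picking the month matters for callers in other time zones, as the second example shows.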

Manual Budget Reset

Administrators can manually reset budgets at any time.

Via Web Interface

  1. Navigate to Operations → Budget Management
  2. Select budget to reset
  3. Click Reset Budget (⟲ icon)
  4. Confirm reset
  5. Budget usage immediately resets to $0.00

Via API

```bash
# Reset organization budget
curl -X POST http://localhost:8080/api/v1/budgets/organization/reset \
  -H "Authorization: Bearer YOUR_TOKEN"

# Reset user budget
curl -X POST http://localhost:8080/api/v1/budgets/users/1/reset \
  -H "Authorization: Bearer YOUR_TOKEN"

# Reset all budgets (requires admin)
curl -X POST http://localhost:8080/api/v1/budgets/reset-all \
  -H "Authorization: Bearer YOUR_TOKEN"
```

Warning: Manual reset clears usage history for the current period. Use with caution.


Budget Reports (WIP - not implemented yet)

Generating Reports

Generate Budget Report via Web Interface

  1. Navigate to Operations → Budget Management → Reports
  2. Configure report parameters:
    • Report Type: Summary, Detailed, Cost Breakdown
    • Time Period: This Month, Last Month, Last 90 Days, Custom
    • Budget Scope: Organization, User Group, User, Proxy, All
    • Format: PDF, CSV, JSON
  3. Click Generate Report
  4. Download report file

Generate Budget Report via API

```bash
# Generate monthly summary report
curl -X GET "http://localhost:8080/api/v1/budgets/reports/summary?period=this_month&format=pdf" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -o budget-report.pdf

# Generate detailed cost breakdown (CSV)
curl -X GET "http://localhost:8080/api/v1/budgets/reports/breakdown?period=last_month&format=csv" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -o budget-breakdown.csv
```

Report Contents

Summary Report includes:

  • Total spending for period
  • Budget utilization percentage
  • Cost breakdown by model
  • Cost breakdown by user
  • Cost breakdown by proxy
  • Top 10 most expensive requests
  • Budget alert history

Detailed Report includes all summary data plus:

  • Individual request details
  • Token usage per request
  • Cost per request
  • User attribution
  • Timestamp and duration
  • Model and parameters used

Cost Breakdown Report includes:

  • Hourly/daily/monthly aggregations
  • Cost trends and forecasts
  • Budget vs. actual spending
  • Efficiency metrics
  • Cost optimization recommendations

Budget Best Practices

Setting Appropriate Limits

  1. Start Conservative: Begin with lower budgets and increase as needed
  2. Monitor Usage: Review first month's usage to establish baseline
  3. Add Buffer: Set budgets 20-30% above expected usage
  4. Tiered Approach: Use hierarchy (org > group > user) for granular control

Cost Optimization

  1. Use Cost-Effective Models: Switch to mini/haiku models for simple tasks
  2. Implement Caching: Use semantic caching to reduce redundant requests
  3. Optimize Prompts: Reduce prompt length without sacrificing quality
  4. Set Max Tokens: Limit completion length to prevent runaway costs
  5. User Education: Train users on cost-conscious LLM usage

Monitoring and Alerts

  1. Enable All Thresholds: Configure 50%, 75%, 90%, 100% alerts
  2. Multiple Recipients: Send alerts to team leads, finance, and admins
  3. Regular Reviews: Weekly review of top consumers
  4. Trend Analysis: Monitor month-over-month spending trends
  5. Anomaly Detection: Investigate sudden spending spikes

Budget Troubleshooting

Budget Not Enforcing

Problem: Requests continue even after budget is exceeded.

Solution:

  1. Verify budget is Enabled (toggle is ON)
  2. Check Block at 100% is enabled
  3. Confirm user/proxy is within budget scope
  4. Review audit logs for budget check results:
    ```bash
    curl -X GET "http://localhost:8080/api/v1/audit/logs?event_type=budget_check" \
      -H "Authorization: Bearer YOUR_TOKEN"
    ```
  5. Verify model pricing is configured (missing pricing = $0 cost = no enforcement)

Incorrect Cost Calculations

Problem: Reported costs don't match actual LLM provider bills.

Solution:

  1. Verify model pricing is up to date: Settings → System → Model Pricing
  2. Check for custom models with missing pricing
  3. Enable Langfuse Integration for accurate token tracking
  4. Compare token counts with provider dashboard
  5. Review request logs for discrepancies:
    ```bash
    curl -X GET "http://localhost:8080/api/v1/proxies/1/logs?include_tokens=true" \
      -H "Authorization: Bearer YOUR_TOKEN"
    ```

Missing Budget Alerts

Problem: Not receiving budget warning emails.

Solution:

  1. Verify email integration is configured: Settings → Integrations → Email
  2. Test email with Send Test Email button
  3. Check spam/junk folders
  4. Verify notification emails are correct in budget configuration
  5. Check budget warning thresholds are enabled
  6. Review audit logs for email delivery status:
    ```bash
    curl -X GET "http://localhost:8080/api/v1/audit/logs?event_type=email_sent" \
      -H "Authorization: Bearer YOUR_TOKEN"
    ```

Budget Not Resetting

Problem: Budget usage didn't reset on the first of the month.

Solution:

  1. Verify Gateway was running at midnight UTC on the 1st
  2. Check system logs for reset job execution:
    ```bash
    journalctl -u ai-security-gateway | grep "budget reset"
    ```
  3. Manually trigger reset via API:
    ```bash
    curl -X POST http://localhost:8080/api/v1/budgets/organization/reset \
      -H "Authorization: Bearer YOUR_TOKEN"
    ```
  4. Check that the system timezone is set to UTC
  5. Review the cron job or systemd timer configuration if the reset is scheduled externally

Hierarchical Budget Conflicts

Problem: Unable to determine which budget is blocking requests.

Solution:

  1. Enable verbose logging: Settings → System → Log Level = Debug
  2. Review request logs with budget check details:
    ```bash
    curl -X GET "http://localhost:8080/api/v1/proxies/1/logs?include_budget_check=true" \
      -H "Authorization: Bearer YOUR_TOKEN"
    ```
  3. Check budget status for all applicable levels:
    • Organization: /api/v1/budgets/organization/status
    • User Group: /api/v1/budgets/user-groups/{id}/status
    • User: /api/v1/budgets/users/{id}/status
    • Proxy: /api/v1/budgets/proxies/{id}/status
  4. Look for the first budget whose remaining_budget is less than the request's estimated cost (this includes any budget with remaining_budget <= 0)

Advanced Features

Budget Forecasting

The Gateway can forecast when budgets will be exceeded based on current usage trends.

Via Web Interface

  1. Navigate to Operations → Budget Management → Forecasting
  2. Select budget to forecast
  3. View forecast chart showing:
    • Current usage trend
    • Projected spending
    • Estimated exhaustion date
    • Recommended budget adjustments
  4. Configure alerts for projected overruns

Get Budget Forecast via API

```bash
curl -X GET http://localhost:8080/api/v1/budgets/organization/forecast \
  -H "Authorization: Bearer YOUR_TOKEN"
```

Response:

```json
{
  "current_usage": 5234.12,
  "budget_limit": 10000.00,
  "days_remaining": 18,
  "projected_total": 11245.67,
  "projected_overrun": 1245.67,
  "projected_exhaustion_date": "2026-01-24T00:00:00Z",
  "daily_average": 290.78,
  "recommendations": [
    "Current pace will exceed budget by $1,245.67",
    "Consider increasing budget to $12,000/month",
    "Reduce gpt-4-turbo usage by 30% to stay within budget"
  ]
}
```
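
The forecasting method itself is not documented here. As a rough illustration of how such a projection could be derived, the sketch below extends the average daily spend to the end of the month. This linear model is an assumption, `linear_forecast` is a hypothetical function, and it is not expected to reproduce the sample figures above, which may come from a different trend model.

```python
import calendar
import math
from datetime import date, timedelta


def linear_forecast(current_usage: float, budget_limit: float, today: date) -> dict:
    """Project month-end spending from the average daily spend (assumed method)."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    days_elapsed = today.day  # treats today as a completed day
    days_remaining = days_in_month - days_elapsed
    daily_average = current_usage / days_elapsed
    projected_total = current_usage + daily_average * days_remaining

    exhaustion_date = None
    if daily_average > 0 and projected_total > budget_limit:
        # Days until the remaining budget is spent at the current pace.
        days_to_limit = max((budget_limit - current_usage) / daily_average, 0)
        exhaustion_date = today + timedelta(days=math.ceil(days_to_limit))

    return {
        "daily_average": round(daily_average, 2),
        "days_remaining": days_remaining,
        "projected_total": round(projected_total, 2),
        "projected_overrun": round(max(projected_total - budget_limit, 0), 2),
        "projected_exhaustion_date": exhaustion_date,
    }


# Illustrative input: $6,000 spent by January 15 against a $10,000 limit.
forecast = linear_forecast(6_000.00, 10_000.00, date(2026, 1, 15))
print(forecast)
```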

Budget Sharing

Allow user groups to share a common budget pool.

  1. Navigate to Access Control → User Groups
  2. Select multiple user groups (Ctrl+Click or Cmd+Click)
  3. Click Share Budget
  4. Configure shared budget:
    • Shared Limit: Total limit for all selected groups
    • Allocation: Equal, Weighted, or Custom per group
    • Overflow: Allow groups to borrow from others
  5. Click Create Shared Budget

Budget Rollover

Unused budget can roll over to the next month (up to a maximum).

  1. Navigate to Operations → Budget Management → Rollover Settings
  2. Enable Budget Rollover
  3. Configure:
    • Max Rollover: Maximum amount to carry over (e.g., 20% of monthly limit)
    • Expiration: How long rollover credit lasts (1-12 months)
  4. Click Save Settings

Example:

  • Monthly budget: $10,000
  • Usage in January: $8,000
  • Rollover amount: $2,000 unused (within the 20% cap of $2,000)
  • February budget: $10,000 + $2,000 = $12,000
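
The rollover arithmetic from the example can be written out as follows. This is a sketch under two assumptions: the cap is a fraction of the base monthly limit, and overspending rolls over nothing. The `next_month_budget` function is a hypothetical name.

```python
def next_month_budget(
    monthly_limit: float, usage: float, max_rollover_fraction: float = 0.20
) -> tuple[float, float]:
    """Return (rollover amount, next month's effective budget)."""
    unused = max(monthly_limit - usage, 0.0)      # overspending rolls over nothing
    cap = monthly_limit * max_rollover_fraction   # assumed: cap applies to the base limit
    rollover = min(unused, cap)
    return rollover, monthly_limit + rollover


# January from the example above: $10,000 limit, $8,000 used.
print(next_month_budget(10_000.00, 8_000.00))

# With $5,000 unused, the 20% cap still limits the rollover to $2,000.
print(next_month_budget(10_000.00, 5_000.00))
```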