Rate Limits & Budgets

SpiderGate enforces limits per agent token, not per workspace — so each agent has its own request rate, spend cap, and model allow-list. This page explains each limit, the status code it raises, and the headers that report your standing.

The limits

::table
Limit | Set on the token as | Enforced at | When exceeded
Requests per minute | a rate-limit RPM value | request time | `429` `rate_limit_exceeded`
Requests per day | a rate-limit RPD value | request time | `429` `daily_rate_limit_exceeded`
Monthly budget | a USD spend cap (soft warns, hard blocks) | request time | `402` `budget_exceeded`
Allowed models | an allow-list of models / aliases | request time | `403` `model_not_allowed`
Free models only | a flag restricting to free-tier models | request time | `403` `model_not_allowed`

All of these are configured when you mint or edit a token in Agent Keys.

Rate limits

Each token has a per-minute and a per-day request ceiling. Exceed either and the request is rejected with 429 and these headers:

  • Retry-After — seconds to wait before retrying.

  • X-RateLimit-Limit-Requests / X-RateLimit-Remaining-Requests — your ceiling and what's left.

  • X-RateLimit-Reset — when the window resets.

On a 429, wait at least the number of seconds given in Retry-After before retrying, and apply exponential backoff if the request keeps failing. Do not retry immediately.

HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit-Requests: 60
X-RateLimit-Remaining-Requests: 0
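
The retry guidance above can be sketched as a small client-side helper. This is a minimal illustration, not SpiderGate's SDK: the function names, the `(status, headers, body)` response shape, the 1-second base delay, and the 60-second cap are all assumptions made for the example. It also assumes Retry-After carries a number of seconds, as described above, and that header names match exactly.

```python
import time
from typing import Callable, Mapping, Tuple

# Assumed response shape for this sketch: (status, headers, body).
Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait: never less than Retry-After, growing exponentially."""
    backoff = min(cap, base * (2 ** attempt))
    try:
        retry_after = float(headers.get("Retry-After", 0))
    except ValueError:
        # Header present but not a plain number of seconds: ignore it.
        retry_after = 0.0
    return max(retry_after, backoff)


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns something other than 429, or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, headers, body  # last 429, returned to the caller
```

The sketch treats both 429 codes the same way. A caller may prefer to give up early on `daily_rate_limit_exceeded` instead of sleeping through a long wait.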

Budgets

A token can carry a monthly USD budget that resets each month. A soft budget emits a warning but lets the request through; a hard budget blocks with 402 budget_exceeded once spend reaches the cap. The error message reports current spend against the cap.

Spend is tracked per model on each token, so the dashboard shows not just whether a token is near its cap but which models are driving the cost.
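
To make the soft/hard distinction and the per-model tracking concrete, here is a rough client-side model of one token's budget. It is a sketch under stated assumptions, not SpiderGate's implementation: the `TokenBudget` class, its field names, and the placeholder model names are hypothetical, and it assumes a soft budget starts warning at the same point a hard budget would block (when spend reaches the cap).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TokenBudget:
    """Hypothetical mirror of one token's monthly budget."""
    cap_usd: float
    hard: bool  # True: block at the cap; False: warn but allow
    spend_by_model: Dict[str, float] = field(default_factory=dict)

    def record(self, model: str, cost_usd: float) -> None:
        """Add the cost of one request to that model's running total."""
        self.spend_by_model[model] = self.spend_by_model.get(model, 0.0) + cost_usd

    def total(self) -> float:
        return sum(self.spend_by_model.values())

    def decide(self) -> str:
        """Return 'allow', 'warn' (soft cap reached), or 'block' (hard cap reached)."""
        if self.total() < self.cap_usd:
            return "allow"
        return "block" if self.hard else "warn"

    def top_models(self, n: int = 3) -> List[Tuple[str, float]]:
        """Models driving the most spend, highest first."""
        ranked = sorted(self.spend_by_model.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]
```

For example, a hard budget of 10.0 with 7.5 recorded against `model-a` and 2.5 against `model-b` returns `"block"` from `decide()`, and `top_models(1)` points at `model-a`.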

Model restrictions

A token can be limited to an explicit allow-list of models or aliases, or to free-tier models only. A request for anything outside that set returns 403 model_not_allowed, and the message lists what the token is allowed to use. Widen the list in Agent Keys, or switch the request to an allowed model.
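
Since each limit raises its own status and error code, a client can map the pair to a next step. The sketch below uses only the status codes and code strings listed on this page. The `Action` enum and `classify` function are hypothetical names, and how the code string is extracted from the response body is left to the caller because the body format is not described here.

```python
from enum import Enum


class Action(Enum):
    RETRY_LATER = "retry_later"    # wait, then retry the same request
    STOP_BUDGET = "stop_budget"    # hard budget reached; retrying will not help
    CHANGE_MODEL = "change_model"  # pick an allowed model or widen the allow-list
    UNKNOWN = "unknown"            # not a limit error covered on this page


# (HTTP status, error code) pairs as documented in the limits table.
_ACTIONS = {
    (429, "rate_limit_exceeded"): Action.RETRY_LATER,
    (429, "daily_rate_limit_exceeded"): Action.RETRY_LATER,
    (402, "budget_exceeded"): Action.STOP_BUDGET,
    (403, "model_not_allowed"): Action.CHANGE_MODEL,
}


def classify(status: int, code: str) -> Action:
    """Map a limit error to the action a client would likely take."""
    return _ACTIONS.get((status, code), Action.UNKNOWN)
```

Keying on both status and code keeps an unrelated 403 (for instance, from an upstream proxy) from being mistaken for a model restriction.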

Designing limits for agents

A few patterns:

  • Untrusted or experimental agents — a tight RPM, a low monthly budget, and free models only. A runaway loop can't cost you much.

  • Production agents — a budget that matches expected monthly spend (hard-blocking), an RPM sized to real traffic, and an allow-list of the models you've validated.

  • Per-customer isolation — one token per customer, each with its own budget, so spend and limits never bleed across tenants.
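
The three patterns above could be captured as reusable limit profiles in your own tooling. Everything in this sketch is illustrative: the `TokenLimits` fields are hypothetical names for the settings described on this page, the numbers are made up, and `model-a` / `model-b` are placeholders, not real model identifiers. It does not call any SpiderGate API.

```python
from dataclasses import dataclass, replace
from typing import Dict, Mapping, Optional, Tuple


@dataclass(frozen=True)
class TokenLimits:
    """Hypothetical bundle of the per-token settings described on this page."""
    rpm: int                     # requests per minute
    rpd: int                     # requests per day
    monthly_budget_usd: float
    hard_budget: bool            # True blocks at the cap, False only warns
    allowed_models: Optional[Tuple[str, ...]] = None  # None means no allow-list
    free_models_only: bool = False


# Untrusted or experimental agents: tight RPM, low budget, free models only.
# Hard-blocking is an assumption here; the pattern itself only asks for a low budget.
EXPERIMENTAL = TokenLimits(rpm=10, rpd=500, monthly_budget_usd=5.0,
                           hard_budget=True, free_models_only=True)

# Production agents: hard budget near expected spend, validated models only.
PRODUCTION = TokenLimits(rpm=300, rpd=100_000, monthly_budget_usd=2000.0,
                         hard_budget=True, allowed_models=("model-a", "model-b"))


def per_customer_limits(budgets: Mapping[str, float],
                        template: TokenLimits) -> Dict[str, TokenLimits]:
    """One limits profile per customer, each with its own monthly budget."""
    return {customer: replace(template, monthly_budget_usd=budget)
            for customer, budget in budgets.items()}
```

Calling `per_customer_limits({"acme": 100.0, "globex": 250.0}, PRODUCTION)` yields one profile per tenant that shares the production rate limits and allow-list but carries its own budget.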

Next steps

  1. Set limits when you mint a token — Agent Keys.

  2. Handle the 402/403/429 responses — Errors.

  3. Watch spend per agent — Traces.