Rate Limits & Budgets
SpiderGate enforces limits per agent token, not per workspace — so each agent has its own request rate, spend cap, and model allow-list. This page explains each limit, the status code it raises, and the headers that report your standing.
The limits
::table
Limit | Set on the token as | Enforced at | When exceeded
Requests per minute | a rate-limit RPM value | request time | `429` `rate_limit_exceeded`
Requests per day | a rate-limit RPD value | request time | `429` `daily_rate_limit_exceeded`
Monthly budget | a USD spend cap (soft warns, hard blocks) | request time | `402` `budget_exceeded`
Allowed models | an allow-list of models / aliases | request time | `403` `model_not_allowed`
Free models only | a flag restricting to free-tier models | request time | `403` `model_not_allowed`

All of these are configured when you mint or edit a token in Agent Keys.
Rate limits
Each token has a per-minute and a per-day request ceiling. Exceed either and the request is rejected with 429 and these headers:
`Retry-After`: seconds to wait before retrying.
`X-RateLimit-Limit-Requests` / `X-RateLimit-Remaining-Requests`: your ceiling and what's left.
`X-RateLimit-Reset`: when the window resets.
Honor Retry-After with exponential backoff rather than retrying immediately.
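One way to combine the two rules is to sleep for whichever is longer, the server's Retry-After value or your own exponential delay, and add a little jitter. The sketch below is a minimal illustration, not SpiderGate client code: `send` is a hypothetical callable standing in for your HTTP call and returning `(status, headers)`, and the base delay, cap, and jitter fraction are assumed values you would tune.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def backoff_delay(attempt: int, retry_after: Optional[str],
                  base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    exponential = min(cap, base * (2 ** attempt))
    try:
        server_hint = float(retry_after) if retry_after is not None else 0.0
    except ValueError:
        server_hint = 0.0  # unparseable header: fall back to our own schedule
    # Never retry sooner than the server asked.
    return max(server_hint, exponential)


def call_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns something other than 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        # Header lookup is case-sensitive on a plain dict; most HTTP
        # libraries expose a case-insensitive mapping instead.
        delay = backoff_delay(attempt, headers.get("Retry-After"))
        # Up to 25% extra jitter so parallel agents do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.25 * delay))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. If the last attempt still returns 429, the caller receives that response and can decide what to do with it.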
HTTP/1.1 429 Too Many Requests
Retry-After: 12
X-RateLimit-Limit-Requests: 60
X-RateLimit-Remaining-Requests: 0
Budgets
A token can carry a USD budget that resets each month. A soft budget emits a warning but lets the request through; a hard budget blocks with 402 budget_exceeded once spend reaches the cap. The error message reports current spend against the cap.
Spend is tracked per model on each token, so the dashboard shows not just whether a token is near its cap but which models are driving the cost.
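The soft/hard distinction and the per-model breakdown can be modeled in a few lines. This is an illustrative sketch of the behavior described above, not SpiderGate's implementation: the class and method names are hypothetical, and it assumes a soft budget starts warning at the same point a hard budget starts blocking (when total spend reaches the cap).

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TokenBudget:
    """Hypothetical model of one token's monthly budget."""
    cap_usd: float
    hard: bool  # True: block at the cap; False: warn only
    spend_by_model: Dict[str, float] = field(
        default_factory=lambda: defaultdict(float))

    def record(self, model: str, cost_usd: float) -> None:
        """Add the cost of one request to that model's running total."""
        self.spend_by_model[model] += cost_usd

    @property
    def total_usd(self) -> float:
        return sum(self.spend_by_model.values())

    def check(self) -> str:
        """Return 'ok', 'warn' (soft cap reached) or 'block' (hard cap reached)."""
        if self.total_usd < self.cap_usd:
            return "ok"
        return "block" if self.hard else "warn"

    def top_models(self) -> List[Tuple[str, float]]:
        """Models sorted by spend, highest first."""
        return sorted(self.spend_by_model.items(),
                      key=lambda item: item[1], reverse=True)

    def reset(self) -> None:
        """Clear spend at the start of a new month."""
        self.spend_by_model.clear()
```

Here a `'block'` result corresponds to the point where a request would be rejected with 402 budget_exceeded, and `top_models()` mirrors the dashboard's view of which models are driving the cost.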
Model restrictions
A token can be limited to an explicit allow-list of models or aliases, or to free-tier models only. A request for anything outside that set returns 403 model_not_allowed, and the message lists what the token is allowed to use. Widen the list in Agent Keys, or switch the request to an allowed model.
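Since each limit maps to a distinct status and error code, a client can pick its reaction from a small lookup table. The sketch below assumes you have already extracted the status and the error code string from the response (this page does not specify where the code appears in the body), and the action names are made up for illustration.

```python
from typing import Dict, Tuple

# (HTTP status, error code) -> suggested client reaction.
# The keys come from the limits table; the action labels are hypothetical.
ACTIONS: Dict[Tuple[int, str], str] = {
    (429, "rate_limit_exceeded"): "wait_for_retry_after",
    (429, "daily_rate_limit_exceeded"): "wait_for_retry_after",
    (402, "budget_exceeded"): "stop_and_review_budget",
    (403, "model_not_allowed"): "switch_model_or_widen_allow_list",
}


def next_action(status: int, code: str) -> str:
    """Return the suggested reaction, or 'unhandled' for anything else."""
    return ACTIONS.get((status, code), "unhandled")
```

Only the 429 cases are worth retrying automatically. A 402 or 403 will keep failing until someone changes the token's budget or allow-list, or the request switches to an allowed model.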
Designing limits for agents
A few patterns:
Untrusted or experimental agents: a tight RPM, a low monthly budget, and free models only. A runaway loop can't cost you much.
Production agents: a hard-blocking budget that matches expected monthly spend, an RPM sized to real traffic, and an allow-list of the models you've validated.
Per-customer isolation: one token per customer, each with its own budget, so spend and limits never bleed across tenants.
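The three patterns above can be written down as reusable profiles. This is a sketch only: `TokenLimits` and its field names are hypothetical rather than the Agent Keys schema, and the numbers are placeholders, not recommendations.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class TokenLimits:
    """Hypothetical limit settings for one agent token."""
    rpm: int
    monthly_budget_usd: float
    hard_budget: bool
    free_models_only: bool = False
    allowed_models: Tuple[str, ...] = ()


def experimental_limits() -> TokenLimits:
    # Tight RPM, low budget, free models only. Making the budget
    # hard-blocking is an assumption; the pattern does not specify it.
    return TokenLimits(rpm=10, monthly_budget_usd=5.0,
                       hard_budget=True, free_models_only=True)


def production_limits(expected_monthly_usd: float, peak_rpm: int,
                      validated_models: Iterable[str]) -> TokenLimits:
    # Hard budget matching expected spend, RPM sized to real traffic,
    # and an allow-list of validated models.
    return TokenLimits(rpm=peak_rpm, monthly_budget_usd=expected_monthly_usd,
                       hard_budget=True,
                       allowed_models=tuple(validated_models))


def per_customer_limits(customer_ids: Iterable[str], budget_usd: float,
                        rpm: int) -> Dict[str, TokenLimits]:
    # One independent set of limits per customer, so nothing is shared.
    return {cid: TokenLimits(rpm=rpm, monthly_budget_usd=budget_usd,
                             hard_budget=True)
            for cid in customer_ids}
```

Keeping the profiles in code makes it easy to review limit changes and to mint tokens consistently, whichever mechanism you use to apply them.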
Next steps
Set limits when you mint a token: see Agent Keys.
Handle the 402/403/429 responses: see Errors.
Watch spend per agent: see Traces.