Chat Completions

POST /api/gate/v1/chat/completions is the core of SpiderGate. It accepts the standard OpenAI chat-completion request and returns the standard response. In the request body, the only SpiderGate-specific touch is that model can be a task alias.

curl -X POST "https://spideriq.ai/api/gate/v1/chat/completions" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "spideriq/coding",
    "messages": [
      {"role": "system", "content": "You are a senior Python engineer."},
      {"role": "user", "content": "Write a function to debounce calls."}
    ],
    "temperature": 0.2,
    "max_tokens": 400
  }'

This endpoint requires authentication. See Authentication.

Required parameters

  • model (string, required) — A task alias (e.g. spideriq/coding) or a concrete model id (e.g. gpt-4o).

  • messages (array, required, at least one) — The conversation. Each message has a role (system, user, assistant, tool, or function) and content. content is a string, or an array of parts for multi-modal input ({"type": "text", ...} / {"type": "image_url", ...}).
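As a rough illustration of the two required fields, the sketch below builds the same kind of request in Python using only the standard library. The helper name build_request is hypothetical, and the fields inside each content part (text, image_url.url) follow the OpenAI convention, which this page does not spell out, so treat them as assumptions.

```python
import json
import os
import urllib.request

GATE_URL = "https://spideriq.ai/api/gate/v1/chat/completions"

def build_request(model, messages, token, **params):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    if not messages:
        raise ValueError("messages must contain at least one message")
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        GATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# A multi-modal user message: content is an array of parts instead of a string.
# The inner part fields are assumed to match the OpenAI format.
messages = [
    {"role": "user", "content": [
        {"type": "text", "text": "Describe this image."},
        {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
    ]},
]

token = os.environ.get("SPIDERIQ_TOKEN", "test-token")
req = build_request("gpt-4o", messages, token, max_tokens=400)
# To send it: urllib.request.urlopen(req), then json.load() the response.
```

Keeping the build step separate from the network call makes the payload easy to inspect or log before anything is sent.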

Common parameters

These mirror the OpenAI API exactly. The most-used ones:

::table
Parameter | Type | Default | Notes
`temperature` | number | `1.0` | Sampling temperature, `0.0`–`2.0`.
`top_p` | number | `1.0` | Nucleus sampling, `0.0`–`1.0`.
`n` | integer | `1` | Number of choices to generate, `1`–`128`.
`max_tokens` | integer | — | Max tokens to generate (`≥ 1`). `max_completion_tokens` is an alias.
`stream` | boolean | `false` | Stream the response over SSE. See [Streaming](/docs/using-the-gateway/streaming).
`stop` | string or array | — | Up to a few stop sequences.
`presence_penalty` | number | `0.0` | `-2.0`–`2.0`.
`frequency_penalty` | number | `0.0` | `-2.0`–`2.0`.
`seed` | integer | — | Best-effort deterministic sampling.
`user` | string | — | An end-user identifier for your own attribution.
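The ranges in the table can be checked on the client before a request is sent, which may save a round trip that would otherwise end in a 400. The sketch below is one possible way to do that; check_params is a hypothetical helper, not part of any SpiderGate SDK, and the gateway remains the authority on what is valid.

```python
# Ranges copied from the table above; None means no upper bound is stated.
RANGES = {
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
    "n": (1, 128),
    "max_tokens": (1, None),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}

def check_params(params):
    """Return a list of problems; an empty list means every checked value is in range."""
    problems = []
    for name, value in params.items():
        if name not in RANGES:
            continue  # not range-checked here (stream, stop, seed, user, ...)
        low, high = RANGES[name]
        if value < low or (high is not None and value > high):
            upper = "" if high is None else f" and <= {high}"
            problems.append(f"{name}={value} must be >= {low}{upper}")
    return problems

ok = check_params({"temperature": 0.2, "max_tokens": 400})
bad = check_params({"temperature": 2.5, "n": 0, "stream": True})
```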

Tool calling

Pass tools (function definitions) and tool_choice to let the model call your functions. The shape is the OpenAI tools format.

curl -X POST "https://spideriq.ai/api/gate/v1/chat/completions" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "spideriq/tool-use",
    "messages": [{"role": "user", "content": "What is the weather in Berlin?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }],
    "tool_choice": "auto"
  }'

  • tools (array) — Function definitions (type: "function", function: {name, description, parameters}).

  • tool_choice — "none", "auto" (default when tools are present), "required", or a specific function.

  • parallel_tool_calls (boolean, default true) — Allow multiple tool calls in one turn.

Tip: For heavy tool inventories or strict JSON output, prefer spideriq/tool-use or agent/tool-use — their chains lead with models tuned for function calling.
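The request above is only half of a tool-calling round trip: when the model decides to call a function, your code has to run it and send the result back. The sketch below assumes the response follows the OpenAI tools format (tool_calls entries with id and function.name / function.arguments, answered by messages with role "tool" and tool_call_id). The get_weather implementation and the call id are made up for illustration.

```python
import json

def get_weather(city):
    # Hypothetical local implementation; returns canned data for this sketch.
    return {"city": city, "temp_c": 18}

# Maps tool names declared in the request to local Python callables.
TOOLS = {"get_weather": get_weather}

def run_tool_calls(assistant_message):
    """Execute each requested tool call and return the follow-up "tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"])  # arguments arrive as a JSON-encoded string
        output = TOOLS[fn["name"]](**args)  # raises KeyError for an undeclared tool
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(output),
        })
    return results

# Example assistant message in the OpenAI tools format (the id is invented).
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Berlin\"}"},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

To continue the conversation, append the assistant message and then tool_messages to the original messages array and post the request again.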

Structured output

Use response_format to constrain the output to JSON.

{
  "model": "spideriq/extraction",
  "messages": [{"role": "user", "content": "Extract the company name: Acme Corp ships widgets."}],
  "response_format": {"type": "json_object"}
}

type is one of text, json_object, or json_schema (pass your schema in json_schema).
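For the json_schema variant, a payload might look like the sketch below. The inner layout (name, schema, strict) is the OpenAI convention and is assumed here to pass through unchanged; this page only states that the schema goes in json_schema. The helper names are hypothetical.

```python
import json

def json_schema_format(name, schema, strict=True):
    """Build a response_format value using the OpenAI json_schema layout (assumed)."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "schema": schema, "strict": strict},
    }

company_schema = {
    "type": "object",
    "properties": {"company": {"type": "string"}},
    "required": ["company"],
    "additionalProperties": False,
}

payload = {
    "model": "spideriq/extraction",
    "messages": [
        {"role": "user", "content": "Extract the company name: Acme Corp ships widgets."}
    ],
    "response_format": json_schema_format("company", company_schema),
}

def parse_structured(content):
    """The structured result is returned as a JSON string in message.content."""
    return json.loads(content)

# Illustrative content string, not a recorded response.
parsed = parse_structured('{"company": "Acme Corp"}')
```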

The response

A non-streaming response is the OpenAI chat.completion object:

  • id — chatcmpl-…

  • object — "chat.completion"

  • created — Unix timestamp

  • model — the actual model that served the request (not the alias you sent)

  • choices — each with index, message (role, content, optional tool_calls), and finish_reason

  • usage — prompt_tokens, completion_tokens, total_tokens
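A small helper can pull the fields most callers need out of that object. The sketch below works on an already-decoded response dict; the sample values (id, timestamp, model, token counts) are invented for illustration and are not real gateway output.

```python
def summarize(completion):
    """Extract the served model, first choice text, finish reason, and token total."""
    choice = completion["choices"][0]
    return {
        "served_model": completion["model"],  # the concrete model, not the alias sent
        "text": choice["message"].get("content"),
        "finish_reason": choice["finish_reason"],
        "total_tokens": completion["usage"]["total_tokens"],
    }

# Invented sample in the chat.completion shape described above.
sample = {
    "id": "chatcmpl-example",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "gpt-4o",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Hello!"},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 8, "completion_tokens": 2, "total_tokens": 10},
}

summary = summarize(sample)
```

Logging served_model alongside the alias you requested is a cheap way to see which concrete model handled each call.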

SpiderGate response signals

  • X-SpiderGate-Fallback-From header — present on non-streaming responses when the served model differs from the alias's first choice. Its value is that first-choice model, so you can tell when a fallback happened.

  • route_trace — add ?include_route_trace=true to the URL and the response body includes a per-attempt trace (each model tried, the outcome, and latency). Useful for debugging routing.

curl -X POST "https://spideriq.ai/api/gate/v1/chat/completions?include_route_trace=true" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "agent/chat", "messages": [{"role": "user", "content": "hi"}]}'
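On the client side, both signals can be folded into one small record per call. The sketch below assumes the trace appears under a top-level route_trace key in the body, which this page implies but does not state outright; the shape of each trace entry is left untouched because it is not documented here. The model names in the sample are placeholders.

```python
def routing_info(headers, body):
    """Combine the fallback header and optional route trace into one dict."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    first_choice = lowered.get("x-spidergate-fallback-from")
    return {
        "served_model": body.get("model"),
        "fell_back": first_choice is not None,
        "first_choice": first_choice,
        # Assumed key name; None unless include_route_trace=true was requested.
        "route_trace": body.get("route_trace"),
    }

# Placeholder values for illustration only.
info = routing_info(
    {"Content-Type": "application/json", "X-SpiderGate-Fallback-From": "model-a"},
    {"model": "model-b", "choices": []},
)
no_fallback = routing_info({"Content-Type": "application/json"}, {"model": "model-a"})
```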

Errors

This endpoint can return 400 (invalid request), 401 (auth), 402 (budget exceeded), 403 (model not allowed for your token), 429 (rate limit), and 503 (engine unavailable). Each has a structured body — see Errors.
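One way a client might act on these codes is sketched below. The status meanings come from the list above; treating 429 and 503 as retryable with exponential backoff is a common client-side policy and an assumption here, not documented SpiderGate behavior. Both helper names are hypothetical.

```python
# Status meanings as listed above.
MEANINGS = {
    400: "invalid request",
    401: "auth",
    402: "budget exceeded",
    403: "model not allowed for your token",
    429: "rate limit",
    503: "engine unavailable",
}

# Assumption: only rate limits and engine outages are worth retrying.
RETRYABLE = {429, 503}

def classify(status):
    """Return (meaning, should_retry) for an HTTP status code."""
    if 200 <= status < 300:
        return ("ok", False)
    return (MEANINGS.get(status, "unexpected status"), status in RETRYABLE)

def backoff_delays(attempts, base=0.5, cap=8.0):
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]

delays = backoff_delays(5)
```

Errors such as 400, 401, 402, and 403 will not resolve on their own, so retrying them only adds load; fix the request, token, or budget instead.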

Next steps

  1. Add Streaming for responsive UIs.

  2. Choose the right alias in Task Aliases.

  3. Route to a specific model with Models & Direct Routing.