Images, Audio & Embeddings

Beyond chat, SpiderGate exposes OpenAI-compatible endpoints for image generation, text-to-speech, transcription, and embeddings. They are drop-in replacements for the matching openai SDK calls: set the SDK's base URL to https://spideriq.ai/api/gate/v1 and use the real OpenAI model names.
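If you are not using the openai SDK, the same requests can be built by hand. The following is a minimal sketch using only the Python standard library; the helper name gate_request and the placeholder token are illustrative and not part of SpiderGate. It builds the request without sending it.

```python
import json
import urllib.request

BASE_URL = "https://spideriq.ai/api/gate/v1"  # base URL given in the text above


def gate_request(path, payload, token):
    """Build (but do not send) a JSON POST request for a SpiderGate endpoint.

    `gate_request` is a hypothetical helper, not a SpiderGate API.
    """
    return urllib.request.Request(
        f"{BASE_URL}/{path.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = gate_request(
    "embeddings",
    {"model": "text-embedding-3-small", "input": "hello"},
    token="YOUR_SPIDERIQ_TOKEN",  # placeholder, replace with a real token
)
print(req.get_method(), req.full_url)
# prints: POST https://spideriq.ai/api/gate/v1/embeddings
```

To actually send it, pass the request to urllib.request.urlopen(req) and read the response body.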

Important: Multi-modal endpoints do not use task aliases. Pass the actual model name (dall-e-3, tts-1, whisper-1, text-embedding-3-large). They also require an OpenAI key in the vault — if none is registered, the call returns 503 with code no_openai_key. Add one via the contributor invite flow on The Key Vault.
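A client can detect the missing-key case and show a useful message instead of retrying. The sketch below assumes the error code arrives in a JSON body, either at the top level or nested under "error" in the OpenAI style; the exact body shape is not documented here, so treat both locations as a guess.

```python
import json


def is_missing_openai_key(status, body_text):
    """Return True if a response looks like the 503 no_openai_key error."""
    if status != 503:
        return False
    try:
        body = json.loads(body_text)
    except ValueError:  # body was not valid JSON
        return False
    if not isinstance(body, dict):
        return False
    # Assumption: the code is either under "error" or at the top level.
    error = body.get("error")
    code = error.get("code") if isinstance(error, dict) else body.get("code")
    return code == "no_openai_key"


print(is_missing_openai_key(503, '{"error": {"code": "no_openai_key"}}'))
# prints: True
```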

Image generation

POST /api/gate/v1/images/generations — generate images with dall-e-3, dall-e-2, or gpt-image-1.

curl -X POST "https://spideriq.ai/api/gate/v1/images/generations" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A minimalist logo of a spider on a dark background",
    "n": 1,
    "size": "1024x1024",
    "quality": "standard"
  }'
  • model (required) — dall-e-3, dall-e-2, or gpt-image-1.

  • prompt (required) — up to 4000 characters.

  • n (default 1) — number of images, 1–10.

  • size — e.g. 1024x1024, 1024x1792, 1792x1024 (dall-e-3); 256x256, 512x512, 1024x1024 (dall-e-2).

  • quality / style — standard|hd and vivid|natural, respectively (dall-e-3 only).

  • response_format — url (default) or b64_json.

The response is the OpenAI image object: { "created": ..., "data": [{ "url": ... }] }.
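The parameter limits above can be checked on the client before a request is sent. This is a rough sketch: build_image_request and image_urls are hypothetical helpers, size is passed through unchecked because the sizes listed above are examples, and optional fields are omitted from the body unless you set them.

```python
IMAGE_MODELS = {"dall-e-3", "dall-e-2", "gpt-image-1"}


def build_image_request(model, prompt, n=1, size=None, quality=None,
                        style=None, response_format=None):
    """Validate arguments against the documented limits and return the JSON body."""
    if model not in IMAGE_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if not prompt or len(prompt) > 4000:
        raise ValueError("prompt must be 1 to 4000 characters")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if (quality is not None or style is not None) and model != "dall-e-3":
        raise ValueError("quality and style are dall-e-3 only")
    if quality not in (None, "standard", "hd"):
        raise ValueError("quality must be standard or hd")
    if style not in (None, "vivid", "natural"):
        raise ValueError("style must be vivid or natural")
    if response_format not in (None, "url", "b64_json"):
        raise ValueError("response_format must be url or b64_json")

    body = {"model": model, "prompt": prompt, "n": n}
    optional = {"size": size, "quality": quality, "style": style,
                "response_format": response_format}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


def image_urls(response):
    """Collect the URLs from an OpenAI-style image response object."""
    return [item["url"] for item in response["data"] if "url" in item]
```

For example, build_image_request("dall-e-3", "A minimalist logo", size="1024x1024") returns a body you can serialize with json.dumps and POST to /api/gate/v1/images/generations.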

Text-to-speech

POST /api/gate/v1/audio/speech — synthesize speech with tts-1, tts-1-hd, or gpt-4o-mini-tts. The response body is the raw audio bytes (a streaming response), with Content-Type set from the requested format.

curl -X POST "https://spideriq.ai/api/gate/v1/audio/speech" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Welcome to SpiderGate.",
    "voice": "nova",
    "response_format": "mp3"
  }' --output speech.mp3
  • input (required) — text to speak, up to 4096 characters.

  • voice (required) — one of alloy, echo, fable, onyx, nova, shimmer. Any other value returns 400 invalid_voice.

  • response_format — mp3 (default), opus, aac, flac, wav, or pcm. Any other value returns 400 invalid_response_format.

  • speed — playback speed, 0.25–4.0.
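Because an unsupported voice or format is rejected with a 400, it can be cheaper to validate locally first. A minimal sketch, assuming the lists above are exhaustive; build_speech_request is a hypothetical helper and its error messages simply reuse the documented error codes.

```python
TTS_MODELS = {"tts-1", "tts-1-hd", "gpt-4o-mini-tts"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(model, text, voice, response_format="mp3", speed=None):
    """Validate text-to-speech arguments and return the JSON body."""
    if model not in TTS_MODELS:
        raise ValueError(f"unsupported model: {model}")
    if not text or len(text) > 4096:
        raise ValueError("input must be 1 to 4096 characters")
    if voice not in VOICES:
        raise ValueError("invalid_voice")
    if response_format not in FORMATS:
        raise ValueError("invalid_response_format")
    if speed is not None and not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    body = {"model": model, "input": text, "voice": voice,
            "response_format": response_format}
    if speed is not None:  # leave speed out so the server default applies
        body["speed"] = speed
    return body


print(build_speech_request("tts-1", "Welcome to SpiderGate.", "nova"))
```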

Transcription

POST /api/gate/v1/audio/transcriptions — transcribe audio with whisper-1, gpt-4o-transcribe, or gpt-4o-mini-transcribe. This is a multipart/form-data upload.

curl -X POST "https://spideriq.ai/api/gate/v1/audio/transcriptions" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -F file="@meeting.mp3" \
  -F model="whisper-1" \
  -F response_format="json"
  • file (required) — the audio file. Maximum 25 MB (OpenAI's hard limit); larger files return 413 file_too_large.

  • model (required) — whisper-1, gpt-4o-transcribe, or gpt-4o-mini-transcribe.

  • language — ISO-639-1 code (en, de, …) to improve accuracy.

  • response_format — json (default), text, srt, verbose_json, or vtt.

  • temperature — 0.0–1.0.
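Checking the file size before uploading avoids sending 25 MB over the wire only to get a 413 back. The sketch below reads "25 MB" as 25 × 1024 × 1024 bytes, which is an assumption; the helper names are hypothetical.

```python
import os

MAX_UPLOAD_BYTES = 25 * 1024 * 1024  # assumption: "25 MB" means 25 MiB


def fits_upload_limit(size_bytes):
    """Return True if a file of this size is within the documented limit."""
    return size_bytes <= MAX_UPLOAD_BYTES


def check_audio_file(path):
    """Return the file size in bytes, or raise if the upload would be rejected."""
    size = os.path.getsize(path)
    if not fits_upload_limit(size):
        raise ValueError(
            f"{path} is {size} bytes; the gateway would answer 413 file_too_large"
        )
    return size
```

Call check_audio_file("meeting.mp3") before building the multipart upload; files over the limit need to be split or re-encoded at a lower bitrate first.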

Embeddings

POST /api/gate/v1/embeddings — vectorize text with text-embedding-3-large, text-embedding-3-small, or text-embedding-ada-002.

curl -X POST "https://spideriq.ai/api/gate/v1/embeddings" \
  -H "Authorization: Bearer $SPIDERIQ_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "SpiderGate routes requests across providers."
  }'
  • input (required) — a single string or an array of strings.

  • dimensions — output vector size, 1–3072 (supported by the -3 models).

  • encoding_format — float (default) or base64.

The response is the OpenAI embeddings object: { "object": "list", "data": [{ "embedding": [...] }], "model": ..., "usage": ... }.
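A common next step is comparing the returned vectors. The sketch below pulls the vectors out of a response with the shape shown above and computes cosine similarity; the sample response uses tiny made-up vectors for illustration, not real model output.

```python
import math


def extract_vectors(response):
    """Return the embedding vectors from an OpenAI-style embeddings response."""
    return [item["embedding"] for item in response["data"]]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Illustrative response only: real vectors have hundreds or thousands of dimensions.
sample = {
    "object": "list",
    "data": [{"embedding": [1.0, 0.0]}, {"embedding": [0.0, 1.0]}],
    "model": "text-embedding-3-small",
}
first, second = extract_vectors(sample)
print(cosine_similarity(first, first), cosine_similarity(first, second))
# prints: 1.0 0.0
```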

Cost tracking

All four endpoints are metered the same way chat is. Each request is tagged with its kind (image, audio_tts, audio_stt, embedding) and priced per the provider's published rates, so multi-modal spend shows up in Usage and Traces alongside chat.
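If you export usage records, the kind tag makes it easy to break spend down by endpoint type. This is only a sketch: the endpoint-to-kind mapping is inferred from the names above, and the record fields kind and cost_usd are hypothetical, since the export format is not described here.

```python
from collections import defaultdict

# Assumption: mapping inferred from the kind names listed above.
ENDPOINT_KIND = {
    "/images/generations": "image",
    "/audio/speech": "audio_tts",
    "/audio/transcriptions": "audio_stt",
    "/embeddings": "embedding",
}


def spend_by_kind(records):
    """Sum cost per kind; `kind` and `cost_usd` are hypothetical field names."""
    totals = defaultdict(float)
    for record in records:
        totals[record["kind"]] += record["cost_usd"]
    return dict(totals)


# Made-up records for illustration only.
records = [
    {"kind": "image", "cost_usd": 0.5},
    {"kind": "image", "cost_usd": 0.25},
    {"kind": "embedding", "cost_usd": 0.125},
]
print(spend_by_kind(records))
# prints: {'image': 0.75, 'embedding': 0.125}
```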

Next steps

  1. Add an OpenAI key so these endpoints work — The Key Vault.

  2. Track multi-modal spend in Traces.

  3. See every endpoint in the API Reference.