🛷 Chutes

Speed: ⚡ 28 tok/s avg
Models: 35 public
Price: $20/mo PRO
Status: ✅ Online
Best For: TEE Privacy, Qwen3-235B

💰 Plan & Pricing

ℹ️ PRO Plan: $20/mo
Usage is capped at $100/mo, with a 5000-request quota and a 4-hour rolling cap of $8.33. TEE (Trusted Execution Environment) models keep your data private, even from Chutes.

🔑 API Key

cpk_<id>.<user_id_hex>.<secret>

๐ŸŒ Endpoints

# Inference (OpenAI-compatible)
POST https://llm.chutes.ai/v1/chat/completions

# Image Generation
POST https://{slug}.chutes.ai/generate

# Embeddings
POST https://llm.chutes.ai/v1/embeddings

# Management API
https://api.chutes.ai

📦 LLM Models (20 public)

| Model | TEE | Hot | Speed | Price ($/1M) |
|---|---|---|---|---|
| Qwen/Qwen3-32B-TEE | ✅ | 🔥 | 192 tok/s | $0.08/$0.24 |
| Qwen/Qwen3-235B-A22B-Thinking-2507 | ❌ | 🔥 | 50 tok/s | $0.11/$0.60 |
| Qwen/Qwen3.5-397B-A17B-TEE | ✅ | 🔥 | n/a | $0.39/$2.34 |
| Qwen/Qwen3.6-27B-TEE | ✅ | 🔥 | n/a | $0.50/$2.00 |
| Qwen/Qwen2.5-Coder-32B-Instruct-TEE | ✅ | 🔥 | 17 tok/s | $0.024/$0.10 |
| deepseek-ai/DeepSeek-V3.2-TEE | ✅ | 🔥 | n/a | $0.28/$0.42 |
| moonshotai/Kimi-K2.5-TEE | ✅ | 🔥 | 28 tok/s | $0.44/$2.00 |
| moonshotai/Kimi-K2.6-TEE | ✅ | 🔥 | 38 tok/s | $0.74/$3.50 |
| zai-org/GLM-5-TEE | ✅ | 🔥 | n/a | $0.95/$2.55 |
| zai-org/GLM-5.1-TEE | ✅ | 🔥 | n/a | $1.05/$3.50 |
| zai-org/GLM-5-Turbo | ❌ | 🔥 | 25 tok/s | $0.49/$1.96 |
| google/gemma-4-31B-turbo-TEE | ✅ | 🔥 | 20 tok/s | $0.13/$0.38 |
| MiniMaxAI/MiniMax-M2.5-TEE | ✅ | 🔥 | n/a | $0.15/$1.20 |
| unsloth/Mistral-Nemo-Instruct-2407-TEE | ✅ | 🔥 | 28 tok/s | $0.024/$0.10 |
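
To compare rows in the table, a rough per-request cost can be worked out from the two prices. This is a minimal sketch that assumes the price pair means input/output in USD per 1M tokens, which the table does not state explicitly. The function name `estimate_cost` and the token counts are illustrative.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Estimate the USD cost of one request.

    Prices are in USD per 1M tokens. Reading the table's "a/b" pair as
    input/output prices is an assumption, not something the table confirms.
    """
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Qwen/Qwen3-32B-TEE is listed at $0.08/$0.24.
cost = estimate_cost(10_000, 2_000, input_price=0.08, output_price=0.24)
print(f"${cost:.6f}")
```

At these rates a request with 10,000 input tokens and 2,000 output tokens comes to about a tenth of a cent, so the request quota and rolling caps are likely to matter more than per-token price for light use.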

🔊 TTS Model

| Model | Type | Notes |
|---|---|---|
| Kokoro-82M | TTS | Text-to-speech generation |

📐 Embedding Model

| Model | Type | Notes |
|---|---|---|
| Qwen3-Embedding-8B-TEE | Embedding | ✅ TEE-protected embeddings |
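
The page gives no example for the embeddings endpoint, so here is a hedged sketch of what a request might look like. It assumes the endpoint accepts the OpenAI-style body (`model` plus `input`), and that the model ID carries a `Qwen/` prefix like the LLM entries; the table only lists the bare name, so check `/v1/models` for the exact ID. The helper `build_embeddings_request` is illustrative and only builds the request without sending it.

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://llm.chutes.ai/v1/embeddings"


def build_embeddings_request(texts, api_key,
                             model="Qwen/Qwen3-Embedding-8B-TEE"):
    """Build (but do not send) an OpenAI-style embeddings request.

    The default model ID is an assumption; the source lists only
    "Qwen3-Embedding-8B-TEE" without an organization prefix.
    """
    body = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url=EMBEDDINGS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # Bearer auth, per the notes on this page
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_embeddings_request(["hello", "world"],
                               api_key="cpk_<id>.<user_id_hex>.<secret>")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it. If the response follows the OpenAI shape, the vectors would sit under `data[i].embedding`, but that is not confirmed here.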

๐Ÿ–ผ๏ธ Image Generation (4 working)

ModelSlugStatusType
FLUX.1-schnellchutes-flux-1-schnellโœ… WORKSJPEG
JuggernautXL-Ragnarokchutes-juggernautxl-ragnarokโœ… WORKSJPEG
DreamShaper XL 1.0chutes-lykon-dreamshaper-xl-1-0โœ… WORKSJPEG
hunyuan-image-3chutes-hunyuan-image-3โœ… WORKSPNG

💻 cURL: Chat

# CHUTES_API_KEY is an illustrative environment variable holding your key.
curl -X POST https://llm.chutes.ai/v1/chat/completions \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

💻 cURL: Image Generation

# CHUTES_API_KEY is an illustrative environment variable holding your key.
curl -X POST https://chutes-flux-1-schnell.chutes.ai/generate \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful sunset over mountains",
    "width": 512,
    "height": 512,
    "num_inference_steps": 4,
    "guidance_scale": 7.5
  }'
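
The same subdomain call can be sketched in Python. This version only builds the request, so it can be inspected before anything is sent. The helper name `build_image_request` and its defaults are illustrative; the URL pattern, Bearer header, and payload fields come from the cURL example above.

```python
import json
import urllib.request


def build_image_request(slug: str, prompt: str, api_key: str,
                        width: int = 512, height: int = 512) -> urllib.request.Request:
    """Build (but do not send) a POST to https://{slug}.chutes.ai/generate."""
    body = json.dumps(
        {"prompt": prompt, "width": width, "height": height}
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{slug}.chutes.ai/generate",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


image_req = build_image_request(
    "chutes-flux-1-schnell",
    "a beautiful sunset over mountains",
    api_key="cpk_<id>.<user_id_hex>.<secret>",
)
print(image_req.get_method(), image_req.full_url)
```

Sending it with `urllib.request.urlopen(image_req)` and writing the response bytes to a file should yield the image, assuming the body is the raw JPEG or PNG that the Type column suggests.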

🐍 Python Example

import os

from openai import OpenAI

# Read the key from the environment instead of hard-coding it.
# CHUTES_API_KEY is an illustrative variable name.
client = OpenAI(
    api_key=os.environ["CHUTES_API_KEY"],
    base_url="https://llm.chutes.ai/v1",
)

# Send a single-turn chat request to the OpenAI-compatible endpoint.
response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

โš ๏ธ Pitfalls & Notes

โš ๏ธ
Key Format โ€” Key format is cpk_<id>.<user_id_hex>.<secret>. Dots are REQUIRED. Flat hex keys get 401.
โš ๏ธ
Authorization Header โ€” MUST use Authorization: Bearer. X-API-Key returns 502/429 on subdomain routes (silently ignored).
๐Ÿ”’
TEE Privacy โ€” TEE models process data in hardware-based trusted execution environments. Even Chutes cannot read your data.
โš ๏ธ
Public Models List โ€” /v1/models is public (returns 200 without auth). Don't assume your key works just because models load.
๐Ÿ’ก
Image Gen โ€” Image gen uses subdomain invocation: POST https://{slug}.chutes.ai/generate with Bearer auth. FLUX, JuggernautXL, DreamShaper, and Hunyuan tested and working.
โš ๏ธ
Cold Starts โ€” Some models (4/8 images, TTS, utility) return 502 due to cold starts. LLM models with ๐Ÿ”ฅ Hot badge are reliable.
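
Two of the notes above lend themselves to small client-side guards: checking the key's shape before sending anything, and retrying when a cold start returns 502. The sketch below is one possible approach, not a Chutes-provided utility. The regex assumes the `<id>` and `<secret>` parts contain no dots and that the middle part is hex, and the retry count and delays are arbitrary choices.

```python
import re
import time

# Shape from the Key Format note: cpk_<id>.<user_id_hex>.<secret>
# Assumption: <id> and <secret> contain no dots; <user_id_hex> is hexadecimal.
KEY_PATTERN = re.compile(r"cpk_[^.]+\.[0-9a-fA-F]+\.[^.]+")


def looks_like_chutes_key(key: str) -> bool:
    """Return True if the key has the three dot-separated parts described above."""
    return KEY_PATTERN.fullmatch(key) is not None


def retry_on_502(send, attempts: int = 4, base_delay: float = 2.0, sleep=time.sleep):
    """Call send() until it returns a status other than 502 or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    Delays double after each 502: base_delay, 2*base_delay, 4*base_delay, ...
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if status != 502 or attempt == attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)


# Demo with a fake sender that warms up on the third call.
fake_responses = iter([(502, b""), (502, b""), (200, b"ok")])
delays = []
result = retry_on_502(lambda: next(fake_responses), sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the helper testable without real waiting. In real use, `send` would wrap the actual HTTP call and return its status code and body.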

๐Ÿท๏ธ Categories

Chat Reasoning Coding Image Gen TTS Embedding