Chutes

💰 Plan & Pricing

ℹ️

PRO Plan — $20/mo
Spending is capped at $100/mo, with a 5,000-request quota and a 4-hour rolling cap of $8.33. TEE (Trusted Execution Environment) models keep your data private, even from Chutes.

🔑 API Key

cpk_<id>.<user_id_hex>.<secret>
Keep the real key out of shared documents; store it in an environment variable such as CHUTES_API_KEY (variable name assumed here).

🌐 Endpoints

# Inference (OpenAI-compatible)
POST https://llm.chutes.ai/v1/chat/completions

# Image Generation
POST https://{slug}.chutes.ai/generate

# Embeddings
POST https://llm.chutes.ai/v1/embeddings

# Management API
https://api.chutes.ai

📦 LLM Models (20 public, 14 listed below)

Model	TEE	Hot	Speed	Price ($/1M tokens, input/output)
Qwen/Qwen3-32B-TEE	✅	🔥	192 tok/s	$0.08/$0.24
Qwen/Qwen3-235B-A22B-Thinking-2507	❌	🔥	50 tok/s	$0.11/$0.60
Qwen/Qwen3.5-397B-A17B-TEE	✅	🔥	—	$0.39/$2.34
Qwen/Qwen3.6-27B-TEE	✅	🔥	—	$0.50/$2.00
Qwen/Qwen2.5-Coder-32B-Instruct-TEE	✅	🔥	17 tok/s	$0.024/$0.10
deepseek-ai/DeepSeek-V3.2-TEE	✅	🔥	—	$0.28/$0.42
moonshotai/Kimi-K2.5-TEE	✅	🔥	28 tok/s	$0.44/$2.00
moonshotai/Kimi-K2.6-TEE	✅	🔥	38 tok/s	$0.74/$3.50
zai-org/GLM-5-TEE	✅	🔥	—	$0.95/$2.55
zai-org/GLM-5.1-TEE	✅	🔥	—	$1.05/$3.50
zai-org/GLM-5-Turbo	❌	🔥	25 tok/s	$0.49/$1.96
google/gemma-4-31B-turbo-TEE	✅	🔥	20 tok/s	$0.13/$0.38
MiniMaxAI/MiniMax-M2.5-TEE	✅	🔥	—	$0.15/$1.20
unsloth/Mistral-Nemo-Instruct-2407-TEE	✅	🔥	28 tok/s	$0.024/$0.10
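
The price column can be turned into a per-request estimate with simple arithmetic. The sketch below assumes the two figures in each row are the input and output prices per million tokens, which the table does not state outright; `estimate_cost_usd` is a hypothetical helper, not part of any Chutes SDK.

```python
def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float,
    output_price_per_m: float,
) -> float:
    """Estimate the cost of one request in USD.

    Assumes prices are quoted per 1,000,000 tokens, input first and
    output second (an assumption about the table's Price column).
    """
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Qwen/Qwen3-32B-TEE is listed at $0.08/$0.24.
cost = estimate_cost_usd(2_000, 500, 0.08, 0.24)
print(f"${cost:.6f}")  # prints $0.000280
```

Dividing the $8.33 rolling cap by a typical per-request cost gives a rough idea of how many such requests fit in one 4-hour window.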

🔊 TTS Model

Model	Type	Notes
Kokoro-82M	TTS	Text-to-speech generation

📐 Embedding Model

Model	Type	Notes
Qwen3-Embedding-8B-TEE	Embedding	✅ TEE-protected embeddings
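
A minimal sketch of calling the embeddings endpoint with only the standard library. It assumes the endpoint accepts the usual OpenAI-style body (`model` plus `input`) and that the model ID is exactly as listed in the table above; the real ID may carry an organization prefix. `build_embeddings_request` is a hypothetical helper.

```python
import json
import urllib.request

EMBEDDINGS_URL = "https://llm.chutes.ai/v1/embeddings"


def build_embeddings_request(api_key, texts, model="Qwen3-Embedding-8B-TEE"):
    """Build (but do not send) a POST request for the embeddings endpoint.

    The body shape follows the OpenAI embeddings convention, which is an
    assumption here; the model ID is copied from the table as-is.
    """
    payload = {"model": model, "input": texts}
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (performs a network call, so it is left commented out):
# import os
# req = build_embeddings_request(os.environ["CHUTES_API_KEY"], ["hello world"])
# with urllib.request.urlopen(req, timeout=30) as resp:
#     vectors = [item["embedding"] for item in json.load(resp)["data"]]
```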

🖼️ Image Generation (4 working)

Model	Slug	Status	Type
FLUX.1-schnell	chutes-flux-1-schnell	✅ WORKS	JPEG
JuggernautXL-Ragnarok	chutes-juggernautxl-ragnarok	✅ WORKS	JPEG
DreamShaper XL 1.0	chutes-lykon-dreamshaper-xl-1-0	✅ WORKS	JPEG
hunyuan-image-3	chutes-hunyuan-image-3	✅ WORKS	PNG

💻 cURL — Chat

curl -X POST https://llm.chutes.ai/v1/chat/completions \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen/Qwen3-235B-A22B-Thinking-2507",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

💻 cURL — Image Generation

curl -X POST https://chutes-flux-1-schnell.chutes.ai/generate \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a beautiful sunset over mountains",
    "width": 512,
    "height": 512,
    "num_inference_steps": 4,
    "guidance_scale": 7.5
  }'
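
The same image request can be sketched in Python. This assumes the response body is the raw image (the Type column in the image table suggests JPEG or PNG, but the response format is not confirmed here); `build_image_request` is a hypothetical helper, and the extra parameters simply mirror the cURL call above.

```python
import json
import urllib.request


def build_image_request(api_key, slug, prompt, **params):
    """Build (but do not send) a POST request to https://{slug}.chutes.ai/generate."""
    url = f"https://{slug}.chutes.ai/generate"
    payload = {"prompt": prompt, **params}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Usage (performs a network call, so it is left commented out):
# import os
# req = build_image_request(
#     os.environ["CHUTES_API_KEY"],
#     "chutes-flux-1-schnell",
#     "a beautiful sunset over mountains",
#     width=512, height=512, num_inference_steps=4, guidance_scale=7.5,
# )
# with urllib.request.urlopen(req, timeout=120) as resp:
#     with open("sunset.jpg", "wb") as fh:  # assumes raw image bytes
#         fh.write(resp.read())
```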

🐍 Python Example

import os

from openai import OpenAI

# Read the key from the environment instead of hardcoding it.
client = OpenAI(
    api_key=os.environ["CHUTES_API_KEY"],
    base_url="https://llm.chutes.ai/v1",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-235B-A22B-Thinking-2507",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

⚠️ Pitfalls & Notes

⚠️

Key Format: Keys have the form cpk_<id>.<user_id_hex>.<secret>. The dots are REQUIRED; flat hex keys get 401.
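
A quick local check can catch a malformed key before any request is sent. The sketch below tests only what the note above states (a `cpk_` prefix and three non-empty dot-separated parts); it makes no assumption about which characters each part may contain, and `looks_like_chutes_key` is a hypothetical helper.

```python
def looks_like_chutes_key(key: str) -> bool:
    """Return True if the key has the shape cpk_<id>.<user_id_hex>.<secret>.

    This is a shape check only; it cannot tell whether the key is valid.
    """
    prefix = "cpk_"
    if not key.startswith(prefix):
        return False
    parts = key[len(prefix):].split(".")
    return len(parts) == 3 and all(parts)


print(looks_like_chutes_key("cpk_abc.def.ghi"))  # True
print(looks_like_chutes_key("abcdef0123456789"))  # False
```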

⚠️

Authorization Header: Send the key as Authorization: Bearer <key>. The X-API-Key header is silently ignored on subdomain routes, so requests that use it fail with 502 or 429 instead of a clear auth error.

🔒

TEE Privacy — TEE models process data in hardware-based trusted execution environments. Even Chutes cannot read your data.

⚠️

Public Models List: /v1/models is public and returns 200 without auth, so a successful model listing does not prove your key works. Verify the key with an authenticated request, such as a chat completion.
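
One way to verify a key is to send a tiny authenticated chat request and treat 401 as a rejection. This is a sketch under assumptions: `key_is_accepted` is a hypothetical helper, and `max_tokens` is the standard OpenAI parameter, assumed to be honored by this endpoint. The `urlopen` argument exists only so the function can be exercised without network access.

```python
import json
import urllib.error
import urllib.request

CHAT_URL = "https://llm.chutes.ai/v1/chat/completions"


def key_is_accepted(api_key, model, urlopen=urllib.request.urlopen):
    """Return True on HTTP 200, False on 401; re-raise any other HTTP error."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 1,  # keep the probe cheap (assumed to be supported)
    }
    req = urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    try:
        with urlopen(req, timeout=30) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise  # 429, 502, etc. say nothing about key validity


# Usage (performs a network call, so it is left commented out):
# import os
# print(key_is_accepted(os.environ["CHUTES_API_KEY"], "Qwen/Qwen3-32B-TEE"))
```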

💡

Image Gen: Each image model is invoked on its own subdomain with POST https://{slug}.chutes.ai/generate and Bearer auth. FLUX.1-schnell, JuggernautXL-Ragnarok, DreamShaper XL 1.0, and hunyuan-image-3 were tested and work.

⚠️

Cold Starts: Some models (4 of the 8 image models, TTS, and utility models) return 502 while they cold-start. LLM models with the 🔥 Hot badge are reliable.
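
A common way to ride out cold starts is to retry on 502 with exponential backoff. The sketch below is a generic pattern, not something Chutes documents: the attempt count and delays are arbitrary assumptions, and `call_with_cold_start_retry` is a hypothetical helper that wraps any zero-argument callable raising `urllib.error.HTTPError`.

```python
import time
import urllib.error


def call_with_cold_start_retry(call, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Run call(); on HTTP 502, wait and retry with exponential backoff.

    Delays are base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    Any other HTTP error, or a 502 on the final attempt, is re-raised.
    """
    for attempt in range(attempts):
        try:
            return call()
        except urllib.error.HTTPError as err:
            if err.code != 502 or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Usage (performs a network call, so it is left commented out):
# import urllib.request
# result = call_with_cold_start_retry(lambda: urllib.request.urlopen(req, timeout=120).read())
```

Here `req` stands for any prepared `urllib.request.Request`. Retrying only on 502 keeps auth failures (401) and rate limits (429) visible instead of masking them.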

🏷️ Categories

Chat, Reasoning, Coding, Image Gen, TTS, Embedding
