🪶 Featherless
Massive model catalog: 16,000+ models, but very slow inference
| Avg Speed | Models | Price | Best For |
|---|---|---|---|
| 9 tok/s | 16K+ | Scale ×3 | Obscure Models |
💰 Plan & Pricing
Scale ×3 Pricing: Very Slow
Scale ×3 pricing means costs multiply quickly. Inference is very slow, averaging 9 tok/s, and queue times of 30-50 s are common. Best used as a last resort for models not available elsewhere.
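To make the multiplier concrete, here is a minimal sketch of the arithmetic. It assumes, as the Pitfalls section states, that Scale ×3 means paying 3× the base model cost. The base cost of 10.0 is a made-up placeholder, not a Featherless rate, and `scaled_cost` is a hypothetical helper.

```python
SCALE_MULTIPLIER = 3  # assumption: "Scale x3" means 3x the base model cost


def scaled_cost(base_cost: float, multiplier: int = SCALE_MULTIPLIER) -> float:
    """Return the effective cost after applying the plan multiplier."""
    if base_cost < 0:
        raise ValueError("base_cost must be non-negative")
    return base_cost * multiplier


# Hypothetical base cost of 10.0 (any currency unit) for some workload.
print(scaled_cost(10.0))  # 30.0
```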
🔑 API Key
🔗 Endpoint
https://api.featherless.ai/v1/chat/completions
📦 Models (16K+, Key Models)
| Model | Speed | Category | Notes |
|---|---|---|---|
| llama-4-maverick | ⚡ 10 tok/s | Chat | Llama 4 Maverick |
| deepseek-r1 | ⚡ 8 tok/s | Reasoning | DeepSeek R1 |
| qwen3-235b | ⚡ 9 tok/s | Chat/Coding | Qwen3 235B |
| deepseek-v3 | ⚡ 8 tok/s | Chat | DeepSeek V3 |
| llama-3.1-70b-instruct | ⚡ 10 tok/s | Chat | Llama 3.1 70B |
| llama-3.1-8b-instruct | ⚡ 12 tok/s | Chat | Llama 3.1 8B |
| mistral-7b-instruct | ⚡ 11 tok/s | Chat | Mistral 7B |
| qwen2.5-coder-32b-instruct | ⚡ 9 tok/s | Coding | Qwen Coder |
| various 16K+ models | ⚡ ~9 tok/s avg | All | Chat, Coding, Vision, Reasoning, Image Gen, Audio |
⚠️ Very slow (9 tok/s avg, 30-50 s queue). Best as a fallback for obscure models. Categories: Chat, Coding, Vision, Reasoning, Image Gen, Audio.
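As a rough illustration of what these figures mean in practice, the sketch below adds the quoted queue range to the generation time at the quoted average speed. `estimate_latency` is a hypothetical helper, and the 450-token reply length is an arbitrary example; real latency will vary by model and load.

```python
from typing import Tuple

AVG_TOKENS_PER_SEC = 9.0      # average speed quoted above
QUEUE_SECONDS = (30.0, 50.0)  # typical queue range quoted above


def estimate_latency(
    output_tokens: int,
    tokens_per_sec: float = AVG_TOKENS_PER_SEC,
    queue_range: Tuple[float, float] = QUEUE_SECONDS,
) -> Tuple[float, float]:
    """Return (best, worst) total seconds: queue wait plus generation time."""
    if output_tokens < 0 or tokens_per_sec <= 0:
        raise ValueError("output_tokens must be >= 0 and tokens_per_sec > 0")
    generation = output_tokens / tokens_per_sec
    low, high = queue_range
    return low + generation, high + generation


# 450 tokens / 9 tok/s = 50 s of generation on top of a 30-50 s queue.
best, worst = estimate_latency(450)
print(f"{best:.0f}-{worst:.0f} s")  # 80-100 s
```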
💻 cURL Example
curl -X POST https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer rc_96ff...c073b" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
🐍 Python Example
from openai import OpenAI

client = OpenAI(
    api_key="rc_96ff...c073b",
    base_url="https://api.featherless.ai/v1",
)

response = client.chat.completions.create(
    model="meta-llama/llama-3-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
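Given the 30-50 s queue times, a client timeout shorter than the queue plus generation time will likely fail before the first token arrives. The sketch below builds the same request as the cURL example using only the standard library and sets a generous timeout. The 180 s value and the `FEATHERLESS_API_KEY` environment variable name are assumptions for illustration, and the response shape is taken from the Python example above.

```python
import json
import os
import urllib.request

ENDPOINT = "https://api.featherless.ai/v1/chat/completions"
TIMEOUT_SECONDS = 180  # assumption: 50 s queue plus slow generation, with headroom


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request as the cURL example."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, model: str, prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_request(api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
        payload = json.load(response)
    return payload["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # FEATHERLESS_API_KEY is a hypothetical environment variable name.
    key = os.environ.get("FEATHERLESS_API_KEY")
    if key:
        print(chat(key, "meta-llama/llama-3-70b-instruct", "Hello!"))
```

Reading the key from the environment keeps it out of source files; the network call only runs when that variable is set.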
⚠️ Pitfalls & Notes
Very Slow: 9 tok/s on average, with 30-50 second queue times. Expect long waits before responses begin.
Scale ×3 Pricing: You pay 3× the base model cost, so costs multiply quickly.
Largest Model Catalog: At 16,000+ models, this is the largest catalog of any provider.
Fallback for Obscure Models: Useful as a fallback for obscure or niche models not available elsewhere.
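The fallback pattern described above can be sketched as an ordered list of providers, with Featherless placed last. Everything here is hypothetical: `ModelUnavailable`, `complete_with_fallback`, and the two stand-in provider functions are illustrative names, not part of any real SDK. In practice each callable would wrap a real API client.

```python
from typing import Callable, Sequence, Tuple

Provider = Callable[[str, str], str]  # (model, prompt) -> reply text


class ModelUnavailable(Exception):
    """Raised by a provider that does not serve the requested model."""


def complete_with_fallback(
    model: str,
    prompt: str,
    providers: Sequence[Tuple[str, Provider]],
) -> Tuple[str, str]:
    """Try providers in order; return (name, reply) from the first that serves the model."""
    for name, call in providers:
        try:
            return name, call(model, prompt)
        except ModelUnavailable:
            continue  # this provider lacks the model, try the next one
    raise ModelUnavailable(f"no provider serves {model!r}")


# Hypothetical stand-ins for real API clients.
def fast_provider(model: str, prompt: str) -> str:
    if model != "llama-3.1-8b-instruct":
        raise ModelUnavailable(model)
    return f"[fast] {prompt}"


def featherless(model: str, prompt: str) -> str:
    return f"[featherless] {prompt}"


PROVIDERS = [("fast", fast_provider), ("featherless", featherless)]
print(complete_with_fallback("some-obscure-model", "Hello!", PROVIDERS))
# ('featherless', '[featherless] Hello!')
```

Keeping the slow provider last means its queue time is only paid when no faster provider serves the model.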
🏷️ Categories
Chat
Long-tail Models
Research