🪶

Featherless

Speed: ⚡ 9 tok/s avg

Models: 16K+

Price: Scale ×3

Status: ⚠️ Very Slow

Best For: Obscure Models

💰 Plan & Pricing

⚠️

Scale ×3 Pricing — Very Slow
Scale ×3 pricing means you pay 3× the base model cost, so costs multiply quickly. Throughput averages 9 tok/s, and queue times of 30-50 s are common. Best used as a last resort for models not available elsewhere.
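
As a rough illustration of the multiplier, the sketch below assumes "Scale ×3" means exactly three times the base model cost, as the Pitfalls section states. The helper name `scaled_cost` and the base price in the example are hypothetical placeholders, not real Featherless rates.

```python
SCALE_MULTIPLIER = 3  # taken from the "Scale ×3" label on this page


def scaled_cost(base_cost: float, multiplier: float = SCALE_MULTIPLIER) -> float:
    """Return the effective cost after applying the provider multiplier."""
    if base_cost < 0:
        raise ValueError("base_cost must be non-negative")
    return base_cost * multiplier


if __name__ == "__main__":
    # Hypothetical base cost of 2.00 (any currency unit) for some workload.
    print(scaled_cost(2.00))  # 6.0
```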

🔑 API Key

rc_96ff...c073b

🌐 Endpoint

https://api.featherless.ai/v1/chat/completions

📦 Models (16K+ — Key Models)

Model	Speed	Category	Notes
llama-4-maverick	⚡ 10 tok/s	Chat	Llama 4 Maverick
deepseek-r1	⚡ 8 tok/s	Reasoning	DeepSeek R1
qwen3-235b	⚡ 9 tok/s	Chat/Coding	Qwen3 235B
deepseek-v3	⚡ 8 tok/s	Chat	DeepSeek V3
llama-3.1-70b-instruct	⚡ 10 tok/s	Chat	Llama 3.1 70B
llama-3.1-8b-instruct	⚡ 12 tok/s	Chat	Llama 3.1 8B
mistral-7b-instruct	⚡ 11 tok/s	Chat	Mistral 7B
qwen2.5-coder-32b-instruct	⚡ 9 tok/s	Coding	Qwen Coder
various 16K+ models	⚡ ~9 tok/s avg	All	Chat, Coding, Vision, Reasoning, Image Gen, Audio

⚠️ Very slow (9 tok/s avg, 30-50 s queue). Best used as a fallback for obscure models. Categories: Chat, Coding, Vision, Reasoning, Image Gen, Audio.
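
To get a feel for what these numbers mean in practice, the following sketch estimates total wait time from the figures above (9 tok/s average, 30-50 s queue). It is a back-of-the-envelope model only: `estimate_wait_seconds` is a hypothetical helper, and it ignores prompt processing and network overhead.

```python
from typing import Tuple


def estimate_wait_seconds(
    output_tokens: int,
    tokens_per_second: float = 9.0,                   # average speed quoted on this page
    queue_range: Tuple[float, float] = (30.0, 50.0),  # queue time quoted on this page
) -> Tuple[float, float]:
    """Return a (low, high) estimate of seconds until the full reply arrives."""
    if output_tokens < 0 or tokens_per_second <= 0:
        raise ValueError("output_tokens must be >= 0 and tokens_per_second > 0")
    generation = output_tokens / tokens_per_second
    queue_low, queue_high = queue_range
    return queue_low + generation, queue_high + generation


if __name__ == "__main__":
    low, high = estimate_wait_seconds(500)
    print(f"{low:.0f}-{high:.0f} s")  # 86-106 s
```

Under these assumptions, a 500-token reply takes roughly a minute and a half, most of it spent generating rather than queuing.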

💻 cURL Example

curl -X POST https://api.featherless.ai/v1/chat/completions \
  -H "Authorization: Bearer rc_96ff...c073b" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

🐍 Python Example

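from openai import OpenAI

# Point the OpenAI client at the Featherless endpoint instead of the default.
client = OpenAI(
    api_key="rc_96ff...c073b",
    base_url="https://api.featherless.ai/v1"
)

# Send a single-turn chat request.
response = client.chat.completions.create(
    model="meta-llama/llama-3-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)
# Print the text of the first (and only) returned choice.
print(response.choices[0].message.content)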
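
Because replies can take well over a minute, it may help to set an explicit, generous timeout. The sketch below is one way to do that with only the standard library. The function names and the 180-second default are assumptions chosen for illustration, and the response parsing assumes the same OpenAI-style shape the example above reads from.

```python
import json
import urllib.request

ENDPOINT = "https://api.featherless.ai/v1/chat/completions"


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a single-turn chat."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_seconds: float = 180.0) -> str:
    """Send the request and return the reply text.

    The 180 s default is an assumption: queue time plus slow generation.
    """
    with urllib.request.urlopen(request, timeout=timeout_seconds) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `send(build_chat_request(key, model, "Hello!"))` performs the actual network call; building the request separately makes it easy to inspect or log before sending.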

⚠️ Pitfalls & Notes

🚨 Very Slow: 9 tok/s avg with 30-50 second queue times. Expect long waits before responses begin.

⚠️ Scale ×3 Pricing: you pay 3× the base model cost, so costs multiply quickly.

ℹ️ Largest Model Catalog: at 16,000+ models, the largest catalog of any provider.

ℹ️ Fallback for Obscure Models: useful for obscure or niche models not available elsewhere.
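
The last-resort role described above can be sketched as a simple routing rule: prefer any other provider that lists the model, and use Featherless only when none does. Everything here is hypothetical, including `pick_provider`, the provider names, and the catalog contents; it shows the ordering logic, not a real integration.

```python
from typing import Dict, Set


def pick_provider(
    model: str,
    catalogs: Dict[str, Set[str]],
    last_resort: str = "featherless",
) -> str:
    """Return the first non-last-resort provider listing `model`, else the last resort."""
    for name, models in catalogs.items():
        if name != last_resort and model in models:
            return name
    if model in catalogs.get(last_resort, set()):
        return last_resort
    raise LookupError(f"no provider lists model {model!r}")


if __name__ == "__main__":
    # Hypothetical catalogs for illustration only.
    catalogs = {
        "featherless": {"llama-3.1-8b-instruct", "niche-model-x"},
        "fast-provider": {"llama-3.1-8b-instruct"},
    }
    print(pick_provider("llama-3.1-8b-instruct", catalogs))  # fast-provider
    print(pick_provider("niche-model-x", catalogs))          # featherless
```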

🏷️ Categories

Chat, Long-tail Models, Research