🦙

Ollama Cloud

Speed: ⚡ 51 tok/s avg
Models: 39
Price: Unlimited
Status: ✅ Online
Best For: DS-V4-Pro, Exclusives

💰 Plan & Pricing

✅ Unlimited Access
39 models with no rate limits. Both native Ollama API and OpenAI-compatible API supported. Many exclusive models not available elsewhere.

🔑 API Key

57eeb6f594fd4cd5a3ee58f7f280213e.5Ct3aDAxw-cuA0pUkUB67ggh

🌐 Endpoints

# OpenAI-compatible (recommended)
https://ollama.com/v1/chat/completions

# Native Ollama API
https://ollama.com/api/chat
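
The two endpoints take slightly different request bodies and return differently shaped responses. The following is a minimal sketch of a call to the native endpoint, not a definitive client. It assumes the cloud endpoint behaves like the open-source Ollama `/api/chat` API: that it accepts `"stream": false` and returns the reply under `message.content`. The helper names and the `OLLAMA_API_KEY` environment variable are hypothetical.

```python
import json
import os
import urllib.request

BASE_URL = "https://ollama.com"


def build_native_chat_request(model, prompt, api_key):
    """Build (but do not send) a POST request for the native /api/chat endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # assumption: request one JSON object, not a stream
    }
    return urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_native_reply(payload):
    """Pull the assistant text out of a native-API response (assumed shape)."""
    return payload["message"]["content"]


if __name__ == "__main__":
    api_key = os.environ.get("OLLAMA_API_KEY")  # hypothetical variable name
    if api_key:
        request = build_native_chat_request("deepseek-v4-pro", "Hello!", api_key)
        with urllib.request.urlopen(request) as response:
            print(extract_native_reply(json.load(response)))
```

With the OpenAI-compatible endpoint, the same reply would instead be read from `choices[0].message.content`, as in the Python example further down.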

📦 Models (39 total)

| Model | Speed | Exclusive? | Notes |
|---|---|---|---|
| nemotron-3-super | ⚡ 102 tok/s | ✅ Exclusive | Best for this model |
| gpt-oss:20b | ⚡ 107 tok/s | No | Also on Groq |
| minimax-m2.1 | ⚡ 83 tok/s | ✅ Exclusive | Only here |
| ministral-3:3b | ⚡ 83 tok/s | ✅ Exclusive | |
| qwen3-next:80b | ⚡ 82 tok/s | ✅ Exclusive | Qwen3 variant |
| gemma3:4b | ⚡ 75 tok/s | No | |
| nemotron-3-nano:30b | ⚡ 74 tok/s | ✅ Exclusive | |
| qwen3.5:397b | ⚡ 72 tok/s | ✅ Exclusive | Massive Qwen3.5 |
| gpt-oss:120b | ⚡ 71 tok/s | No | Also Groq |
| rnj-1:8b | ⚡ 68 tok/s | ✅ Exclusive | |
| glm-5.1 | ⚡ 63 tok/s | No | 2nd best for this |
| kimi-k2.5 | ⚡ 58 tok/s | No | |
| gemma3:12b | ⚡ 56 tok/s | No | |
| ministral-3:8b | ⚡ 56 tok/s | No | |
| glm-4.6 | ⚡ 55 tok/s | ✅ Best | Best for this model |
| qwen3-coder-next | ⚡ 52 tok/s | ✅ Exclusive | |
| glm-4.7 | ⚡ 51 tok/s | ✅ Best | Best for this model |
| deepseek-v4-pro | ⚡ 50 tok/s | No | 🥇 Best DS-V4-Pro |
| ministral-3:14b | ⚡ 49 tok/s | No | |
| gemini-3-flash-preview | ⚡ 47 tok/s | ✅ Exclusive | |
| cogito-2.1:671b | ⚡ 45 tok/s | ✅ Exclusive | |
| deepseek-v4-flash | ⚡ 44 tok/s | No | |
| minimax-m2 | ⚡ 43 tok/s | No | |
| gemma-3-27b | ⚡ 39 tok/s | No | Gemma 3 27B |
| devstral-small-2:24b | ⚡ 37 tok/s | ✅ Exclusive | |
| devstral-2:123b | ⚡ 36 tok/s | ✅ Exclusive | |
| llama-4-maverick | ⚡ 35 tok/s | No | Llama 4 Maverick |
| llama-4-scout | ⚡ 38 tok/s | No | Llama 4 Scout |
| deepseek-r1-0528 | ⚡ 32 tok/s | No | DeepSeek R1 |
| glm-5 | ⚡ 32 tok/s | No | |
| gemma4:31b | ⚡ 31 tok/s | No | |
| qwen3-vl:235b-instruct | ⚡ 31 tok/s | ✅ Exclusive | |
| qwen3-coder:480b | ⚡ 25 tok/s | ✅ Exclusive | Massive coder |
| llama-4-behemoth | ⚡ 22 tok/s | No | Llama 4 Behemoth |
| deepseek-v3.2 | ⚡ 23 tok/s | No | |
| qwen3-vl:235b | ⚡ 20 tok/s | ✅ Exclusive | |
| minimax-m2.7 | ⚡ 18 tok/s | No | |
| kimi-k2.6 | ⚡ 18 tok/s | No | |
| mistral-large-3:675b | 🐢 11 tok/s | ✅ Exclusive | |
| kimi-k2:1t | 🐢 11 tok/s | ✅ Exclusive | 1T parameter model |
| deepseek-v3.1:671b | 🐢 10 tok/s | ✅ Exclusive | |
| minimax-m2.5 | 🐢 4 tok/s | No | Best on OpenCode |
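
Since speeds in the table range from 4 to over 100 tok/s, it can help to filter candidates by a throughput floor before choosing a model. The sketch below is illustrative only: `MODELS` holds a handful of rows copied from the table, and `pick_models` is a hypothetical helper, not part of any Ollama SDK.

```python
# (model ID, average tok/s, exclusive?) copied from a few rows of the table
MODELS = [
    ("nemotron-3-super", 102, True),
    ("gpt-oss:20b", 107, False),
    ("qwen3-coder-next", 52, True),
    ("deepseek-v4-pro", 50, False),
    ("qwen3-coder:480b", 25, True),
    ("kimi-k2:1t", 11, True),
]


def pick_models(models, min_speed=0, exclusive_only=False):
    """Return model IDs that meet the speed floor, fastest first."""
    matches = [
        (name, speed)
        for name, speed, exclusive in models
        if speed >= min_speed and (exclusive or not exclusive_only)
    ]
    matches.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in matches]


print(pick_models(MODELS, min_speed=50))
# ['gpt-oss:20b', 'nemotron-3-super', 'qwen3-coder-next', 'deepseek-v4-pro']
print(pick_models(MODELS, min_speed=50, exclusive_only=True))
# ['nemotron-3-super', 'qwen3-coder-next']
```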

💻 cURL Example

# Assumes the API key is exported as OLLAMA_API_KEY (the variable name is an assumption)
curl -X POST https://ollama.com/v1/chat/completions \
  -H "Authorization: Bearer $OLLAMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-v4-pro",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

🐍 Python Example

import os

from openai import OpenAI

# Read the API key from the environment (OLLAMA_API_KEY is an assumed variable name)
client = OpenAI(
    api_key=os.environ["OLLAMA_API_KEY"],
    base_url="https://ollama.com/v1"
)

response = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

⚠️ Pitfalls & Notes

💡 Dual API Support: Ollama Cloud supports both the native Ollama API (/api/chat) and the OpenAI-compatible API (/v1/chat/completions). Use the OpenAI-compatible endpoint when working with standard OpenAI SDKs.
⚠️ Model IDs with Colons: Some model IDs contain colons (e.g., gpt-oss:120b, gemma3:4b). Pass the full ID, colon included, as the model string, and check that any tooling that parses or URL-encodes model names leaves it intact.
💡 17 Exclusive Models: Ollama Cloud has 17 exclusive models not available on other providers, including nemotron-3-super, qwen3.5:397b, cogito-2.1:671b, and qwen3-coder:480b.
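
To illustrate the note on colons in model IDs: in a JSON request body the colon needs no special treatment, so the chat endpoints above receive the ID unchanged. Trouble is more likely if an ID is interpolated into a URL path or query string, where it should be percent-encoded. That scenario is an assumption for illustration; the snippet uses only the Python standard library.

```python
import json
from urllib.parse import quote

model_id = "gpt-oss:120b"

# Inside a JSON body the colon is an ordinary character.
body = json.dumps({"model": model_id})
print(body)  # {"model": "gpt-oss:120b"}

# Inside a URL path segment or query value, percent-encode it.
encoded = quote(model_id, safe="")
print(encoded)  # gpt-oss%3A120b
```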

🏷️ Categories

Chat Coding Vision Audio