๐Ÿ”ฎ

Infermatic

Speed: โšก 38 tok/s avg
Models: 19
Price: $20/mo flat
Status: โœ… Online
Avg Speed
38 tok/s
Models
19
Price
$20/mo flat
Best For
TTS & Embeddings

๐Ÿ’ฐ Plan & Pricing

โ„น๏ธ
$20/Month Flat Rate
Unlimited requests. Includes TTS (Kokoro-82M) and embeddings (multilingual-e5-base) โ€” no surprise bills.

๐Ÿ”‘ API Key

sk-TZv...Q5Jp

๐ŸŒ Endpoint

https://api.totalgpt.ai/v1/chat/completions

๐Ÿ“ฆ Models (19 total โ€” Chat, Coding, TTS, Embeddings)

ModelSpeedCategoryNotes
gemini-2.5-flashโšก 48 tok/sChatFast Gemini
gpt-4.1-miniโšก 52 tok/sChatGPT-4.1 Mini โ€” fast & cheap
gpt-4.1โšก 40 tok/sChatGPT-4.1
gpt-4oโšก 38 tok/sChatGPT-4o
claude-sonnet-4-5โšก 38 tok/sChatClaude Sonnet 4.5
claude-3.5-sonnetโšก 36 tok/sChatClaude 3.5 Sonnet
deepseek-r1โšก 35 tok/sReasoningDeepSeek R1
qwen3-235bโšก 34 tok/sChat/CodingQwen3 235B
deepseek-v3โšก 37 tok/sChatDeepSeek V3
llama-4-maverickโšก 36 tok/sChatLlama 4 Maverick
deepseek-v4โšก 33 tok/sChatDeepSeek V4
llama-3.3-70b-versatileโšก 35 tok/sChatLlama 3.3 70B
mixtral-8x7b-32768โšก 30 tok/sChatMixtral MoE
gemma-3-27bโšก 32 tok/sChatGemma 3 27B
qwen3-32bโšก 38 tok/sChatQwen3 32B
glm-5โšก 30 tok/sChatGLM-5
moonshot-visionโšก 25 tok/sVisionVision model
kokoro-82m๐Ÿ”Š N/ATTS๐Ÿ”Š Kokoro text-to-speech
chat-tts๐Ÿ”Š N/ATTS๐Ÿ”Š ChatTTS voice synthesis
multilingual-e5-base๐Ÿ“ N/AEmbedding๐Ÿ“ Multilingual embeddings

๐Ÿ’ป cURL Example

curl -X POST https://api.totalgpt.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-TZv...Q5Jp" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

๐Ÿ Python Example

from openai import OpenAI

client = OpenAI(
    api_key="sk-TZv...Q5Jp",
    base_url="https://api.totalgpt.ai/v1"
)

response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)

โš ๏ธ Pitfalls & Notes

โ„น๏ธ
Includes TTS (Kokoro-82M) โ€” Rare among providers. Text-to-speech available at no extra cost.
โ„น๏ธ
Includes Embeddings (multilingual-e5-base) โ€” Embedding models included in the flat rate, great for RAG pipelines.
โ„น๏ธ
Flat Rate = No Surprise Bills โ€” $20/mo unlimited requests means predictable costs regardless of usage.

๐Ÿท๏ธ Categories

Chat Coding TTS Embeddings