Groq pricing · auto-updated daily

Groq API pricing — June 12, 2026

All 8 Groq text models with token prices ($ per 1M), context window and max output, refreshed daily from the official pricing source. Bring your Groq key to flo2 and you pay these exact prices — zero markup — with fallback, racing and per-request cost accounting on top.

Cheapest output: $0.08/M (llama-3.1-8b-instant)
Cheapest input: $0.05/M (llama-3.1-8b-instant)
Biggest context: 131K tokens (qwen/qwen3-32b)
Models: 8, tracked daily
Model                                      Context  Max out  In $/M   Cached in  Out $/M  Reasoning
moonshotai/kimi-k2-instruct-0905           -        -        $1       $0.5       $3       -
llama-3.3-70b-versatile                    128K     -        $0.59    -          $0.79    -
openai/gpt-oss-120b                        128K     -        $0.15    $0.075     $0.6     -
qwen/qwen3-32b                             131K     -        $0.29    -          $0.59    -
meta-llama/llama-4-scout-17b-16e-instruct  128K     -        $0.11    -          $0.34    -
openai/gpt-oss-20b                         128K     -        $0.075   $0.0375    $0.3     -
openai/gpt-oss-safeguard-20b               -        -        $0.075   -          $0.3     -
llama-3.1-8b-instant                       128K     -        $0.05    -          $0.08    -

A dash marks a value that is not listed.

Prices in USD per 1,000,000 tokens, fetched 2026-06-12 from the official Groq pricing source. Verify before large commitments.
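Because every price is quoted per 1,000,000 tokens, the cost of a single request is the token count times the rate, divided by one million. The sketch below works through that arithmetic for three models from the table. The `PRICES` dictionary and the `request_cost` helper are hypothetical names made up for illustration, and the handling of cached input (cached tokens counted as a subset of input tokens and billed at the cached rate) is an assumption, not a statement of how Groq bills.

```python
# Prices in USD per 1,000,000 tokens, copied from the table (fetched 2026-06-12).
# Only a few models are shown; "cached_in" is present only where the table lists it.
PRICES = {
    "llama-3.1-8b-instant": {"in": 0.05, "out": 0.08},
    "llama-3.3-70b-versatile": {"in": 0.59, "out": 0.79},
    "openai/gpt-oss-120b": {"in": 0.15, "cached_in": 0.075, "out": 0.60},
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the part of input_tokens served from a
    cache and billed at the cached rate; if the model has no cached rate
    listed, the normal input rate is used for those tokens.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    cached_rate = price.get("cached_in", price["in"])
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["in"]
        + cached_tokens * cached_rate
        + output_tokens * price["out"]
    )
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    cost = request_cost("openai/gpt-oss-120b", 10_000, 2_000, cached_tokens=4_000)
    print(f"estimated cost: ${cost:.6f}")
```

For example, 10,000 input tokens (4,000 of them cached) and 2,000 output tokens on openai/gpt-oss-120b come to 6,000 × $0.15 + 4,000 × $0.075 + 2,000 × $0.6 per million, or about $0.0024.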


Use Groq through one key — zero markup.
flo2 routes every call to the cheapest, fastest model that clears your bar, with fallback, racing and true cost accounting. Free during Beta.
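To make the idea of "cheapest model that clears your bar, with fallback" concrete, here is a minimal sketch of one way such an ordering could be computed from the listed prices. It is not flo2's routing logic, which this page does not describe. The `MODELS` list, the `fallback_order` function, and the `allowed` parameter are hypothetical, the "bar" is reduced to a context-window check plus an optional allow-list, and the context sizes treat "128K" and "131K" as 128,000 and 131,000 tokens, which is an approximation.

```python
# (model, approximate context window in tokens, input $/M, output $/M)
# Values copied from the table; models without a listed context are omitted.
MODELS = [
    ("llama-3.1-8b-instant", 128_000, 0.05, 0.08),
    ("openai/gpt-oss-20b", 128_000, 0.075, 0.30),
    ("meta-llama/llama-4-scout-17b-16e-instruct", 128_000, 0.11, 0.34),
    ("qwen/qwen3-32b", 131_000, 0.29, 0.59),
    ("llama-3.3-70b-versatile", 128_000, 0.59, 0.79),
]


def fallback_order(input_tokens, output_tokens, allowed=None):
    """Return model names that fit the request, cheapest first.

    Hypothetical helper: a caller would try the first name and move on
    to the next one if the call fails.
    """
    candidates = []
    for name, context, price_in, price_out in MODELS:
        if allowed is not None and name not in allowed:
            continue  # model does not clear the caller's quality bar
        if input_tokens + output_tokens > context:
            continue  # request would not fit in the context window
        cost = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
        candidates.append((cost, name))
    # Sort by estimated cost; ties fall back to alphabetical model name.
    return [name for _, name in sorted(candidates)]


if __name__ == "__main__":
    for model in fallback_order(1_000, 500):
        print(model)
```

A real router would also weigh latency and output quality; this sketch ranks on estimated cost alone.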
© 2026 flo2.com, the zero-markup LLM gateway & router.