NVIDIA NIM pricing · auto-updated daily

NVIDIA NIM API pricing — June 12, 2026

All 103 NVIDIA NIM text models with token prices ($ per 1M), context window and max output, refreshed daily from the official pricing source. Bring your NVIDIA NIM key to flo2 and you pay these exact prices — zero markup — with fallback, racing and per-request cost accounting on top.

Free models: 103 (no per-token charge)
Models: 103 (in this table)
Biggest context: 01-ai/yi-large
Model | Context | Max out | In $/M | Cached in | Out $/M | Reasoning
01-ai/yi-large Free Free
abacusai/dracarys-llama-3.1-70b-instruct Free Free
ai21labs/jamba-1.5-large-instruct Free Free
aisingapore/sea-lion-7b-instruct Free Free
baai/bge-m3 Free Free
bigcode/starcoder2-15b Free Free
bytedance/seed-oss-36b-instruct Free Free
databricks/dbrx-instruct Free Free
deepseek-ai/deepseek-coder-6.7b-instruct Free Free
deepseek-ai/deepseek-v4-flash Free Free
deepseek-ai/deepseek-v4-pro Free Free
google/codegemma-1.1-7b Free Free
google/codegemma-7b Free Free
google/deplot Free Free
google/diffusiongemma-26b-a4b-it Free Free
google/gemma-2-2b-it Free Free
google/gemma-2b Free Free
google/gemma-3-12b-it Free Free
google/gemma-3-4b-it Free Free
google/gemma-3n-e2b-it Free Free
google/gemma-3n-e4b-it Free Free
google/gemma-4-31b-it Free Free
google/recurrentgemma-2b Free Free
ibm/granite-3.0-3b-a800m-instruct Free Free
ibm/granite-3.0-8b-instruct Free Free
ibm/granite-34b-code-instruct Free Free
ibm/granite-8b-code-instruct Free Free
meta/codellama-70b Free Free
meta/llama-3.1-70b-instruct Free Free
meta/llama-3.1-8b-instruct Free Free
meta/llama-3.2-11b-vision-instruct Free Free
meta/llama-3.2-1b-instruct Free Free
meta/llama-3.2-3b-instruct Free Free
meta/llama-3.2-90b-vision-instruct Free Free
meta/llama-3.3-70b-instruct Free Free
meta/llama-4-maverick-17b-128e-instruct Free Free
meta/llama-guard-4-12b Free Free
meta/llama2-70b Free Free
microsoft/kosmos-2 Free Free
microsoft/phi-3-vision-128k-instruct Free Free
microsoft/phi-3.5-moe-instruct Free Free
microsoft/phi-4-mini-instruct Free Free
microsoft/phi-4-multimodal-instruct Free Free
minimaxai/minimax-m2.7 Free Free
mistralai/codestral-22b-instruct-v0.1 Free Free
mistralai/ministral-14b-instruct-2512 Free Free
mistralai/mistral-7b-instruct-v0.3 Free Free
mistralai/mistral-large Free Free
mistralai/mistral-large-2-instruct Free Free
mistralai/mistral-large-3-675b-instruct-2512 Free Free
mistralai/mistral-medium-3.5-128b Free Free
mistralai/mistral-nemotron Free Free
mistralai/mistral-small-4-119b-2603 Free Free
mistralai/mixtral-8x22b-v0.1 Free Free
mistralai/mixtral-8x7b-instruct-v0.1 Free Free
moonshotai/kimi-k2.6 Free Free
nv-mistralai/mistral-nemo-12b-instruct Free Free
nvidia/ai-synthetic-video-detector Free Free
nvidia/gliner-pii Free Free
nvidia/ising-calibration-1-35b-a3b Free Free
nvidia/llama-3.1-nemoguard-8b-content-safety Free Free
nvidia/llama-3.1-nemoguard-8b-topic-control Free Free
nvidia/llama-3.1-nemotron-51b-instruct Free Free
nvidia/llama-3.1-nemotron-70b-instruct Free Free
nvidia/llama-3.1-nemotron-nano-8b-v1 Free Free
nvidia/llama-3.1-nemotron-nano-vl-8b-v1 Free Free
nvidia/llama-3.1-nemotron-safety-guard-8b-v3 Free Free
nvidia/llama-3.1-nemotron-ultra-253b-v1 Free Free
nvidia/llama-3.3-nemotron-super-49b-v1 Free Free
nvidia/llama-3.3-nemotron-super-49b-v1.5 Free Free
nvidia/llama3-chatqa-1.5-70b Free Free
nvidia/mistral-nemo-minitron-8b-8k-instruct Free Free
nvidia/nemotron-3-content-safety Free Free
nvidia/nemotron-3-nano-30b-a3b Free Free
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning Free Free
nvidia/nemotron-3-super-120b-a12b Free Free
nvidia/nemotron-3-ultra-550b-a55b Free Free
nvidia/nemotron-3.5-content-safety Free Free
nvidia/nemotron-4-340b-instruct Free Free
nvidia/nemotron-4-340b-reward Free Free
nvidia/nemotron-content-safety-reasoning-4b Free Free
nvidia/nemotron-mini-4b-instruct Free Free
nvidia/nemotron-nano-12b-v2-vl Free Free
nvidia/nemotron-nano-3-30b-a3b Free Free
nvidia/nemotron-parse Free Free
nvidia/neva-22b Free Free
nvidia/nvidia-nemotron-nano-9b-v2 Free Free
openai/gpt-oss-120b Free Free
openai/gpt-oss-20b Free Free
qwen/qwen3-next-80b-a3b-instruct Free Free
qwen/qwen3.5-122b-a10b Free Free
qwen/qwen3.5-397b-a17b Free Free
sarvamai/sarvam-m Free Free
stepfun-ai/step-3.5-flash Free Free
stepfun-ai/step-3.7-flash Free Free
stockmark/stockmark-2-100b-instruct Free Free
upstage/solar-10.7b-instruct Free Free
writer/palmyra-creative-122b Free Free
writer/palmyra-fin-70b-32k Free Free
writer/palmyra-med-70b Free Free
writer/palmyra-med-70b-32k Free Free
z-ai/glm-5.1 Free Free
zyphra/zamba2-7b-instruct Free Free

Prices are in USD per 1,000,000 tokens, fetched 2026-06-12 from the official NVIDIA NIM pricing source. They apply to the preview/free tier (build.nvidia.com); production NIM is priced per GPU. Verify before large commitments.
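As a rough illustration of how a per-1M-token price becomes a per-request cost, the arithmetic can be sketched as below. The function name `request_cost` and the paid prices in the second call are hypothetical and are not taken from this table or from any flo2 or NVIDIA API; every model listed here is free, so the first call reflects the prices shown.

```python
from decimal import Decimal

# Prices on this page are quoted per 1,000,000 tokens.
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def request_cost(input_tokens, output_tokens, in_price_per_m, out_price_per_m):
    """Return the USD cost of one request from per-1M-token prices.

    Prices are passed as strings or ints so Decimal keeps them exact.
    """
    in_cost = Decimal(input_tokens) / TOKENS_PER_PRICE_UNIT * Decimal(str(in_price_per_m))
    out_cost = Decimal(output_tokens) / TOKENS_PER_PRICE_UNIT * Decimal(str(out_price_per_m))
    return in_cost + out_cost


# A free-tier model from the table: both prices are 0, so the cost is 0.
free_cost = request_cost(1200, 350, 0, 0)

# Hypothetical paid prices ($0.50 in, $1.50 out per 1M), for illustration only.
paid_cost = request_cost(1200, 350, "0.50", "1.50")

print(free_cost, paid_cost)
```

Decimal is used here rather than float so that very small per-request amounts add up without rounding drift when summed over many requests.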


Use NVIDIA NIM through one key — zero markup.
flo2 routes every call to the cheapest, fastest model that clears your bar, with fallback, racing and true cost accounting. Free during Beta.
© 2026 flo2.com, the zero-markup LLM gateway & router.