NVIDIA NIM pricing · auto-updated daily

NVIDIA NIM API pricing — June 12, 2026

Name: NVIDIA NIM API pricing (June 2026)
Creator: flo2
License: https://creativecommons.org/licenses/by/4.0/

All 103 NVIDIA NIM text models with token prices (USD per 1M tokens), context window and max output, refreshed daily from the official pricing source. Bring your NVIDIA NIM key to flo2 and you pay these exact prices, with zero markup, plus fallback, racing and per-request cost accounting on top.

Free models: 103 (no per-token charge)

Models: 103 (in this table)

Biggest context: — tok (01-ai/yi-large; no context sizes are listed in this table)

Model	Context	Max out	In $/M	Cached in $/M	Out $/M	Reasoning
01-ai/yi-large	—	—	Free	—	Free	—
abacusai/dracarys-llama-3.1-70b-instruct	—	—	Free	—	Free	—
ai21labs/jamba-1.5-large-instruct	—	—	Free	—	Free	—
aisingapore/sea-lion-7b-instruct	—	—	Free	—	Free	—
baai/bge-m3	—	—	Free	—	Free	—
bigcode/starcoder2-15b	—	—	Free	—	Free	—
bytedance/seed-oss-36b-instruct	—	—	Free	—	Free	—
databricks/dbrx-instruct	—	—	Free	—	Free	—
deepseek-ai/deepseek-coder-6.7b-instruct	—	—	Free	—	Free	—
deepseek-ai/deepseek-v4-flash	—	—	Free	—	Free	—
deepseek-ai/deepseek-v4-pro	—	—	Free	—	Free	—
google/codegemma-1.1-7b	—	—	Free	—	Free	—
google/codegemma-7b	—	—	Free	—	Free	—
google/deplot	—	—	Free	—	Free	—
google/diffusiongemma-26b-a4b-it	—	—	Free	—	Free	—
google/gemma-2-2b-it	—	—	Free	—	Free	—
google/gemma-2b	—	—	Free	—	Free	—
google/gemma-3-12b-it	—	—	Free	—	Free	—
google/gemma-3-4b-it	—	—	Free	—	Free	—
google/gemma-3n-e2b-it	—	—	Free	—	Free	—
google/gemma-3n-e4b-it	—	—	Free	—	Free	—
google/gemma-4-31b-it	—	—	Free	—	Free	—
google/recurrentgemma-2b	—	—	Free	—	Free	—
ibm/granite-3.0-3b-a800m-instruct	—	—	Free	—	Free	—
ibm/granite-3.0-8b-instruct	—	—	Free	—	Free	—
ibm/granite-34b-code-instruct	—	—	Free	—	Free	—
ibm/granite-8b-code-instruct	—	—	Free	—	Free	—
meta/codellama-70b	—	—	Free	—	Free	—
meta/llama-3.1-70b-instruct	—	—	Free	—	Free	—
meta/llama-3.1-8b-instruct	—	—	Free	—	Free	—
meta/llama-3.2-11b-vision-instruct	—	—	Free	—	Free	—
meta/llama-3.2-1b-instruct	—	—	Free	—	Free	—
meta/llama-3.2-3b-instruct	—	—	Free	—	Free	—
meta/llama-3.2-90b-vision-instruct	—	—	Free	—	Free	—
meta/llama-3.3-70b-instruct	—	—	Free	—	Free	—
meta/llama-4-maverick-17b-128e-instruct	—	—	Free	—	Free	—
meta/llama-guard-4-12b	—	—	Free	—	Free	—
meta/llama2-70b	—	—	Free	—	Free	—
microsoft/kosmos-2	—	—	Free	—	Free	—
microsoft/phi-3-vision-128k-instruct	—	—	Free	—	Free	—
microsoft/phi-3.5-moe-instruct	—	—	Free	—	Free	—
microsoft/phi-4-mini-instruct	—	—	Free	—	Free	—
microsoft/phi-4-multimodal-instruct	—	—	Free	—	Free	—
minimaxai/minimax-m2.7	—	—	Free	—	Free	—
mistralai/codestral-22b-instruct-v0.1	—	—	Free	—	Free	—
mistralai/ministral-14b-instruct-2512	—	—	Free	—	Free	—
mistralai/mistral-7b-instruct-v0.3	—	—	Free	—	Free	—
mistralai/mistral-large	—	—	Free	—	Free	—
mistralai/mistral-large-2-instruct	—	—	Free	—	Free	—
mistralai/mistral-large-3-675b-instruct-2512	—	—	Free	—	Free	—
mistralai/mistral-medium-3.5-128b	—	—	Free	—	Free	—
mistralai/mistral-nemotron	—	—	Free	—	Free	—
mistralai/mistral-small-4-119b-2603	—	—	Free	—	Free	—
mistralai/mixtral-8x22b-v0.1	—	—	Free	—	Free	—
mistralai/mixtral-8x7b-instruct-v0.1	—	—	Free	—	Free	—
moonshotai/kimi-k2.6	—	—	Free	—	Free	—
nv-mistralai/mistral-nemo-12b-instruct	—	—	Free	—	Free	—
nvidia/ai-synthetic-video-detector	—	—	Free	—	Free	—
nvidia/gliner-pii	—	—	Free	—	Free	—
nvidia/ising-calibration-1-35b-a3b	—	—	Free	—	Free	—
nvidia/llama-3.1-nemoguard-8b-content-safety	—	—	Free	—	Free	—
nvidia/llama-3.1-nemoguard-8b-topic-control	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-51b-instruct	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-70b-instruct	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-nano-8b-v1	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-nano-vl-8b-v1	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-safety-guard-8b-v3	—	—	Free	—	Free	—
nvidia/llama-3.1-nemotron-ultra-253b-v1	—	—	Free	—	Free	—
nvidia/llama-3.3-nemotron-super-49b-v1	—	—	Free	—	Free	—
nvidia/llama-3.3-nemotron-super-49b-v1.5	—	—	Free	—	Free	—
nvidia/llama3-chatqa-1.5-70b	—	—	Free	—	Free	—
nvidia/mistral-nemo-minitron-8b-8k-instruct	—	—	Free	—	Free	—
nvidia/nemotron-3-content-safety	—	—	Free	—	Free	—
nvidia/nemotron-3-nano-30b-a3b	—	—	Free	—	Free	—
nvidia/nemotron-3-nano-omni-30b-a3b-reasoning	—	—	Free	—	Free	—
nvidia/nemotron-3-super-120b-a12b	—	—	Free	—	Free	—
nvidia/nemotron-3-ultra-550b-a55b	—	—	Free	—	Free	—
nvidia/nemotron-3.5-content-safety	—	—	Free	—	Free	—
nvidia/nemotron-4-340b-instruct	—	—	Free	—	Free	—
nvidia/nemotron-4-340b-reward	—	—	Free	—	Free	—
nvidia/nemotron-content-safety-reasoning-4b	—	—	Free	—	Free	—
nvidia/nemotron-mini-4b-instruct	—	—	Free	—	Free	—
nvidia/nemotron-nano-12b-v2-vl	—	—	Free	—	Free	—
nvidia/nemotron-nano-3-30b-a3b	—	—	Free	—	Free	—
nvidia/nemotron-parse	—	—	Free	—	Free	—
nvidia/neva-22b	—	—	Free	—	Free	—
nvidia/nvidia-nemotron-nano-9b-v2	—	—	Free	—	Free	—
openai/gpt-oss-120b	—	—	Free	—	Free	—
openai/gpt-oss-20b	—	—	Free	—	Free	—
qwen/qwen3-next-80b-a3b-instruct	—	—	Free	—	Free	—
qwen/qwen3.5-122b-a10b	—	—	Free	—	Free	—
qwen/qwen3.5-397b-a17b	—	—	Free	—	Free	—
sarvamai/sarvam-m	—	—	Free	—	Free	—
stepfun-ai/step-3.5-flash	—	—	Free	—	Free	—
stepfun-ai/step-3.7-flash	—	—	Free	—	Free	—
stockmark/stockmark-2-100b-instruct	—	—	Free	—	Free	—
upstage/solar-10.7b-instruct	—	—	Free	—	Free	—
writer/palmyra-creative-122b	—	—	Free	—	Free	—
writer/palmyra-fin-70b-32k	—	—	Free	—	Free	—
writer/palmyra-med-70b	—	—	Free	—	Free	—
writer/palmyra-med-70b-32k	—	—	Free	—	Free	—
z-ai/glm-5.1	—	—	Free	—	Free	—
zyphra/zamba2-7b-instruct	—	—	Free	—	Free	—

Prices are in USD per 1,000,000 tokens, fetched 2026-06-12 from the official NVIDIA NIM pricing source. They reflect the preview/free tier (build.nvidia.com); production NIM is priced per GPU. Verify before large commitments.
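As a rough illustration of how a per-1M-token price turns into a per-request cost, here is a minimal Python sketch. The function names `parse_price` and `request_cost` are hypothetical, and the 0.20 and 0.60 prices in the usage note are invented for the arithmetic; they do not come from this table, where every model is listed as Free.

```python
from decimal import Decimal, InvalidOperation

# Prices in the table are quoted per 1,000,000 tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def parse_price(cell: str) -> Decimal:
    """Convert a price cell such as 'Free' or '0.25' to USD per 1M tokens.

    Assumption: 'Free' means a price of zero. Cells that hold no price
    (for example a dash placeholder) raise ValueError instead of being
    silently treated as zero.
    """
    text = cell.strip().lstrip("$")
    if text.lower() == "free":
        return Decimal(0)
    try:
        return Decimal(text)
    except InvalidOperation:
        raise ValueError(f"no usable price in cell: {cell!r}")


def request_cost(input_tokens: int, output_tokens: int,
                 in_price: str, out_price: str) -> Decimal:
    """Return the USD cost of one request from per-1M-token prices."""
    cost_in = Decimal(input_tokens) * parse_price(in_price) / TOKENS_PER_UNIT
    cost_out = Decimal(output_tokens) * parse_price(out_price) / TOKENS_PER_UNIT
    return cost_in + cost_out
```

With the rows above, `request_cost(1200, 300, "Free", "Free")` comes out to zero. With the made-up prices, `request_cost(2_000_000, 500_000, "0.20", "0.60")` is 2 × 0.20 + 0.5 × 0.60 = 0.70 USD. `Decimal` is used rather than `float` so that small per-request amounts add up without binary rounding drift.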

Use NVIDIA NIM through one key — zero markup.

flo2 routes every call to the cheapest, fastest model that clears your bar, with fallback, racing and true cost accounting. Free during Beta.