All 103 NVIDIA NIM text models with token prices ($ per 1M), context window and max output, refreshed daily from the official pricing source. Bring your NVIDIA NIM key to flo2 and you pay these exact prices — zero markup — with fallback, racing and per-request cost accounting on top.
| Model | Context (tokens) | Max output (tokens) | Input ($/1M) | Cached input ($/1M) | Output ($/1M) | Reasoning |
|---|---|---|---|---|---|---|
| 01-ai/yi-large | — | — | Free | — | Free | — |
| abacusai/dracarys-llama-3.1-70b-instruct | — | — | Free | — | Free | — |
| ai21labs/jamba-1.5-large-instruct | — | — | Free | — | Free | — |
| aisingapore/sea-lion-7b-instruct | — | — | Free | — | Free | — |
| baai/bge-m3 | — | — | Free | — | Free | — |
| bigcode/starcoder2-15b | — | — | Free | — | Free | — |
| bytedance/seed-oss-36b-instruct | — | — | Free | — | Free | — |
| databricks/dbrx-instruct | — | — | Free | — | Free | — |
| deepseek-ai/deepseek-coder-6.7b-instruct | — | — | Free | — | Free | — |
| deepseek-ai/deepseek-v4-flash | — | — | Free | — | Free | — |
| deepseek-ai/deepseek-v4-pro | — | — | Free | — | Free | — |
| google/codegemma-1.1-7b | — | — | Free | — | Free | — |
| google/codegemma-7b | — | — | Free | — | Free | — |
| google/deplot | — | — | Free | — | Free | — |
| google/diffusiongemma-26b-a4b-it | — | — | Free | — | Free | — |
| google/gemma-2-2b-it | — | — | Free | — | Free | — |
| google/gemma-2b | — | — | Free | — | Free | — |
| google/gemma-3-12b-it | — | — | Free | — | Free | — |
| google/gemma-3-4b-it | — | — | Free | — | Free | — |
| google/gemma-3n-e2b-it | — | — | Free | — | Free | — |
| google/gemma-3n-e4b-it | — | — | Free | — | Free | — |
| google/gemma-4-31b-it | — | — | Free | — | Free | — |
| google/recurrentgemma-2b | — | — | Free | — | Free | — |
| ibm/granite-3.0-3b-a800m-instruct | — | — | Free | — | Free | — |
| ibm/granite-3.0-8b-instruct | — | — | Free | — | Free | — |
| ibm/granite-34b-code-instruct | — | — | Free | — | Free | — |
| ibm/granite-8b-code-instruct | — | — | Free | — | Free | — |
| meta/codellama-70b | — | — | Free | — | Free | — |
| meta/llama-3.1-70b-instruct | — | — | Free | — | Free | — |
| meta/llama-3.1-8b-instruct | — | — | Free | — | Free | — |
| meta/llama-3.2-11b-vision-instruct | — | — | Free | — | Free | — |
| meta/llama-3.2-1b-instruct | — | — | Free | — | Free | — |
| meta/llama-3.2-3b-instruct | — | — | Free | — | Free | — |
| meta/llama-3.2-90b-vision-instruct | — | — | Free | — | Free | — |
| meta/llama-3.3-70b-instruct | — | — | Free | — | Free | — |
| meta/llama-4-maverick-17b-128e-instruct | — | — | Free | — | Free | — |
| meta/llama-guard-4-12b | — | — | Free | — | Free | — |
| meta/llama2-70b | — | — | Free | — | Free | — |
| microsoft/kosmos-2 | — | — | Free | — | Free | — |
| microsoft/phi-3-vision-128k-instruct | — | — | Free | — | Free | — |
| microsoft/phi-3.5-moe-instruct | — | — | Free | — | Free | — |
| microsoft/phi-4-mini-instruct | — | — | Free | — | Free | — |
| microsoft/phi-4-multimodal-instruct | — | — | Free | — | Free | — |
| minimaxai/minimax-m2.7 | — | — | Free | — | Free | — |
| mistralai/codestral-22b-instruct-v0.1 | — | — | Free | — | Free | — |
| mistralai/ministral-14b-instruct-2512 | — | — | Free | — | Free | — |
| mistralai/mistral-7b-instruct-v0.3 | — | — | Free | — | Free | — |
| mistralai/mistral-large | — | — | Free | — | Free | — |
| mistralai/mistral-large-2-instruct | — | — | Free | — | Free | — |
| mistralai/mistral-large-3-675b-instruct-2512 | — | — | Free | — | Free | — |
| mistralai/mistral-medium-3.5-128b | — | — | Free | — | Free | — |
| mistralai/mistral-nemotron | — | — | Free | — | Free | — |
| mistralai/mistral-small-4-119b-2603 | — | — | Free | — | Free | — |
| mistralai/mixtral-8x22b-v0.1 | — | — | Free | — | Free | — |
| mistralai/mixtral-8x7b-instruct-v0.1 | — | — | Free | — | Free | — |
| moonshotai/kimi-k2.6 | — | — | Free | — | Free | — |
| nv-mistralai/mistral-nemo-12b-instruct | — | — | Free | — | Free | — |
| nvidia/ai-synthetic-video-detector | — | — | Free | — | Free | — |
| nvidia/gliner-pii | — | — | Free | — | Free | — |
| nvidia/ising-calibration-1-35b-a3b | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemoguard-8b-content-safety | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemoguard-8b-topic-control | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-51b-instruct | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-70b-instruct | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-nano-8b-v1 | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-nano-vl-8b-v1 | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-safety-guard-8b-v3 | — | — | Free | — | Free | — |
| nvidia/llama-3.1-nemotron-ultra-253b-v1 | — | — | Free | — | Free | — |
| nvidia/llama-3.3-nemotron-super-49b-v1 | — | — | Free | — | Free | — |
| nvidia/llama-3.3-nemotron-super-49b-v1.5 | — | — | Free | — | Free | — |
| nvidia/llama3-chatqa-1.5-70b | — | — | Free | — | Free | — |
| nvidia/mistral-nemo-minitron-8b-8k-instruct | — | — | Free | — | Free | — |
| nvidia/nemotron-3-content-safety | — | — | Free | — | Free | — |
| nvidia/nemotron-3-nano-30b-a3b | — | — | Free | — | Free | — |
| nvidia/nemotron-3-nano-omni-30b-a3b-reasoning | — | — | Free | — | Free | — |
| nvidia/nemotron-3-super-120b-a12b | — | — | Free | — | Free | — |
| nvidia/nemotron-3-ultra-550b-a55b | — | — | Free | — | Free | — |
| nvidia/nemotron-3.5-content-safety | — | — | Free | — | Free | — |
| nvidia/nemotron-4-340b-instruct | — | — | Free | — | Free | — |
| nvidia/nemotron-4-340b-reward | — | — | Free | — | Free | — |
| nvidia/nemotron-content-safety-reasoning-4b | — | — | Free | — | Free | — |
| nvidia/nemotron-mini-4b-instruct | — | — | Free | — | Free | — |
| nvidia/nemotron-nano-12b-v2-vl | — | — | Free | — | Free | — |
| nvidia/nemotron-nano-3-30b-a3b | — | — | Free | — | Free | — |
| nvidia/nemotron-parse | — | — | Free | — | Free | — |
| nvidia/neva-22b | — | — | Free | — | Free | — |
| nvidia/nvidia-nemotron-nano-9b-v2 | — | — | Free | — | Free | — |
| openai/gpt-oss-120b | — | — | Free | — | Free | — |
| openai/gpt-oss-20b | — | — | Free | — | Free | — |
| qwen/qwen3-next-80b-a3b-instruct | — | — | Free | — | Free | — |
| qwen/qwen3.5-122b-a10b | — | — | Free | — | Free | — |
| qwen/qwen3.5-397b-a17b | — | — | Free | — | Free | — |
| sarvamai/sarvam-m | — | — | Free | — | Free | — |
| stepfun-ai/step-3.5-flash | — | — | Free | — | Free | — |
| stepfun-ai/step-3.7-flash | — | — | Free | — | Free | — |
| stockmark/stockmark-2-100b-instruct | — | — | Free | — | Free | — |
| upstage/solar-10.7b-instruct | — | — | Free | — | Free | — |
| writer/palmyra-creative-122b | — | — | Free | — | Free | — |
| writer/palmyra-fin-70b-32k | — | — | Free | — | Free | — |
| writer/palmyra-med-70b | — | — | Free | — | Free | — |
| writer/palmyra-med-70b-32k | — | — | Free | — | Free | — |
| z-ai/glm-5.1 | — | — | Free | — | Free | — |
| zyphra/zamba2-7b-instruct | — | — | Free | — | Free | — |
Prices are in USD per 1,000,000 tokens, fetched 2026-06-12 from the official NVIDIA NIM pricing source. Listed prices reflect the preview/free tier (build.nvidia.com); production NIM is priced per GPU. Verify before large commitments.
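Because every price is quoted per 1,000,000 tokens, the cost of a single request is the token count times the price, divided by one million. The sketch below illustrates that arithmetic only; it is not flo2's accounting code. The names `parse_price` and `request_cost` are hypothetical, and so are the non-zero prices in the usage lines, since every model in the table above is listed as Free.

```python
from decimal import Decimal
from typing import Optional

TOKENS_PER_MILLION = Decimal(1_000_000)


def parse_price(cell: str) -> Optional[Decimal]:
    """Turn a price cell into USD per 1M tokens.

    Assumed conventions (taken from how the table reads, not from an API):
    "Free" means a price of 0, a dash or empty cell means no value is listed.
    """
    cell = cell.strip()
    if cell == "Free":
        return Decimal("0")
    if cell in ("—", ""):
        return None
    return Decimal(cell.lstrip("$"))


def request_cost(
    input_tokens: int,
    output_tokens: int,
    in_price: Decimal,
    out_price: Decimal,
) -> Decimal:
    """Cost in USD of one request, given per-1M-token input and output prices."""
    total = Decimal(input_tokens) * in_price + Decimal(output_tokens) * out_price
    return total / TOKENS_PER_MILLION


# A row from the table: both price cells read "Free", so the request costs nothing.
free_cost = request_cost(1200, 350, parse_price("Free"), parse_price("Free"))

# Hypothetical paid prices, for illustration only: $0.20 in, $0.60 out per 1M tokens.
paid_cost = request_cost(1200, 350, Decimal("0.20"), Decimal("0.60"))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift when summed over many requests.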