2026-06-03 · flo2 blog

Use Mistral with the OpenAI SDK: Compatible API & Base URL

Mistral's API is largely OpenAI-compatible at the wire level: point the standard openai Python or JavaScript client at https://api.mistral.ai/v1, swap in your Mistral key and a Mistral model name, and existing Chat Completions code that sticks to the core request and response fields runs without further changes. This matters because migrating an app from OpenAI to Mistral, or running Mistral alongside other providers, becomes a base-URL, key, and model-name change rather than a rewrite. This guide walks through Mistral's OpenAI-compatible endpoint in detail: curl, Python, and JavaScript examples, what the compatibility covers, streaming, gotchas, and how to route Mistral through a gateway for fallback and cost control. Verify current model IDs and feature details in the Mistral documentation, since model names and supported parameters change quickly.

Mistral's OpenAI-compatible base URL

Mistral exposes a Chat Completions endpoint that follows the OpenAI wire format. The base URL is:

https://api.mistral.ai/v1

Authentication is a standard bearer token — your Mistral API key from La Plateforme, sent in the Authorization: Bearer <key> header. This is the same header the OpenAI SDK sends by default, which is exactly why the compatibility works with no client-side changes.

Model IDs are Mistral-specific. Mistral publishes a range of models — flagship large models, efficient small models, and code-focused models like Codestral and Devstral. Always confirm the exact model identifiers in the Mistral models overview before committing a model string to your codebase; the catalog and version suffixes change as new releases land. The Mistral API guide covers model families and API key setup in more depth.
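One way to make the "confirm the model ID" step mechanical is to ask the endpoint which models your key can see and fail fast if the configured ID is missing. The sketch below assumes Mistral's model listing returns the list shape the OpenAI SDK's client.models.list() expects; the helper names list_model_ids and require_model are made up for this example, and the fake client and its IDs exist only so the code runs offline.

```python
from types import SimpleNamespace


def list_model_ids(client):
    """Return the sorted model IDs the endpoint reports for this API key."""
    # client.models.list() is the OpenAI SDK call for listing models;
    # iterating the result yields objects that carry an .id attribute.
    return sorted(model.id for model in client.models.list())


def require_model(client, model_id):
    """Raise early if model_id is not in the catalog the endpoint returns."""
    available = list_model_ids(client)
    if model_id not in available:
        raise ValueError(f"{model_id!r} not offered; available: {available}")
    return model_id


# Stand-in client with made-up IDs so the sketch runs without a network call.
# In real use, pass OpenAI(base_url="https://api.mistral.ai/v1", api_key=...).
fake_client = SimpleNamespace(
    models=SimpleNamespace(
        list=lambda: [
            SimpleNamespace(id="example-small"),
            SimpleNamespace(id="example-large"),
        ]
    )
)
checked = require_model(fake_client, "example-small")
print(checked)
```

Running a check like this at startup turns a retired or mistyped model string into an immediate, readable error instead of a failed request in production.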

curl: a minimal request to the Mistral chat completions endpoint

Before wiring anything into application code, a raw curl call is the fastest way to confirm your key and base URL are working:

export MISTRAL_API_KEY="your_mistral_key"

curl https://api.mistral.ai/v1/chat/completions \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-small-latest",
    "messages": [
      {"role": "system", "content": "You are a concise technical assistant."},
      {"role": "user",   "content": "What is mixture-of-experts architecture?"}
    ]
  }'

The response is the standard OpenAI shape: a choices array, a message.content string, a finish_reason, and a usage object with prompt_tokens, completion_tokens, and total_tokens. If the JSON comes back cleanly, the endpoint and key are good. Substitute the model string with a current ID from the Mistral docs — mistral-small-latest is used here as an example; verify it is still valid before shipping.
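To make that shape concrete, here is a small sketch that parses a response body and pulls out the fields named above. The payload is illustrative: every value in it is made up, and the check for a finish_reason of "length" assumes Mistral reports truncation the same way the OpenAI API does.

```python
import json

# Illustrative payload in the Chat Completions shape; all values are made up.
sample = json.loads("""
{
  "id": "example-id",
  "object": "chat.completion",
  "model": "mistral-small-latest",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Example answer."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 20, "completion_tokens": 5, "total_tokens": 25}
}
""")


def summarize(response):
    """Extract the reply text, finish reason, and token total from a parsed body."""
    choice = response["choices"][0]
    usage = response["usage"]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage["total_tokens"],
        # Assumed OpenAI convention: "length" means the token limit cut the reply
        "truncated": choice["finish_reason"] == "length",
    }


summary = summarize(sample)
print(summary)
```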

Use Mistral with the OpenAI Python SDK

The OpenAI Python client takes base_url and api_key as constructor arguments. Point both at Mistral and everything downstream — message building, response parsing, tool-call handling — stays identical:

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.mistral.ai/v1",
    api_key=os.environ["MISTRAL_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistral-small-latest",   # verify current model IDs in Mistral docs
    messages=[
        {"role": "system", "content": "Reply in one sentence."},
        {"role": "user",   "content": "Why do developers choose European LLM providers?"},
    ],
)

print(resp.choices[0].message.content)
print(resp.usage)   # prompt_tokens, completion_tokens, total_tokens

Any framework that accepts an OpenAI base_url override works the same way: LangChain, LlamaIndex, instructor, the Vercel AI SDK. They all construct the same HTTP request underneath, so pointing them at Mistral is the same two-argument change.

Streaming with the Mistral OpenAI-compatible endpoint

Mistral supports streaming responses on the compatible endpoint. Set stream=True and iterate exactly as you would against the OpenAI API:

stream = client.chat.completions.create(
    model="mistral-small-latest",
    messages=[
        {"role": "user", "content": "Explain tokenization to a junior developer."}
    ],
    stream=True,
)

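for chunk in stream:
    # Usage-only chunks (see include_usage below) can arrive with no choices
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)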

The wire protocol is server-sent events: a sequence of data: lines, with the stream terminated by data: [DONE], identical to OpenAI. Existing streaming parsers work without changes. To capture token counts from a stream, check whether Mistral supports stream_options={"include_usage": True} for the model you are using. In the OpenAI API, that parameter adds one extra chunk before data: [DONE] whose choices array is empty and whose usage field carries the token counts. Verify availability in the Mistral docs.
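If you are not using the SDK, the framing described above is simple enough to parse by hand. The sketch below is a minimal parser, not a full SSE implementation: it assumes each event fits on a single data: line, and the sample lines are invented for illustration rather than captured from Mistral.

```python
import json


def iter_sse_deltas(lines):
    """Yield content fragments from an iterable of decoded SSE lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        # An empty or missing choices list (e.g. a usage-only chunk) yields nothing
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Made-up stream for illustration only.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": ", world"}, "finish_reason": "stop"}]}',
    "",
    "data: [DONE]",
]
text = "".join(iter_sse_deltas(sample_lines))
print(text)
```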

JavaScript / TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.mistral.ai/v1",
  apiKey:  process.env.MISTRAL_API_KEY,
});

const resp = await client.chat.completions.create({
  model: "mistral-small-latest",   // verify in Mistral docs
  messages: [{ role: "user", content: "List three open-weight Mistral models." }],
});

console.log(resp.choices[0].message.content);

What Mistral's compatibility layer covers — and what it doesn't

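Mistral's compatible endpoint targets the Chat Completions surface. The pieces exercised in this guide (chat requests, streaming, and the usage object in the response) follow the OpenAI shapes; for anything beyond that, check the Mistral documentation before relying on it.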

Migrating an existing OpenAI app to Mistral

For an app that uses core chat completions, the migration is three configuration changes (API key, base URL, and model name) and nothing else in application code:

# Before (OpenAI)
OPENAI_API_KEY="sk-..."
# OPENAI_BASE_URL unset: the client defaults to https://api.openai.com/v1
# OPENAI_MODEL unset: the code falls back to "gpt-4o"

# After (Mistral)
OPENAI_API_KEY="..."          # your Mistral key from La Plateforme
OPENAI_BASE_URL="https://api.mistral.ai/v1"
OPENAI_MODEL="mistral-large-latest"   # verify current ID in Mistral docs

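In code, if you already externalize the base URL, key, and model string (which you should), the client is built entirely from the environment:

import os
from openai import OpenAI

# The provider is selected by environment variables, not by code
client = OpenAI(
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.environ["OPENAI_API_KEY"],
)
model = os.environ.get("OPENAI_MODEL", "gpt-4o")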

Set OPENAI_BASE_URL=https://api.mistral.ai/v1, OPENAI_API_KEY to your Mistral key, and OPENAI_MODEL to a Mistral model ID. Application code is untouched. This pattern also makes it trivial to A/B test Mistral against your current provider: run both configurations in parallel and compare quality, latency, and cost.
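For the A/B case, one possible arrangement is to keep a set of environment variables per arm and assign each user to an arm deterministically. This is a sketch under assumptions: the ARM_A and ARM_B prefixes, the helper names, and the hash-based split are all invented for this example, and the keys set below are placeholders so the code runs standalone.

```python
import hashlib
import os


def provider_config(prefix):
    """Read <PREFIX>_BASE_URL, <PREFIX>_API_KEY and <PREFIX>_MODEL from the environment."""
    return {
        "base_url": os.environ[f"{prefix}_BASE_URL"],
        "api_key": os.environ[f"{prefix}_API_KEY"],
        "model": os.environ[f"{prefix}_MODEL"],
    }


def pick_arm(user_id, arms=("ARM_A", "ARM_B")):
    """Map a user ID to one arm deterministically, so a user always hits the same provider."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return arms[digest[0] % len(arms)]


# Placeholder values so the sketch runs standalone; real values would come
# from the deployment environment.
os.environ.setdefault("ARM_A_BASE_URL", "https://api.openai.com/v1")
os.environ.setdefault("ARM_A_API_KEY", "placeholder-key-a")
os.environ.setdefault("ARM_A_MODEL", "gpt-4o")
os.environ.setdefault("ARM_B_BASE_URL", "https://api.mistral.ai/v1")
os.environ.setdefault("ARM_B_API_KEY", "placeholder-key-b")
os.environ.setdefault("ARM_B_MODEL", "mistral-large-latest")

arm = pick_arm("user-123")
config = provider_config(arm)
# With the SDK: OpenAI(base_url=config["base_url"], api_key=config["api_key"]),
# then pass model=config["model"] to chat.completions.create.
print(arm, config["base_url"], config["model"])
```

Hashing the user ID rather than picking at random keeps each user on one provider for the duration of the test, which makes quality comparisons easier to interpret.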

Gotchas when migrating to Mistral

Routing Mistral behind a gateway

Pointing the OpenAI SDK at api.mistral.ai/v1 is the right first step. The limitation is that it hard-codes a single provider: when Mistral rate-limits you, a specific model is at capacity, or you want to benchmark Mistral against another provider on live traffic, you are back to editing application code. A gateway decouples provider selection from application logic.
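To see what that coupling looks like, here is a rough sketch of hand-rolled fallback inside an application. It is not how any particular gateway implements fallback. RateLimited is a stand-in exception defined for this example (with the OpenAI SDK you would catch openai.RateLimitError instead), and the two provider functions are fakes so the code runs offline.

```python
class RateLimited(Exception):
    """Stand-in for a provider's HTTP 429 error (hypothetical name)."""


def complete_with_fallback(providers, messages):
    """Try each (name, call) pair in order; return the first successful reply."""
    errors = []
    for name, call in providers:
        try:
            return name, call(messages)
        except RateLimited as exc:
            errors.append(f"{name}: {exc}")  # remember why this provider was skipped
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Fake provider callables so the sketch runs without network access.
def primary(messages):
    raise RateLimited("429 Too Many Requests")


def secondary(messages):
    return "reply from secondary"


used, reply = complete_with_fallback(
    [("primary", primary), ("secondary", secondary)],
    [{"role": "user", "content": "ping"}],
)
print(used, reply)
```

Every service that talks to a model ends up carrying some version of this loop, plus the per-provider keys and model names behind it. Moving that logic out of the application is the job of a gateway.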

That is what flo2 is built for. flo2 is a developer-first LLM gateway with zero token markup. You bring your own Mistral key — plus keys for OpenAI, Anthropic, Gemini, Groq, Cerebras, DeepInfra, and others — and pay each provider directly at their published rates. A single flo2 key, accessed through an OpenAI-compatible or Anthropic-compatible endpoint, routes each request to the cheapest or fastest provider, with automatic fallback chains so a Mistral 429 rolls over to another provider instead of surfacing as an error. Free during Beta.

import os
from openai import OpenAI

# One stable base URL — flo2 routes to Mistral (or best available provider)
client = OpenAI(
    base_url="https://flo2.com/v1",
    api_key=os.environ["FLO2_API_KEY"],
)

resp = client.chat.completions.create(
    model="mistral-small-latest",   # pin to Mistral, or let flo2 route automatically
    messages=[
        {"role": "user", "content": "Summarize this pull request diff."},
    ],
)

print(resp.choices[0].message.content)

Because flo2 exposes the same OpenAI-compatible surface you just used against Mistral, switching is a base_url and api_key change, identical to the Mistral migration itself. You get Mistral's models when they are the best fit, automatic fallback when they are not, request racing that returns whichever provider responds first, and per-call cost accounting across every provider in one view.
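Per-call cost accounting boils down to simple arithmetic on the usage object each response returns. The sketch below shows that arithmetic only; it says nothing about how flo2 computes costs. The Rate class and the call_cost helper are invented for this example, and the rates are placeholders, not real prices for any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price per one million tokens, in whatever currency the provider bills."""
    input_per_million: float
    output_per_million: float


def call_cost(usage, rate):
    """Cost of one call from its prompt and completion token counts."""
    return (
        usage["prompt_tokens"] * rate.input_per_million
        + usage["completion_tokens"] * rate.output_per_million
    ) / 1_000_000


# Placeholder rates for illustration only; look up real prices per provider.
placeholder_rate = Rate(input_per_million=1.00, output_per_million=3.00)
usage = {"prompt_tokens": 1200, "completion_tokens": 300, "total_tokens": 1500}
cost = call_cost(usage, placeholder_rate)
print(f"{cost:.6f}")
```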

For a full walkthrough of Mistral's model lineup, API key setup, pricing structure, and code models, see the Mistral API guide. For the broader picture of how OpenAI-compatible endpoints work across providers, see OpenAI-compatible API. To start routing Mistral requests with zero markup and automatic fallback, get started with flo2.
