2026-06-03 · flo2 blog

Changing the OpenAI Base URL: Point the SDK Anywhere

The OpenAI base URL is the single string that controls where every request from the official SDK lands. Change it and you instantly redirect your existing code to a proxy, an alternative provider, a locally running model, or a test mock — without touching a single line of prompt logic. This guide covers exactly how to do that in Python and Node, the environment variables that make it config-level instead of code-level, the sharp edges you'll run into, and why a single gateway base URL is the cleanest solution when you want access to many models through your existing OpenAI code.

Why you would override the OpenAI base URL

The default base URL, https://api.openai.com/v1, is just an HTTP host plus a path prefix, and the SDK attaches no special meaning to it. Pointing it elsewhere is a fully supported, first-class operation, and there are four common reasons teams do it:

Use a proxy or gateway. A gateway like flo2 sits between your code and the upstream models. You get a single base URL that routes to any provider, letting you swap models or providers with a config change instead of a code change.
Use an OpenAI-compatible provider. Groq, Mistral, Together, Cerebras, xAI, DeepInfra, and many others all expose the same Chat Completions shape at their own host. Point the SDK at their endpoint and you get their models — and their pricing — with zero SDK changes.
Run a local model. Ollama, LM Studio, and llama.cpp all expose a local OpenAI-compatible server. Set the base URL to http://localhost:11434/v1 (Ollama's default) and your code runs offline, which is invaluable for development and testing.
Mock the API in tests. A small stub server (built with FastAPI, for example) or a request interceptor such as msw can return hard-coded Chat Completions responses, so your test suite runs without calling a real provider or burning tokens.
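
As a rough illustration of the test-mock case, here is a minimal sketch that uses only the Python standard library. The names CANNED_REPLY, MockHandler, start_mock_server, and post_chat are hypothetical helpers invented for this example, and the canned values are made up. It serves one fixed Chat Completions response on a free local port and calls it with urllib instead of the SDK.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hard-coded reply in the Chat Completions response shape (values are made up).
CANNED_REPLY = {
    "id": "chatcmpl-mock",
    "object": "chat.completion",
    "created": 0,
    "model": "mock-model",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "mocked reply"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
}


class MockHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Drain the request body before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path != "/v1/chat/completions":
            self.send_error(404)
            return
        body = json.dumps(CANNED_REPLY).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_mock_server():
    """Start the mock on a free local port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), MockHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}/v1"


def post_chat(base_url, payload):
    """POST to {base_url}/chat/completions, bypassing any configured proxy."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.load(response)
```

In a real test you would likely pass the returned base URL to OpenAI(base_url=..., api_key="test") rather than calling post_chat, and shut the server down in teardown.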

How to change the OpenAI base URL in Python

Pass base_url to the OpenAI constructor. The client will prepend it to every API path it calls:

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.flo2.com/v1",
    api_key="YOUR_FLO2_KEY",
)

response = client.chat.completions.create(
    model="anthropic/claude-opus-4-5",
    messages=[{"role": "user", "content": "Explain base URL overrides in one paragraph."}],
)
print(response.choices[0].message.content)

The AsyncOpenAI client takes the exact same arguments:

import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    base_url="https://gateway.flo2.com/v1",
    api_key="YOUR_FLO2_KEY",
)

async def main():
    response = await client.chat.completions.create(
        model="openai/gpt-4.1",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())

How to change the OpenAI base URL in Node / TypeScript

The Node SDK uses baseURL (camelCase) in the constructor options object:

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.flo2.com/v1",
  apiKey: process.env.FLO2_API_KEY,
});

const response = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [{ role: "user", content: "What is an LLM gateway?" }],
});

console.log(response.choices[0].message.content);

Using environment variables to override the base URL

Both SDKs respect environment variables, so you can change the endpoint without touching code at all — useful in CI pipelines, Docker Compose files, and Kubernetes configs.

The Python SDK reads OPENAI_BASE_URL:

# .env or shell
export OPENAI_BASE_URL="https://gateway.flo2.com/v1"
export OPENAI_API_KEY="YOUR_FLO2_KEY"

# Your code stays completely unchanged
from openai import OpenAI
client = OpenAI()  # picks up both env vars automatically

Older versions of the Python SDK used OPENAI_API_BASE instead. If you are on openai<1.0.0, set that variable rather than OPENAI_BASE_URL. Versions 1.x and later use OPENAI_BASE_URL exclusively.
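
If some of your environments still export the legacy variable during a migration, a small resolver can make the precedence explicit. This is a sketch under assumptions: resolve_base_url is a hypothetical helper, and the lookup order is this example's choice, not something the SDK does for you.

```python
import os

DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(env=None):
    """Return the endpoint to use, preferring the current variable name.

    Order: OPENAI_BASE_URL, then the legacy OPENAI_API_BASE, then the default.
    This order is the sketch's own choice, not SDK behaviour.
    """
    env = os.environ if env is None else env
    for name in ("OPENAI_BASE_URL", "OPENAI_API_BASE"):
        value = env.get(name, "").strip()
        if value:
            return value
    return DEFAULT_BASE_URL
```

You could then pass the result explicitly, for example OpenAI(base_url=resolve_base_url()), so the behaviour no longer depends on which SDK version reads which variable.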

The Node SDK reads the same OPENAI_BASE_URL environment variable starting with openai@4:

# docker-compose.yml
services:
  api:
    image: my-app
    environment:
      OPENAI_BASE_URL: https://gateway.flo2.com/v1
      OPENAI_API_KEY: ${FLO2_API_KEY}

What actually happens when you override the base URL

When you call client.chat.completions.create(...), the SDK constructs a POST request to {base_url}/chat/completions. Notice there is no extra /v1 segment added by the SDK — it appends the resource path directly onto whatever base URL you supply. So:

If your base URL is https://gateway.flo2.com/v1, the request lands at https://gateway.flo2.com/v1/chat/completions. Correct.
If your base URL is https://gateway.flo2.com (no /v1), the request lands at https://gateway.flo2.com/chat/completions. Wrong: the endpoint will typically answer with a 404.
If your base URL is https://gateway.flo2.com/v1/ (trailing slash), recent versions of the Python SDK normalize the trailing slash for you, but older versions may produce a double-slash path. Safest: omit the trailing slash.
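
The three cases above can be reproduced with a tiny join function. This is an illustrative sketch of the path arithmetic only. request_url is a hypothetical helper, not the SDK's internal implementation.

```python
def request_url(base_url: str, resource_path: str) -> str:
    """Join a base URL and a resource path with exactly one slash between them."""
    return f"{base_url.rstrip('/')}/{resource_path.lstrip('/')}"


for base in (
    "https://gateway.flo2.com/v1",
    "https://gateway.flo2.com",
    "https://gateway.flo2.com/v1/",
):
    print(base, "->", request_url(base, "/chat/completions"))
```

Only the second base URL yields a path without the /v1 prefix, which is the misconfiguration to watch for.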

Gotchas and sharp edges

The api_key field is still required

Even when talking to a local Ollama instance that ignores auth, the SDK validates that api_key (or OPENAI_API_KEY) is set. Pass any non-empty string — "ollama" or "local" works fine — if the endpoint doesn't actually check it.

from openai import OpenAI

# Ollama running locally — api_key is ignored by the server but required by the SDK
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
)

Feature parity is not guaranteed

The OpenAI-compatible API spec covers the core request/response shape, but optional features vary by provider. Structured outputs (response_format: { type: "json_schema" }), function/tool calling, vision inputs, and the logprobs field may or may not be supported on the endpoint you point the SDK at. Always test your specific feature surface against the target endpoint rather than assuming full parity with OpenAI.
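
One way to test a feature surface is a small probe that sends one tiny request per optional feature and records whether the endpoint accepted it. The sketch below is an assumption-laden example: probe_features and PROBES are hypothetical names, and the function expects any object that exposes chat.completions.create like the SDK client does. Treating every exception as "unsupported" is a deliberate simplification. Note too that some endpoints silently ignore parameters they don't understand, so an accepted request is a hint, not proof of support.

```python
def probe_features(client, model, probes):
    """Send one small request per optional feature and record acceptance.

    `probes` maps a label to extra keyword arguments for
    chat.completions.create. Any exception counts as "not supported";
    real code would likely separate HTTP 400s from auth or network errors.
    """
    results = {}
    for label, extra_kwargs in probes.items():
        try:
            client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": "ping"}],
                **extra_kwargs,
            )
            results[label] = True
        except Exception:
            results[label] = False
    return results


# Example probes; extend with whatever your application depends on.
PROBES = {
    "logprobs": {"logprobs": True},
    "tools": {
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "noop",
                    "parameters": {"type": "object", "properties": {}},
                },
            }
        ]
    },
}
```

Running this once per target endpoint, for example in a CI smoke test, gives you a quick map of what is safe to rely on before you switch traffic.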

Model names are provider-specific

The model field passes through verbatim to the endpoint. A gateway that aggregates multiple providers typically uses a namespaced format like openai/gpt-4.1 or anthropic/claude-opus-4-5 to disambiguate. Direct provider endpoints use their own naming convention. Read the target API's model list before switching.
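
If your code needs to reason about which provider a namespaced model string refers to (for logging or cost attribution, say), a small parser is enough. This is a sketch assuming the provider/name convention shown above. split_model_id is a hypothetical helper, and a given gateway may use a different scheme.

```python
def split_model_id(model: str):
    """Split 'provider/name' into (provider, name); provider is None if absent.

    Only the first slash separates the provider, so names that themselves
    contain slashes are kept intact.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    return provider, name


print(split_model_id("anthropic/claude-opus-4-5"))
print(split_model_id("gpt-4.1"))
```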

Streaming uses the same code path

If your code uses stream=True, the overridden base URL is still in effect. A well-implemented endpoint (or gateway) will return proper SSE chunks and a data: [DONE] terminator. If streaming breaks, verify the endpoint supports it — not all compatible endpoints implement streaming.
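
When streaming misbehaves, it can help to look at the raw wire format. The sketch below parses SSE lines the way described above: each data: line carries a JSON chunk, and data: [DONE] ends the stream. iter_sse_chunks and collect_text are hypothetical helpers for debugging, not part of the SDK, and the parser ignores SSE features such as multi-line data fields.

```python
import json


def iter_sse_chunks(lines):
    """Yield decoded JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines):
    """Concatenate the content deltas from a stream of chat completion chunks."""
    parts = []
    for chunk in iter_sse_chunks(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

Feeding it the lines of a captured response body (from curl -N, for instance) shows quickly whether the endpoint emits well-formed chunks and a terminator.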

One gateway base URL for every model

Pointing the SDK at individual provider endpoints works, but it still means one base URL per provider and a proliferation of API keys. A unified LLM API gateway collapses that: you set one base URL, one key, and then select any supported model with the model field.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://gateway.flo2.com/v1",
  apiKey: process.env.FLO2_API_KEY,
});

// Switch providers by changing one string — no other code changes
const models = [
  "openai/gpt-4.1-mini",
  "anthropic/claude-haiku-4-5",
  "google/gemini-2.0-flash",
  "groq/llama-3.3-70b-versatile",
];

for (const model of models) {
  const res = await client.chat.completions.create({
    model,
    messages: [{ role: "user", content: "What are you?" }],
  });
  // content can be null (for example on tool-call responses), so guard the slice
  console.log(model, "→", res.choices[0].message.content?.slice(0, 80));
}

With flo2, you bring your own provider keys — the gateway routes requests using your credentials, so you pay providers at cost with zero token markup. The base URL doesn't change as you add providers; the model string is the only thing that varies.

If you have existing OpenAI code and want access to every major model without a rewrite, set your base URL to flo2 — it's OpenAI-compatible, free during beta, and your provider keys stay yours.
