2026-06-03 · flo2 blog

What Is LiteLLM? The Open-Source LLM Proxy, Explained

If you've shipped anything on top of large language models, you've probably asked what LiteLLM is and whether it belongs in your stack. The short answer: LiteLLM is a popular open-source project that gives you one OpenAI-style interface for 100+ model providers, available both as a Python SDK and as a self-hostable proxy server. This guide explains LiteLLM in plain terms for developers: what you get, the tradeoffs of running it yourself, and when a managed gateway is the simpler call.

What is LiteLLM, exactly?

LiteLLM is an open-source library and proxy that standardizes how you call language models. Instead of learning a different request and response shape for every vendor, you write OpenAI-format calls and LiteLLM translates them to whichever provider you target: OpenAI, Anthropic, Google Gemini, Azure, Bedrock, Mistral, and many more. It comes in two main flavors: a Python SDK that you import directly into your code, and a self-hostable proxy server that your services call over HTTP.

The core idea is unification. One interface, many backends. That alone removes a lot of glue code, but the proxy adds the operational features teams usually want once more than one service starts calling models.

LiteLLM in Python: the SDK

The SDK is the fastest way to understand the project. A call looks like a normal OpenAI request, just with a provider-prefixed model name:

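# Assumes ANTHROPIC_API_KEY is set in the environment.
from litellm import completion

# Same request shape as an OpenAI chat call; the "anthropic/" prefix selects the provider.
resp = completion(
    model="anthropic/claude-sonnet-4",
    messages=[{"role": "user", "content": "Summarize this in one line."}],
)
# The response follows the OpenAI format too, whichever provider answered.
print(resp.choices[0].message.content)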

Swap the model string to gpt-4o or gemini/gemini-1.5-pro and the rest of your code stays the same. That portability is the headline feature: you can switch providers, run experiments, or add a fallback without rewriting request and response handling.
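To make the fallback point concrete, here is a minimal sketch of what "add a fallback" can look like when every provider shares one call shape. The helper name complete_with_fallback and its call parameter are hypothetical, invented for this illustration; the only assumption about LiteLLM is that completion accepts model and messages, as in the example above.

```python
from typing import Any, Callable, Sequence


def complete_with_fallback(
    models: Sequence[str],
    messages: list[dict[str, str]],
    call: Callable[..., Any],
) -> Any:
    """Try each model in order and return the first successful response.

    `call` is any function that accepts model=... and messages=...,
    for example litellm.completion. (Hypothetical helper, not part of LiteLLM.)
    """
    if not models:
        raise ValueError("models must not be empty")
    errors: list[str] = []
    for model in models:
        try:
            return call(model=model, messages=messages)
        except Exception as exc:  # broad on purpose: any failure moves to the next model
            errors.append(f"{model}: {exc}")
    # Every model failed, so surface all collected errors at once.
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

With LiteLLM installed, you would pass litellm.completion as call along with a list such as ["anthropic/claude-sonnet-4", "gpt-4o"]. LiteLLM also ships its own retry and fallback options, so check its documentation before rolling your own.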

What you get with the LiteLLM proxy

Used as the SDK, LiteLLM is mostly about a unified call surface. Run it as a proxy and it becomes a small gateway that sits between your apps and the providers. The commonly used capabilities include:

That feature set is why LiteLLM shows up so often in "litellm vs" discussions: it covers the practical needs of a shared LLM access layer in a single, well-maintained open-source package.
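Because the proxy speaks the OpenAI format, a client needs little more than a base URL and a key. The sketch below builds such a request with the Python standard library only. The address http://localhost:4000, the key sk-example, and the helper name build_chat_request are assumptions made for illustration; substitute the values from your own deployment.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, api_key: str, model: str, prompt: str
) -> urllib.request.Request:
    """Build an OpenAI-format chat completion request aimed at a proxy."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values: adjust the address, key, and model to your deployment.
request = build_chat_request(
    base_url="http://localhost:4000",
    api_key="sk-example",
    model="gpt-4o",
    prompt="Summarize this in one line.",
)
```

Once a proxy is running, urllib.request.urlopen(request) sends the call. In practice most teams would point an OpenAI-compatible client library at the same base URL instead of building requests by hand.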

Self-hosting LiteLLM: the upside

Running LiteLLM yourself is appealing for good reasons, and for many teams it's the right choice:

If you value owning the deployment and already operate services confidently, self-hosting LiteLLM is a strong, mature option.

Self-hosting LiteLLM: the cost

The flip side of control is operations. When you run the proxy, it's yours to keep healthy:

None of this is unusual for infrastructure, but it's real engineering time. For a small team that just wants reliable, cheap model access, standing up and babysitting a proxy can be more than the problem warrants.

Self-host vs hosted: the tradeoff

Here's the balanced version of the decision:

It's genuinely about preference and team capacity. If running a proxy is no burden — or you specifically want everything in your own environment — LiteLLM is excellent. If you'd rather not operate one more service, a hosted gateway gets you the same unified interface and routing without the ops overhead.

When a managed gateway is the simpler choice

This is where a hosted option earns its keep. flo2 is a developer-first, hosted LLM gateway with zero token markup: you bring your own provider keys (OpenAI, Anthropic, Gemini, Groq, Cerebras, DeepInfra, Mistral, xAI, OpenRouter) and pay the providers directly. One key — OpenAI- and Anthropic-compatible — routes to the cheapest or fastest model that fits the request, with no proxy for you to deploy, scale, secure, or monitor.
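As a rough illustration of routing to "the cheapest or fastest model that fits the request," the sketch below filters candidates by context size and then picks by price or latency. It is not flo2's implementation: the Candidate fields, the pick_model function, and the idea of "fit" as a context-window check are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    model: str
    usd_per_million_tokens: float  # hypothetical price figure
    p50_latency_ms: float          # hypothetical measured latency
    max_context_tokens: int


def pick_model(
    candidates: Sequence[Candidate], prompt_tokens: int, prefer: str = "cheapest"
) -> Candidate:
    """Pick the cheapest or fastest candidate whose context window fits the prompt."""
    fitting = [c for c in candidates if c.max_context_tokens >= prompt_tokens]
    if not fitting:
        raise ValueError("no candidate fits the request")
    if prefer == "cheapest":
        return min(fitting, key=lambda c: c.usd_per_million_tokens)
    if prefer == "fastest":
        return min(fitting, key=lambda c: c.p50_latency_ms)
    raise ValueError(f"unknown preference: {prefer!r}")


# Made-up candidates; real prices and latencies would come from your own data.
CANDIDATES = [
    Candidate("provider-a/small", 0.50, 900.0, 8_000),
    Candidate("provider-b/fast", 2.00, 250.0, 32_000),
    Candidate("provider-c/large", 10.00, 1500.0, 200_000),
]
```

A real router would weigh more than two signals, such as task quality and rate limits, but the shape stays the same: filter to what fits, then rank by the preference.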

Beyond unified access, flo2 adds smart routing, fallback, and racing, plus A/B testing with an LLM judge for "model–task fit," response caching, and true cost accounting — all behind a managed dashboard. It's free during Beta. If LiteLLM answers "how do I unify my model calls," flo2 answers "how do I get that without running the infrastructure myself."
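For readers unfamiliar with racing, the general idea is to send the same request to several models at once and keep whichever answers first. The sketch below shows that idea with the standard library. It is an assumption about what racing generally means, not a description of how flo2 implements it, and the race function and its call parameter are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Sequence


def race(
    models: Sequence[str],
    messages: list[dict[str, str]],
    call: Callable[..., Any],
) -> Any:
    """Send the same request to every model and return the first success."""
    if not models:
        raise ValueError("models must not be empty")
    errors: list[str] = []
    pool = ThreadPoolExecutor(max_workers=len(models))
    try:
        futures = {
            pool.submit(call, model=model, messages=messages): model
            for model in models
        }
        for future in as_completed(futures):
            try:
                return future.result()
            except Exception as exc:  # a failed racer just drops out
                errors.append(f"{futures[future]}: {exc}")
    finally:
        # Do not block on slower racers. Calls already in flight still finish
        # in the background, so they can still incur provider cost.
        pool.shutdown(wait=False, cancel_futures=True)
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

The tradeoff is visible in the final comment: racing buys lower latency by paying for more than one call, so it suits latency-sensitive requests rather than every request.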

The honest framing: LiteLLM is a great open-source choice when you want to self-host and own the deployment. flo2 is the hosted, zero-markup alternative when you'd rather skip the proxy entirely. Many developers will try both before deciding which fits.

Want to go deeper? Read "what is an LLM gateway" for the broader concept, or see our "best LLM gateway" comparison to weigh the options side by side. When you're ready to route without running anything, give flo2 a try.
