2026-06-03 · flo2 blog

OpenRouter vs LiteLLM: Hosted Aggregator vs Self-Hosted Proxy

The OpenRouter vs LiteLLM question comes up the moment you decide to stop hand-wiring a separate SDK for every model provider. Both give you one interface in front of OpenAI, Anthropic, Google, Mistral, and many others — but they answer the problem from opposite ends. OpenRouter is a hosted aggregator: you load credits and call a vast model catalog through someone else's infrastructure. LiteLLM is an open-source SDK and proxy you run yourself to unify those providers under your own keys. This guide compares them fairly on setup and ops, pricing, control, observability, and lock-in, then shows where a hosted-but-BYOK option sits between them. Both are good tools; the right pick depends on what you are optimizing for.

If you want the background first, our explainers on what OpenRouter is and what LiteLLM is cover each one in depth. This piece assumes you roughly know what they do and are deciding which to adopt.

OpenRouter vs LiteLLM: the fundamental difference

Strip away the feature lists and the two products differ on one axis: who runs the thing and who holds the provider relationship.

OpenRouter is a managed service. You sign up, top up a prepaid balance, and get a single OpenAI-compatible key that reaches hundreds of models — including providers you have no account with. OpenRouter operates the routing infrastructure, buys inference upstream, and bills you from your credit balance. Nothing to deploy, and you can be calling frontier models in minutes.

LiteLLM is software, not a service. The core is an open-source Python SDK that normalizes 100+ providers behind an OpenAI-style call; the LiteLLM Proxy wraps that into a server you host yourself, exposing one endpoint with keys, budgets, and logging. Crucially, the requests go out under your provider keys, so you pay OpenAI, Anthropic, and the rest directly. There is no reseller in the money path because you are the one running the path.

So the choice is rarely "which is better software." It is: do you want a hosted catalog with zero ops, or do you want to own the proxy and pay providers at list price? Everything below follows from that fork.

Setup and operations: zero-ops vs you-run-it

This is the most concrete difference and often the deciding one.

With OpenRouter, setup is a base URL, an API key, and a credit top-up. There is no server to stand up, scale, secure, or patch. Availability, capacity, and upgrades are the platform's problem. For a small team or an early-stage product, that is a meaningful amount of work you simply never do.
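
To make "a base URL and an API key" concrete, here is a minimal sketch that builds an OpenAI-style chat request using only the Python standard library. The base URL is OpenRouter's public API root; the helper names `build_chat_request` and `send` are made up for this example, and the response parsing assumes the usual OpenAI-compatible reply shape.

```python
import json
import urllib.request

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def build_chat_request(base_url, api_key, model, messages):
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response body. Calling this spends credits.
    """
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

In practice most teams would point an existing OpenAI SDK client at the same base URL rather than hand-building requests. The point of the sketch is that the only OpenRouter-specific inputs are the URL and the key.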

With LiteLLM, the SDK is trivial to start with — a pip install and you are normalizing calls in your own code. The proxy is where ops enters. Running it in production means you:

Deploy and scale it. Containers, autoscaling, and a high-availability setup so the proxy does not become a single point of failure in front of every model call.
Secure it. Provider key storage, network policy, auth, secret rotation, and timely upgrades are all yours.
Monitor it. Uptime, latency, and alerting have to be wired into your stack. If the proxy is down at 2am, your LLM features go with it.
Maintain it. Tracking releases and keeping pace with provider API changes is ongoing engineering time.

None of this is a knock on LiteLLM — running your own proxy is exactly the point for teams that want control. It is just real work that OpenRouter absorbs for you. Be honest about your team's capacity here, because it is the trade-off that bites later, not on day one.

Pricing model: credits and possible markup vs pay-providers-directly

The economics differ as sharply as the ops.

OpenRouter sells you inference through a prepaid balance. Value flows provider → OpenRouter → you, and an aggregator that resells access has to make its money somewhere — historically through a small fee on credit purchases and, for some routes, a margin or spread. None of that is dishonest, but it means the price you pay is not always the provider's exact published rate, and mapping spend line-by-line to provider list prices can be fiddly. In exchange you get one invoice instead of reconciling five, and instant access to models you have no account with.

LiteLLM adds no token markup by design: it is your proxy, calling under your keys, so you pay each provider directly at list price. The "cost" is the infrastructure you run and the engineering time to keep it healthy — a fixed operational cost rather than a per-token spread. If you already have committed spend, negotiated discounts, or free credits with a provider, LiteLLM lets you use them directly; routing through a reseller can mean leaving some of that on the table.
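
The trade between a per-token spread and a fixed operational cost can be put into rough numbers. The sketch below is a deliberately simplified model: it assumes the hosted option adds a flat percentage on top of provider spend and that self-hosting costs a fixed amount per month. Both the fee rate and the ops cost are hypothetical inputs, not published figures for either product.

```python
def monthly_cost_hosted(provider_spend, fee_rate):
    """Monthly cost when a flat percentage fee sits on top of provider spend."""
    return provider_spend * (1 + fee_rate)


def monthly_cost_self_hosted(provider_spend, fixed_ops_cost):
    """Monthly cost when you pay list price plus a fixed cost to run the proxy."""
    return provider_spend + fixed_ops_cost


def break_even_spend(fixed_ops_cost, fee_rate):
    """Provider spend at which both options cost the same.

    Solves spend * (1 + fee_rate) == spend + fixed_ops_cost for spend.
    """
    if fee_rate <= 0:
        raise ValueError("fee_rate must be positive")
    return fixed_ops_cost / fee_rate
```

With a hypothetical 5% fee and 500 dollars a month of infrastructure and engineering time, the break-even sits at roughly 10,000 dollars of monthly provider spend. Below that, the fee is cheaper than running the proxy; above it, the fixed cost wins. Real fee structures are rarely this clean, so treat the result as an order-of-magnitude guide.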

Control, observability, and lock-in

Control and customization. LiteLLM wins on raw control: it is open source, runs in your own network, and you can fork, patch, and configure routing strategies, retries, and fallbacks however you like. OpenRouter trades that flexibility for convenience — you configure within its product surface, not its source.
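
For a sense of what "retries and fallbacks" means mechanically, here is a generic sketch of a fallback chain. It is not LiteLLM's implementation or configuration format; the function name, the `(name, callable)` target shape, and the parameters are all invented for illustration.

```python
import time


def call_with_fallback(targets, prompt, retries_per_target=1, backoff_seconds=0.0):
    """Try each (name, call) pair in order and return the first success.

    Each `call` takes the prompt and either returns a reply or raises.
    A target is attempted 1 + retries_per_target times before moving on.
    """
    errors = []
    for name, call in targets:
        for attempt in range(1 + retries_per_target):
            try:
                return name, call(prompt)
            except Exception as exc:  # a real proxy would match specific errors
                errors.append((name, attempt, repr(exc)))
                if backoff_seconds:
                    # Linear backoff before the next attempt or next target.
                    time.sleep(backoff_seconds * (attempt + 1))
    raise RuntimeError(f"all targets failed: {errors}")
```

Owning the proxy means you decide details like which errors are retryable, how long to back off, and in what order targets are tried. A hosted product exposes some of these as settings, but not the code itself.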

Observability. OpenRouter gives you a hosted dashboard of usage and spend out of the box. LiteLLM emits rich logs and integrates with tools like Langfuse, Prometheus, and OpenTelemetry — but turning that into a usable view of spend per model, per app, and per user is work you assemble yourself. The platform hands you a dashboard; the self-hosted proxy hands you the data and the wiring.
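
The "work you assemble yourself" is mostly aggregation. As a minimal sketch, assuming you already export one usage record per call, the function below rolls spend up by any field. The record fields (`model`, `app`, `input_tokens`, `output_tokens`) and the price table layout are hypothetical, not a log schema from either tool.

```python
from collections import defaultdict


def spend_by(records, prices_per_million, key):
    """Sum estimated dollars per value of `key` (for example "model" or "app").

    `prices_per_million` maps a model name to its input and output price
    in dollars per one million tokens.
    """
    totals = defaultdict(float)
    for record in records:
        price = prices_per_million[record["model"]]
        cost = (
            record["input_tokens"] * price["input"]
            + record["output_tokens"] * price["output"]
        ) / 1_000_000
        totals[record[key]] += cost
    return dict(totals)
```

Calling `spend_by(records, prices, "model")` and `spend_by(records, prices, "app")` over the same records gives the two views a hosted dashboard would show by default. The harder parts in production are keeping the price table current and attributing calls to apps and users in the first place.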

Lock-in. Both keep your application code portable, since each speaks an OpenAI-compatible surface, so swapping is mostly a base-URL change. The deeper question is where your relationships and balance live. With OpenRouter, your catalog access and prepaid credits sit on the platform. With LiteLLM, your provider keys and accounts remain entirely yours — the proxy is just a process you control.

OpenRouter vs LiteLLM vs a hosted BYOK gateway: comparison

Here is the balanced, three-way view. The third column is the hybrid: hosted like OpenRouter, but BYOK and zero-markup like LiteLLM.

Dimension	OpenRouter (hosted aggregator)	LiteLLM (self-hosted proxy)	Hosted BYOK gateway (e.g. flo2)
Who runs it	OpenRouter; nothing to deploy	You deploy, scale, and operate the proxy	Managed for you; nothing to deploy
Time to first call	Minutes — top up credits, change base URL	After you stand up and configure the server	Minutes — connect your keys, change base URL
Pricing model	Prepaid credits bought through the platform	You pay providers directly under your keys	BYO keys; you pay providers directly
Markup	Possible fee or spread on routed tokens	None — your proxy, your keys	Zero markup; not in the money path
Provider relationship	Held by the aggregator	Held by you	Held by you (discounts, DPAs, enterprise terms)
Model catalog	Very broad; access without your own accounts	100+ providers via config and your keys	Major providers via your keys (OpenAI, Anthropic, Gemini, Groq, Cerebras, DeepInfra, Mistral, xAI, OpenRouter)
Routing & fallback	Yes, across upstream providers	Yes (strategies, retries, fallbacks)	Yes: smart routing, fallback chains
Racing & A/B + judge	Limited	Build it yourself	Built in: racing, A/B with a model-fit judge
Observability	Hosted dashboard	Logs + integrations you wire up	True per-call cost accounting + dashboard
API compatibility	OpenAI-compatible	OpenAI-compatible	OpenAI- and Anthropic-compatible
Ops burden	On the platform	On your team (upgrades, security, uptime)	On the gateway operator

Read this as a preference map, not a scoreboard. None of the three is the universal answer.

Who each one suits

Choose OpenRouter if you want the broadest possible catalog behind one balance with essentially zero setup, you value a single invoice, and you want to reach models you have no provider account for. For prototyping, breadth, and "I just want it working today," it is hard to beat — and the convenience is worth a small spread to many teams.

Choose LiteLLM if you want maximum control and direct-to-provider pricing, you need the gateway running inside your own network for data-residency or governance reasons, and you have the operational capacity to deploy, secure, and monitor a service in the critical path. It is a mature, well-respected open-source project and the right call when ownership matters more than convenience.

Consider the hybrid — hosted but BYOK — if you want LiteLLM's economics (pay providers directly, no markup) without LiteLLM's ops, and you want a managed dashboard on day one. That is the gap the third column fills.

Where flo2 fits between them

flo2 is a developer-first, hosted LLM gateway built to sit exactly where the two approaches leave a gap. Like OpenRouter, it is fully hosted — there is no proxy to deploy, scale, or patch. Like LiteLLM, it is BYOK with zero token markup: you connect your own provider accounts — OpenAI, Anthropic, Gemini, Groq, Cerebras, DeepInfra, Mistral, xAI, and OpenRouter itself — and pay each provider directly at their real price. flo2 never sits in the money path, so there is no per-token margin on top.

In exchange for one key that is drop-in compatible with both the OpenAI and Anthropic APIs, you get the routing layer most teams otherwise self-host:

Smart routing sends each request to the cheapest or fastest qualifying model, so trivial calls do not hit a flagship at flagship prices.
Fallback chains fail over across providers and models when one errors or rate-limits, instead of dropping the request.
AI racing fires the same prompt at several models in parallel and returns the fastest response when latency matters.
A/B testing with a judge scores "model–task fit" on your real traffic, so you promote winners on evidence, not vibes.
Opt-in caching stops you paying for the same answer twice.
True per-call cost accounting in a managed dashboard — real dollars per request, reconcilable against your provider invoices.
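
The racing item in the list above follows a common concurrency pattern. The sketch below shows that pattern in general terms using Python's standard thread pool; it is not flo2's implementation, and the `race` function and its `(name, callable)` target shape are invented for this example. It assumes at least one target is supplied.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def race(targets, prompt):
    """Send the prompt to every target in parallel; return the first success.

    `targets` is a non-empty list of (name, call) pairs. A target that raises
    is skipped, so the winner is the fastest call that actually succeeds.
    """
    pool = ThreadPoolExecutor(max_workers=len(targets))
    futures = {pool.submit(call, prompt): name for name, call in targets}
    errors = []
    try:
        for future in as_completed(futures):
            try:
                return futures[future], future.result()
            except Exception as exc:
                errors.append((futures[future], repr(exc)))
        raise RuntimeError(f"all targets failed: {errors}")
    finally:
        # Do not block on the slower calls; cancel any that have not started.
        pool.shutdown(wait=False, cancel_futures=True)
```

Note the cost side of this design: calls already in flight keep running after a winner is returned, so with your own keys you still pay for the losing requests. Racing buys latency with extra tokens, which is why it suits latency-sensitive paths rather than every call.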

The honest framing, the same one that applies to each tool above: BYOK means you still own your provider keys, quotas, and rate limits, and you need accounts with the providers you want to route to. flo2 takes the deployment, scaling, and monitoring off your plate; it does not take over your provider relationships — which is the point, since you keep any discounts or enterprise terms you have negotiated. And unlike a self-hosted proxy, the catalog is the set of providers you connect, not every backend LiteLLM supports. flo2 is free during Beta.

The bottom line

In the LiteLLM vs OpenRouter decision, OpenRouter optimizes for convenience and catalog breadth through a hosted, credit-based model, while LiteLLM optimizes for control and direct-to-provider pricing through a proxy you run yourself. Neither is wrong; they simply make opposite trade-offs on the run-it-yourself axis. If you want OpenRouter's zero-ops experience with LiteLLM's pay-providers-directly economics, a hosted BYOK gateway is the natural middle path. Point your existing OpenAI or Anthropic SDK at flo2, connect your own keys, and compare the real costs against your invoices. It is free to try during Beta.
