OpenRouter Pricing Explained (2026): Credits, Fees & BYOK
If you route LLM traffic through a hosted aggregator, sooner or later you ask the obvious question: what is OpenRouter pricing actually made of, and how much does OpenRouter cost compared with calling a provider yourself? OpenRouter is a genuinely useful product (one API key reaches hundreds of models from OpenAI, Anthropic, Google, Meta, Mistral, and more), but the way it bills is layered, and "the price of the model" is only part of the story. This guide breaks the pricing model down conceptually so you can read your own bill, spot where margin can hide, and decide when a bring-your-own-key (BYOK) gateway is the cheaper long-run path.
One ground rule up front: OpenRouter's exact fees and percentages change over time, so this article explains the structure rather than quoting numbers that will be stale by the time you read them. For current figures, always confirm on OpenRouter's own pricing page.
How OpenRouter pricing works (the three layers)
It helps to think of what you pay as a stack of separate components, not a single rate. At a high level there are three.
1. The underlying model's per-token price
Every model is quoted per million tokens, split into input (your prompt) and output (the reply), with output usually costing several times more than input. OpenRouter passes these per-token prices through from the upstream provider. This is the bulk of most bills, and for a given model it tracks what that provider charges. Because OpenRouter can route the same model across multiple upstream hosts, the exact per-token rate for a model can vary by which provider serves it.
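To make the per-million quoting concrete, here is a minimal sketch of how one call's raw inference cost falls out of its token counts. The class name, function name, and prices are hypothetical placeholders, not figures from any real price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Per-million-token prices in USD (placeholder values only)."""
    input_per_million: float
    output_per_million: float


def call_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the raw inference cost in USD for a single call."""
    return (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    ) / 1_000_000


# Hypothetical model where output costs four times as much as input.
example_price = ModelPrice(input_per_million=2.00, output_per_million=8.00)
cost = call_cost(example_price, input_tokens=1_500, output_tokens=500)
print(f"${cost:.4f}")  # prints "$0.0070"
```

Even though the reply here is a third the length of the prompt, it accounts for more than half of the cost, which is why the input/output split matters when you estimate a bill.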
2. The credits / payment-processing component
OpenRouter works on prepaid credits: you load a balance with a card or other method, and per-call usage is drawn down from it. Buying credits is where a platform fee or payment-processing component typically lives — the cost of topping up your balance is not always identical to the credit value you receive. This is normal for a hosted service (someone has to cover card fees and run the infrastructure), but it means a dollar of spend on the platform is not always a dollar of raw inference. The precise mechanics and any minimums are exactly the kind of thing to verify on the current pricing page.
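One way to see the gap between what you are charged and what you can spend is to compute the cost of each dollar of credit from a top-up receipt. This is a sketch only: the function name and both amounts are hypothetical, so substitute the figures from your own receipt.

```python
def cost_per_credit_dollar(amount_charged: float, credits_received: float) -> float:
    """Dollars actually paid for each dollar of usable credit."""
    if credits_received <= 0:
        raise ValueError("credits_received must be positive")
    return amount_charged / credits_received


# Hypothetical top-up: the card is charged 105.00 and the balance grows by 100.00.
multiplier = cost_per_credit_dollar(amount_charged=105.00, credits_received=100.00)
print(f"Each $1.00 of credit costs ${multiplier:.4f}")
```

A multiplier of exactly 1.0 would mean the top-up carried no fee; anything above it is a cost that applies to every token you later buy with those credits.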
3. The BYOK option (and its possible surcharge)
OpenRouter also offers a bring-your-own-key mode: you attach your own provider API keys and OpenRouter routes through them, so the underlying tokens are billed to your provider account rather than drawn from credits. The convenience layer stays, but using your own keys through the platform may carry its own surcharge on the routed traffic. Whether that surcharge applies, and how it is calculated, is a current-pricing detail — check before assuming BYOK-through-an-aggregator equals provider list price.
Where "markup" can hide
"Markup" is a loaded word, so let's be precise and fair. OpenRouter is not doing anything dishonest — a hosted aggregator has real costs and is entitled to cover them. The point is that several small components can sit between the provider's published price and your effective cost, and they are not always easy to see line by line:
- On the credits path: a fee or spread when you buy or spend credits means your effective per-token cost can be slightly above the provider's list price.
- On the BYOK path: a percentage surcharge on traffic routed through your own keys.
- In routing choice: when a model is available from several upstream providers at different prices, which one served your request affects what you paid — and that is abstracted away from you.
- In blended reporting: usage shown as a credit drawdown is harder to map back to "model X charged me $Y for this exact call" than a direct provider invoice.
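The routing point in the list above can be illustrated with a traffic-weighted average. The sketch assumes you know (or can estimate) what share of your tokens each upstream host served; the function name, shares, and prices are all hypothetical.

```python
def blended_price(shares_and_prices: list[tuple[float, float]]) -> float:
    """Traffic-weighted average price per million tokens.

    Each tuple is (share_of_tokens, price_per_million); shares must sum to 1.
    """
    total_share = sum(share for share, _ in shares_and_prices)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * price for share, price in shares_and_prices)


# Hypothetical split: 70% of tokens served at 3.00 per million, 30% at 4.00.
rate = blended_price([(0.7, 3.00), (0.3, 4.00)])
print(f"Blended rate: ${rate:.2f} per million tokens")
```

If the split shifts toward the pricier host, the blended rate rises even though the model name on your bill stays the same.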
None of this is a reason to avoid OpenRouter. It is a reason to know which components apply to your account so you can compare honestly against paying a provider directly.
How to compare true cost
To answer "how much does OpenRouter cost me" versus an alternative, normalize everything to the same unit: effective dollars per million input and output tokens for the specific models you actually use. A clean comparison looks like this:
| Cost element | Resold-credits aggregator (e.g. OpenRouter) | BYOK zero-markup gateway (e.g. flo2) |
|---|---|---|
| Per-token model price | Passed through from the upstream provider | Provider's list price, paid directly |
| Top-up / payment component | Possible fee or spread when buying credits | None — you pay the provider, not the gateway |
| BYOK surcharge | Possible surcharge on routed own-key traffic | Zero markup; the gateway is not in the money path |
| Cost visibility | Credit drawdown; can be blended/abstracted | True per-call cost at provider list prices |
| Provider relationship | Held by the aggregator | Held by you (discounts, committed spend, DPAs) |
The practical recipe: pull the per-million input and output prices for your top few models, multiply by your real token mix, then add any platform fee or surcharge that applies to your usage on top. Do the same arithmetic for paying the provider directly. The difference is your "convenience tax," and whether it is worth it depends entirely on your volume and how much you value not managing keys. To make the model-price side of that math concrete, flo2's LLM price comparison lays out per-token input/output prices across providers so you can sanity-check what the underlying tokens should cost before any platform fee.
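The recipe above can be sketched in a few lines. Everything here is an assumption for illustration: the model names, token volumes, prices, and the 5% fee are placeholders, and the fee is modeled as a flat percentage on all spend, which may not match how any particular platform actually charges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelUsage:
    """Monthly token volume and per-million prices (USD) for one model."""
    name: str
    input_tokens: int
    output_tokens: int
    input_per_million: float
    output_per_million: float

    def raw_cost(self) -> float:
        """Model cost before any platform fee or surcharge."""
        return (
            self.input_tokens * self.input_per_million
            + self.output_tokens * self.output_per_million
        ) / 1_000_000


def total_cost(usage: list[ModelUsage], fee_rate: float = 0.0) -> float:
    """Sum raw model cost, then apply a flat percentage fee (an assumption)."""
    raw = sum(u.raw_cost() for u in usage)
    return raw * (1.0 + fee_rate)


# Hypothetical monthly usage for two models.
usage = [
    ModelUsage("model-a", 40_000_000, 10_000_000, 2.00, 8.00),
    ModelUsage("model-b", 100_000_000, 20_000_000, 0.50, 1.50),
]

direct = total_cost(usage)                        # paying providers directly
via_platform = total_cost(usage, fee_rate=0.05)   # same token prices plus a fee
convenience_tax = via_platform - direct

print(f"direct:          ${direct:,.2f}")
print(f"via platform:    ${via_platform:,.2f}")
print(f"convenience tax: ${convenience_tax:,.2f}")
```

Because the fee is a percentage, the convenience tax grows linearly with spend: doubling the token volumes in `usage` doubles the tax.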
The real trade-off: convenience vs. zero markup
Strip away the detail and the decision is simple. OpenRouter buys you enormous convenience: one balance, one key, hundreds of models including ones you have no account for, and built-in routing and fallback across upstream providers. For prototypes, spiky workloads, and teams that do not want to manage a dozen provider accounts, that convenience is the value, and the layered pricing is a fair price for it.
A BYOK zero-markup gateway makes the opposite trade. You keep your own accounts with OpenAI, Anthropic, Google Gemini, Groq, Cerebras, DeepInfra, Mistral, and xAI; the gateway holds your keys and orchestrates calls; and you pay each provider directly at its list price. The gateway charges nothing on top because it never sits in the money path. You hold the provider relationship, so enterprise terms, volume discounts, and existing credits stay yours, and you see exactly what every call cost. The honest caveat: you manage those keys, quotas, and rate limits. That is a small operational responsibility traded for transparency and no per-call tax.
Neither answer is universally right. The clean rule of thumb: if you are optimizing for setup speed and catalog breadth, an aggregator's convenience wins; if you are optimizing for cost transparency and you already hold (or are happy to hold) provider keys, a zero-markup BYOK gateway wins — and the more volume you push, the more a recurring percentage on every call matters.
Where flo2 fits
flo2 is the zero-markup BYOK alternative for teams that already have provider keys. You connect your own keys, set per-model prices, and flo2 gives you a single endpoint that is drop-in compatible with both the OpenAI and Anthropic APIs. It routes each request to the cheapest or fastest qualifying model, with fallback chains, AI racing for latency, A/B testing with an LLM judge to find the best model–task fit on your real traffic, and opt-in response caching so you never pay for the same answer twice. Because flo2 is not in the money path, it adds zero token markup and logs true cost per call at provider list prices — you pay providers directly. It is free during Beta, so the cheapest way to compare your real numbers against an aggregator's blended rate is to point a key at it and watch the per-call costs.
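For readers who want a feel for what "cheapest qualifying model with a fallback chain" means in practice, here is a generic sketch of the idea. It is not flo2's implementation or API: the `Candidate` type, the `route_cheapest` function, the model names, and the simulated outage are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Candidate:
    name: str
    price_per_million: float  # used only to rank candidates
    qualifies: bool           # e.g. passes your own quality or latency bar


def route_cheapest(
    candidates: list[Candidate], call: Callable[[str], str]
) -> tuple[str, str]:
    """Try qualifying candidates from cheapest up; return the first success."""
    ordered = sorted(
        (c for c in candidates if c.qualifies),
        key=lambda c: c.price_per_million,
    )
    failures = []
    for candidate in ordered:
        try:
            return candidate.name, call(candidate.name)
        except Exception as exc:  # fall through to the next candidate
            failures.append(f"{candidate.name}: {exc}")
    raise RuntimeError("no qualifying model succeeded: " + "; ".join(failures))


def fake_call(model_name: str) -> str:
    """Stand-in for a real provider call; the cheapest model is 'down'."""
    if model_name == "model-cheap":
        raise TimeoutError("simulated outage")
    return f"reply from {model_name}"


candidates = [
    Candidate("model-premium", 10.0, True),
    Candidate("model-cheap", 1.0, True),
    Candidate("model-mid", 3.0, True),
    Candidate("model-unfit", 0.5, False),  # cheapest, but does not qualify
]
chosen, reply = route_cheapest(candidates, fake_call)
print(chosen, "->", reply)  # prints "model-mid -> reply from model-mid"
```

The non-qualifying model is never tried despite its low price, and the outage on the cheapest qualifying model pushes the request to the next one in the chain.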
To go deeper, see "what is an LLM gateway" for how routing and fallback work under the hood, and the full "OpenRouter alternative" breakdown for a side-by-side on pricing, control, and lock-in. Whichever you choose, the move that pays off most is the same: normalize to effective dollars per million tokens, confirm the current fees on the platform's own pricing page, and decide with numbers rather than vibes.