2026-06-03 · flo2 blog

BYOK vs Credits: Two Ways to Pay for LLM Access

When you sign up for an LLM service, you quickly face a fork: pay into a shared credit balance or bring your own key and pay providers directly. The byok vs credits question is not just about price — it shapes billing transparency, compliance posture, vendor risk, and how much your costs grow with scale. This article breaks down both models, shows where each one wins, and helps you choose the right approach for where your project is today.

How Prepaid LLM Credits Work

The credit model is the most common entry point for developers experimenting with language models. You deposit funds into a platform balance — think OpenRouter, a hosted gateway, or any aggregator — and that balance is debited as you make calls. The platform holds accounts with the underlying providers, buys inference at negotiated or wholesale rates, and resells tokens to you at its own pricing.

This model is genuinely convenient, especially early on:

The trade-off is that you are buying resold inference. The platform sets its own prices, and the spread between what it pays providers and what it charges you is how it earns revenue. That spread is not always obvious — per-token rates may look close to list price, but deposit fees, minimum balances, or blended routing across cheaper infrastructure can obscure the real cost. You also inherit the aggregator's rate limits, uptime, terms of service, and data-handling practices, because your traffic runs through its infrastructure using its provider accounts.

How BYOK (Bring Your Own Key) Works

The BYOK explained pattern decouples the convenience layer from the billing relationship. You create API keys directly in each provider's dashboard — OpenAI, Anthropic, Google, Groq, Mistral — and register those keys with a gateway. When your application calls the gateway, it routes the request to the right provider using your key. The provider charges your account at its published list price. The gateway never sits in the token transaction at all.

What you gain:

The real costs of BYOK are operational, not financial. You manage multiple provider accounts, multiple invoices, and multiple API keys. Onboarding a new provider means creating an account and going through any verification process they require. For very early-stage projects, this overhead can slow you down when you just want to prototype.

Comparing the Two Models

Dimension Prepaid credits / aggregator BYOK (direct provider keys)
Token pricing Aggregator's rate; markup possible Provider list price; no markup
Volume discounts Aggregator's negotiated rates (may or may not pass through) Your own discounts apply automatically
Billing One balance, one invoice Separate invoice per provider
Model access Instant; aggregator handles accounts You must open accounts per provider
Data path Through aggregator infrastructure Direct to provider under your account
Compliance Inherits aggregator's terms and DPA Your DPA with each provider; auditable
Vendor lock-in Balance tied to aggregator Gateway-agnostic; keys are yours
Setup effort Minutes Hours (one-time, per provider)
Cost transparency Can be opaque across providers Fully transparent; reconciles to invoice

When Credits Make Sense

Prepaid credits are a reasonable choice in specific situations:

Credits are a practical starting point. The issue is that many teams stay on them longer than makes sense, even as volume grows and the markup compounds.

When BYOK Makes Sense

The balance tilts toward BYOK as soon as any of the following apply:

A note on the transition

Teams often start on credits and move to BYOK when their volume justifies it. The transition is not painful if your gateway code is already abstracted behind a single endpoint — you update key configuration, not application code. The main work is opening provider accounts, which is a one-time task.

flo2: A BYOK Gateway With No Token Markup

flo2 is built specifically for the BYOK model. You register your provider API keys, and flo2 uses them to route requests — applying fallback, model selection, and cost tracking — while charging nothing on top of what providers charge you. There is no token resale, no credit balance to manage, and no markup to account for. During the current beta, the gateway service itself is free.

For teams running meaningful LLM workloads who want routing and observability without giving up price transparency, flo2 is worth evaluating alongside the aggregator options you may already be using. If you are already on a credit model and the math is starting to matter, the comparison is straightforward: add your provider keys, point your application at the flo2 endpoint, and compare what you see on provider invoices against what you were paying before.

Both models serve real needs. Credits are the right starting point for many projects. BYOK is usually the right destination once volume, compliance, or cost control becomes a serious concern — and a zero-markup LLM gateway like flo2 is how you get the routing benefits of an aggregator without the token margin.

One key, every model — zero markup.
Bring your own provider keys. flo2 routes to the cheapest, fastest model with fallback, racing and true cost accounting — free during Beta.
Get your flo2 key →
© 2026 flo2.com — the zero-markup LLM gateway & router. flow → to