2026-06-03 · flo2 blog

Databricks (Mosaic) AI Gateway Explained: Features & Fit

If your data and machine learning workloads already live inside Databricks, the Databricks AI Gateway — sometimes called Mosaic AI Gateway or Databricks Model Serving Gateway — is the governance and routing layer that comes with the platform. It lets you call external LLM providers and Databricks-hosted models through a unified, policy-controlled endpoint, without leaving your lakehouse. This article explains what the Databricks AI Gateway is, what it does well, what to weigh before relying on it, and where a standalone, provider-agnostic gateway fits for teams outside the Databricks ecosystem. For a category primer, see what is an AI gateway.

What is the Databricks AI Gateway?

The Databricks AI Gateway (also surfaced under the Mosaic AI umbrella in Databricks' product naming) is a governance and access layer built into the Databricks platform. It provides a single endpoint through which authorized users and applications can reach both external LLM providers — such as OpenAI, Anthropic, and others — and Databricks-hosted models served via Databricks Model Serving. Rather than wiring every notebook, pipeline, or application directly to each provider's API, teams configure routes through the gateway and manage access, rate limits, and usage in one place.

The mental model is a governance and access control layer, not a standalone router or a token reseller. Databricks does not sell you inference tokens through the gateway in place of a provider; the value is in unified access control, audit logging, payload logging, rate limiting, and usage tracking — all anchored in the Databricks platform and integrated with Unity Catalog for permissions. That framing sets expectations for who this product is built for and what it optimizes around.

How the Databricks (Mosaic) AI Gateway works

The gateway is configured inside your Databricks workspace:

For current configuration steps, supported providers, and API shapes, consult Databricks' official documentation — these evolve with platform releases.

Strengths of the Databricks AI Gateway

Considerations before committing

The Databricks AI Gateway is a well-engineered fit for a specific context. Before treating it as a general-purpose LLM gateway, weigh these points:

Databricks AI Gateway vs. a standalone, provider-agnostic gateway

The table below captures the structural differences. Always verify current feature details against each product's documentation.

Dimension Databricks AI Gateway Standalone BYOK gateway (e.g. flo2)
What it is Governance + access layer inside Databricks Developer-first LLM router/proxy, any stack
Ecosystem requirement Requires Databricks workspace Any HTTP, OpenAI, or Anthropic client
Primary design goal Governance, access control, audit, payload logging Routing, fallback, racing, cost accounting
Token markup Part of Databricks platform billing Zero markup — you pay providers directly (BYOK)
Dynamic routing Fixed route to configured endpoint Cheapest/fastest routing, fallback chains, racing
Payload logging Delta tables in your lakehouse Per-call cost accounting at provider list prices
API compatibility Databricks-native; check docs for OpenAI mode Drop-in OpenAI- and Anthropic-compatible key

Where flo2 fits for teams outside the Databricks ecosystem

If your stack does not revolve around Databricks — or you want a routing layer that works regardless of your ML platform — the needs look different. You want a gateway that treats intelligent routing and honest cost accounting as first-class concerns, not as by-products of a larger platform's governance layer.

flo2 is a developer-first LLM gateway built around exactly that job. You bring your own keys for providers like OpenAI, Anthropic, Google Gemini, Groq, Cerebras, DeepInfra, Mistral, and xAI, and flo2 exposes a single key that is drop-in compatible with both the OpenAI API and the Anthropic API. There is zero token markup — flo2 never sits in the money path; you pay each provider directly at their published rates. What flo2 adds on top of that routing layer:

flo2 does not offer Unity Catalog integration, payload logging to Delta tables, or Databricks-specific governance. If that is your need, Databricks is the right product. If you need a portable, zero-markup routing layer that drops into any codebase with a key swap, the Databricks gateway is not the right shape for the job.

How to decide

For a broader comparison, see our best LLM gateway comparison. If you are evaluating a standalone gateway, flo2 is free during its Beta — point an existing OpenAI or Anthropic SDK at it and compare routing, cost, and reliability against your current setup with no contract and no token markup.

One key, every model — zero markup.
Bring your own provider keys. flo2 routes to the cheapest, fastest model with fallback, racing and true cost accounting — free during Beta.
Get your flo2 key →
© 2026 flo2.com — the zero-markup LLM gateway & router. flow → to