# Why Your AI Agents Need a Semantic Layer
AI agents querying raw SQL produce inconsistent, ungoverned results. A semantic layer fixes that. Here's what goes wrong without one, and how metrics-as-code changes the architecture.
Give an AI agent access to your data warehouse and it will write SQL. It might even write good SQL. But ask two agents the same revenue question and you'll get two different numbers. Neither matches finance's report. That's the core problem: text-to-SQL gives agents access to data, not understanding of it.
A semantic layer for AI agents fixes this by putting a governed metrics layer between the agent and the warehouse. The agent queries defined metrics instead of raw tables. Every consumer, whether it's an AI agent, a dashboard, or an API, gets the same answer because the calculation is fixed, not interpreted per query.
This isn't a new idea. Semantic layers have existed in BI for decades. What's new is that AI agents make the problem worse and the solution more urgent.
## What actually goes wrong without a semantic layer?
Three failure modes show up in production. All of them trace back to the same root cause: the agent interprets business logic on every query instead of referencing a shared definition.
### Inconsistent answers
Ask Claude "What was Q1 revenue?" through text-to-SQL. It writes a query summing `amount` from `orders` where `created_at` is between January 1 and March 31. Reasonable.
Ask GPT the same question. It filters on `status = 'completed'` first, then sums. Different number. Also reasonable.
Neither agent knows that your finance team excludes refunds and trial conversions from revenue. That logic lives in a Notion doc that got updated six months ago. The agents don't have access to it. They generate plausible SQL from column names and return plausible numbers that don't match anything your team reports.
This isn't a model quality problem. It's a context problem. The agent sees schema, not semantics.
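The divergence is easy to reproduce. Here's a toy sketch (SQLite, invented sample rows): both agents' queries are individually reasonable, and neither matches the finance definition.

```python
import sqlite3

# Toy orders table; rows, statuses, and amounts are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, status TEXT, type TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (100.0, "completed", "standard", "2024-01-15"),
        (50.0, "refunded", "standard", "2024-02-03"),
        (25.0, "completed", "trial", "2024-03-10"),
    ],
)

# Agent 1: sums everything in the date range.
q1 = conn.execute(
    "SELECT SUM(amount) FROM orders "
    "WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'"
).fetchone()[0]

# Agent 2: filters on status first, then sums.
q2 = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE status = 'completed' "
    "AND created_at BETWEEN '2024-01-01' AND '2024-03-31'"
).fetchone()[0]

# Finance: excludes refunds AND trial conversions.
q3 = conn.execute(
    "SELECT SUM(CASE WHEN status != 'refunded' AND type != 'trial' "
    "THEN amount ELSE 0 END) FROM orders "
    "WHERE created_at BETWEEN '2024-01-01' AND '2024-03-31'"
).fetchone()[0]

print(q1, q2, q3)  # 175.0 125.0 100.0 -- three different "Q1 revenue" numbers
```

Three queries, three answers, and only the agents' two are in production.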
### No access control
You're building a B2B product. Customer A asks your agent a question. The agent writes SQL against the warehouse. Nothing in the text-to-SQL pipeline enforces that Customer A only sees Customer A's data.
You can patch this. Add tenant filters to prompts. Write middleware that rewrites queries. Build a validation layer. Each patch is a new surface for bugs. One missed filter and you've got a data leak in production.
RBAC needs to be structural, not bolted on. It should be impossible for an agent to return data outside its authorized scope, regardless of what SQL it generates.
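A minimal sketch of what "structural" means here (hypothetical layer code, not any product's real API): agents can only name a metric, and the tenant filter is appended by the layer itself, so no generated SQL can omit it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("cust_a", 100.0), ("cust_a", 40.0), ("cust_b", 999.0)])

# Hypothetical metric catalog: the only SQL that runs is SQL the layer renders.
MEASURES = {"total_revenue": "SUM(amount)"}

def query_metric(measure: str, tenant_id: str) -> float:
    """Agents pass a metric name, never SQL; the tenant filter is not optional."""
    if measure not in MEASURES:
        raise KeyError(f"unknown metric: {measure}")
    sql = f"SELECT {MEASURES[measure]} FROM orders WHERE customer_id = ?"
    return conn.execute(sql, (tenant_id,)).fetchone()[0]

print(query_metric("total_revenue", "cust_a"))  # 140.0 -- cust_b's rows are unreachable
```

There is no code path where Customer A's query reaches Customer B's rows, regardless of what the agent asked for.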
### No audit trail
When your CFO asks "where did this number come from?", the answer is "an LLM generated some SQL." That's not an audit trail. You can't trace which metric definition was used because there wasn't one. You can't reproduce the result because the prompt context has changed. You can't prove the number was correct because correctness wasn't defined.
Metric governance requires a fixed, versioned definition that every consumer references. Without it, every query is an ad hoc interpretation of raw data.
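One way to make that concrete (an illustrative pattern, not a specific product's audit schema): pin each query result to a content hash of the metric definition it used, so any reported number can be traced back to an exact version.

```python
import hashlib
import json

# Illustrative metric definition, as it might be checked into Git.
definition = {
    "name": "total_revenue",
    "sql": "SUM(CASE WHEN status != 'refunded' AND type != 'trial' THEN amount ELSE 0 END)",
    "version": "2024-06-01",
}

def definition_hash(d: dict) -> str:
    # Canonical JSON so identical definitions always hash identically.
    return hashlib.sha256(json.dumps(d, sort_keys=True).encode()).hexdigest()[:12]

# Each query's audit entry records which definition produced the number.
audit_entry = {
    "metric": definition["name"],
    "definition_hash": definition_hash(definition),
    "result": 100.0,  # whatever the governed query returned
}
print(audit_entry["definition_hash"])
```

Any change to the definition changes the hash, so "where did this number come from?" has a checkable answer.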
## What is a semantic layer for AI agents?
A semantic layer is a metadata layer between your data warehouse and every data consumer. It defines business metrics, relationships between tables, and access rules in one place. Consumers query metric definitions instead of raw tables.
For AI agents specifically, the semantic layer:
- Translates natural language to governed queries. The agent asks for "revenue." The semantic layer knows that means `SUM(amount) WHERE status != 'refunded' AND type != 'trial'`. The agent never writes this SQL itself.
- Enforces multi-tenancy and access control. Every query runs through row-level security rules defined in the schema. The agent physically cannot return unauthorized data.
- Provides an API, not a database connection. The agent calls a query endpoint or MCP tool, not a database driver. The semantic layer generates the SQL, executes it, and returns structured results.
The term "agentic semantic layer" describes a semantic layer built for this use case: agent-native interfaces (MCP, tool-use APIs), multi-tenant by default, designed for programmatic access rather than human-driven BI.
## How it works: metrics as code
Define your metrics in YAML. Version them in Git. Review changes in pull requests. This is the same workflow your engineering team uses for application code, applied to data definitions.
```yaml
cubes:
  - name: orders
    sql_table: public.orders

    measures:
      - name: total_revenue
        sql: "CASE WHEN status != 'refunded' AND type != 'trial' THEN amount ELSE 0 END"
        type: sum

      - name: order_count
        type: count

    dimensions:
      - name: status
        sql: status
        type: string

      - name: created_at
        sql: created_at
        type: time

      - name: customer_id
        sql: customer_id
        type: string
```
`total_revenue` isn't a column. It's a calculation with your business rules baked in. When finance decides to exclude a new edge case, one diff updates the definition for every consumer. No agent retrained. No dashboard patched. No Slack thread asking "which number is right."
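Under the hood, a semantic layer compiles a definition like the one above into SQL at query time. A simplified sketch of that compilation step (not Bonnard's actual implementation):

```python
# Mirrors the YAML above: a sum measure over a CASE expression.
measure = {
    "name": "total_revenue",
    "sql": "CASE WHEN status != 'refunded' AND type != 'trial' THEN amount ELSE 0 END",
    "type": "sum",
}

def compile_measure(measure: dict, table: str) -> str:
    """Render a measure definition into a SELECT statement."""
    agg = {"sum": "SUM", "count": "COUNT"}[measure["type"]]
    return f"SELECT {agg}({measure['sql']}) AS {measure['name']} FROM {table}"

print(compile_measure(measure, "public.orders"))
```

Because every consumer goes through this one rendering step, changing the YAML changes the SQL everywhere at once.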
Expose this to AI agents via MCP and the agent discovers available metrics at runtime:
```shell
bon deploy
bon mcp
```
The agent calls `explore_schema` to see what metrics exist, then `query` to fetch data. It never generates SQL. It never interprets column names. It queries governed definitions.
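The two-step flow can be sketched as plain functions (stand-ins for the MCP tools; the tool names come from the text above, everything else is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL, status TEXT, type TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(100.0, "completed", "standard"), (50.0, "refunded", "standard")])

# Governed metric catalog the layer exposes to agents.
CATALOG = {
    "total_revenue": "SUM(CASE WHEN status != 'refunded' AND type != 'trial' "
                     "THEN amount ELSE 0 END)",
    "order_count": "COUNT(*)",
}

def explore_schema() -> list:
    """Step 1: the agent discovers which metrics exist."""
    return sorted(CATALOG)

def query(measure: str) -> float:
    """Step 2: the agent requests a metric by name; the layer renders the SQL."""
    return conn.execute(f"SELECT {CATALOG[measure]} FROM orders").fetchone()[0]

print(explore_schema())        # ['order_count', 'total_revenue']
print(query("total_revenue"))  # 100.0
```

The agent's entire interface is two calls; the SQL never leaves the layer.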
For the full setup walkthrough, see How to Connect an AI Agent to Your Data Warehouse.
## Agentic semantic layer vs traditional BI semantic layer
Not every semantic layer works well with AI agents. Most were built for BI tools and retrofitted. The difference matters in production.
| Capability | Traditional BI semantic layer | Agentic semantic layer |
|---|---|---|
| Primary consumer | Dashboards, analysts | AI agents, LLMs, applications |
| Interface | SQL or proprietary query language | MCP, REST API, SDK |
| Multi-tenancy | Afterthought or manual | Built-in, per-query enforcement |
| Access control | Dashboard-level | Row-level, per-consumer |
| Discovery | Human browses catalog | Agent calls `explore_schema` at runtime |
| Caching | Cube-level | Pre-aggregation with automatic invalidation |
| Deployment | UI-driven | CLI, Git, CI/CD |
| Schema management | GUI editor | YAML in version control |
The core difference: an agentic semantic layer treats programmatic access as the default, not a secondary integration. MCP support, multi-tenant keys, and row-level security aren't add-ons. They're the foundation.
## What to look for when evaluating
If you're choosing a semantic layer for AI agent use cases, here's what matters:
MCP or tool-use support. Your agents need a standardized way to discover metrics and query them. MCP (Model Context Protocol) is the emerging standard. Without it, you're writing custom integration code for every agent.
Multi-tenancy. If you're building a B2B product, every agent query needs to be scoped to a specific tenant. This should be structural, not an instruction buried in a prompt.
Row-level security. Beyond tenant scoping, you need fine-grained access control. Marketing agents see marketing data. Finance agents see finance data. Defined in the schema, enforced on every query.
Pre-aggregation. AI agents make more queries than humans. Sub-second response times require cached rollups, not full table scans on every request. Look for configurable pre-aggregation with automatic cache invalidation.
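A rough sketch of the idea (a toy rollup cache; real pre-aggregation engines add refresh schedules, partitioning, and shared storage): aggregate once, serve repeat queries from the cached result, and invalidate when the source data changes.

```python
rows = [100.0, 50.0, 25.0]  # stand-in for warehouse data
cache = {}                  # stand-in for a pre-aggregation store

def total_revenue() -> float:
    if "total_revenue" not in cache:   # cold query: hit the "warehouse"
        cache["total_revenue"] = sum(rows)
    return cache["total_revenue"]      # hot query: served from the rollup

def insert_row(amount: float) -> None:
    rows.append(amount)
    cache.pop("total_revenue", None)   # invalidate on write

print(total_revenue())  # 175.0 (cold: computed from source)
insert_row(10.0)
print(total_revenue())  # 185.0 (cache was invalidated, recomputed)
```

The hot path never touches the source rows, which is what keeps high agent query volume cheap.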
Warehouse coverage. Your semantic layer needs to support your warehouse. At minimum: Snowflake, BigQuery, Databricks, PostgreSQL. Bonus: ClickHouse, DuckDB, Redshift.
Schema-as-code. Metric definitions should live in version control. Changes should go through pull requests. Rollbacks should be `git revert`, not clicking through a UI.
Several tools in this space: Cube pioneered the open-source semantic layer. AtScale and dbt offer semantic layer capabilities for different stacks. Bonnard is built agent-native from the ground up with MCP as a core feature. The right choice depends on your stack, your use case, and whether you need the semantic layer to serve AI agents as its primary consumer or as a secondary integration.
## Where this is heading
The a16z team wrote recently that data agents are "essentially useless without the right context." They're right. But context isn't a feature you add later. It's the layer you build on.
The companies shipping agentic analytics today are the ones that defined their metrics before connecting their agents. The ones struggling are the ones that gave agents raw warehouse access and are now debugging why different tools return different numbers.
A semantic layer isn't optional infrastructure for AI agents. It's the control plane that makes everything downstream trustworthy.
## Frequently asked questions
### Do I need a semantic layer if I already use dbt?
dbt defines transformations: how raw data becomes clean tables. A semantic layer defines metrics: how clean tables become business numbers. They're complementary. dbt gets your data into the right shape. The semantic layer defines what "revenue" means on top of that shape. Most semantic layers, including Bonnard, can import dbt models directly.
### What's the difference between a semantic layer and RAG?
RAG (Retrieval-Augmented Generation) feeds documents to an LLM for context. A semantic layer feeds governed metric definitions to an agent for data queries. RAG is for unstructured knowledge ("What does our refund policy say?"). A semantic layer is for structured data ("What was Q1 revenue?"). You likely need both, but they solve different problems.
### Can I use a semantic layer with Claude, GPT, and open-source models?
Yes. A semantic layer with MCP support works with any MCP-compatible client: Claude Desktop, Cursor, Claude Code, and others. For non-MCP agents, most semantic layers expose REST APIs or SDKs. The semantic layer is model-agnostic because it sits between the agent and the warehouse, not inside the model.
### How is this different from giving agents read-only database access?
Read-only access prevents writes but doesn't prevent incorrect reads. The agent still interprets column names, guesses JOIN conditions, and invents filter logic. A semantic layer replaces interpretation with definition. The agent queries `total_revenue` and gets the exact calculation your finance team agreed on, every time.
### What's the performance impact of adding a semantic layer?
With pre-aggregation, queries typically get faster, not slower. The semantic layer caches rollups so agents query pre-computed results instead of running full aggregations on every request. Cold queries hit the warehouse directly. Hot queries resolve in single-digit milliseconds.
From zero to PoC in a week.
Define your metrics, plug in your agent, and ship governed AI analytics to every customer.