How to Connect an AI Agent to Your Data Warehouse

Define metrics in YAML, expose them via MCP, and let AI agents query governed data instead of raw SQL. Full tutorial with Bonnard in under 30 minutes.


Most teams connecting AI agents to their data warehouse start with text-to-SQL. The agent generates SQL from natural language, runs it against the warehouse, and returns results. It works until it doesn't: hallucinated JOINs, inconsistent aggregations, no access control, no audit trail.

There's a better approach. Define your business metrics in a semantic layer, expose them via MCP (Model Context Protocol), and let any AI agent query governed definitions instead of raw tables. This tutorial shows how to set it up with Bonnard in under 30 minutes.

Why does text-to-SQL break in production?

The agent sees column names but not business logic. It doesn't know that your company excludes refunds from revenue. It doesn't know that status = 'completed' means something different in orders than in subscriptions. It doesn't know that marketing and finance defined "active user" differently three years ago and never reconciled.

So the agent writes plausible SQL and returns plausible numbers. Ask the same question twice with different phrasing and you get different answers. Ask two different agents and you get two different numbers. Neither matches the number your finance team reports.
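This failure mode is easy to reproduce. Below is a minimal sketch using a toy SQLite table with the same amount and status columns used later in this tutorial; both queries are "plausible" answers to "what's our revenue?", yet they disagree:

```python
import sqlite3

# Toy orders table: two completed orders and one refund.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, "completed"), (2, 50.0, "completed"), (3, -40.0, "refunded")],
)

# Two "plausible" revenue queries an agent might generate for the
# same question, depending only on how the question was phrased:
naive = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
filtered = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE status = 'completed'"
).fetchone()[0]

print(naive)     # 110.0 — refund included
print(filtered)  # 150.0 — refund excluded
```

A semantic layer removes this ambiguity by fixing the calculation once, so the phrasing of the question can't change the filter.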

Consistency is only half the problem. There's no row-level security. No multi-tenancy. No audit trail showing which agent queried what, when, and for whom. In production, with real customers, that's a non-starter.

Text-to-SQL gives you speed. It doesn't give you trust.

What is the semantic layer approach?

Instead of letting agents write arbitrary SQL, define your metrics once in YAML: cubes, measures, dimensions, access rules. Then expose those definitions via MCP so agents query governed metrics, not raw tables.

The difference: every agent gets the same answer because the metric definition is fixed. total_revenue isn't a column the agent interprets. It's a pre-defined calculation with agreed-upon filters and aggregations. When your finance team updates the revenue definition to exclude trial conversions, that change propagates to every consumer instantly. No agent retrained. No dashboard patched. One diff in your schema repo.

This architecture also decouples the query interface from the warehouse dialect. Swap BigQuery for Snowflake and your agents don't notice. The semantic layer abstracts the SQL generation, so consumers stay stable while infrastructure evolves underneath.

| Approach | Governance | Consistency | Multi-tenant | Access Control | Setup Time |
| --- | --- | --- | --- | --- | --- |
| Direct SQL | None | Varies by query | Manual | Manual | Minutes |
| Text-to-SQL | None | Varies by prompt | Manual | Manual | Hours |
| Semantic Layer via MCP | Full | Guaranteed | Built-in | Row-level | Under 30 min |

The semantic layer is the control plane between your warehouse and every consumer, whether that's a human analyst, a React component, or an AI agent.

Step 1: Install and initialize

npm install -g @bonnard/cli
bon init --self-hosted
docker compose up -d

npm install -g @bonnard/cli installs the Bonnard CLI globally. bon init --self-hosted scaffolds a new project with Docker Compose configs, a sample schema, and default settings. docker compose up -d starts the Bonnard server locally. The server boots in under 10 seconds on most machines and exposes a REST API on port 4000 by default.

Bonnard supports BigQuery, Snowflake, Redshift, ClickHouse, DuckDB, and PostgreSQL out of the box. You configure your warehouse connection in bonnard.yml or pass credentials via environment variables.
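A warehouse connection in bonnard.yml might look like the sketch below. The exact key names are an assumption for illustration; the sample config scaffolded by bon init is the authoritative reference:

```yaml
# Hypothetical connection block — key names are illustrative
datasources:
  - name: warehouse
    type: postgres            # or bigquery, snowflake, redshift, clickhouse, duckdb
    host: ${DB_HOST}          # credentials injected via environment variables
    port: 5432
    database: analytics
    user: ${DB_USER}
    password: ${DB_PASSWORD}
```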

If you want to explore without connecting your own warehouse, run:

bon datasource add --demo

This provisions a sample PostgreSQL database with orders, customers, and products data so you can follow along with the rest of this tutorial. The demo dataset includes roughly 10,000 rows across three tables, enough to test aggregations, filters, and multi-dimensional queries.

Step 2: Define your metrics

Create a cube that maps to a table in your warehouse and defines the metrics your agents will query.

cubes:
  - name: orders
    sql_table: public.orders
    measures:
      - name: total_revenue
        sql: amount
        type: sum
      - name: count
        type: count
    dimensions:
      - name: status
        sql: status
        type: string
      - name: created_at
        sql: created_at
        type: time

Measures are the numbers you aggregate: sums, counts, averages. Dimensions are the columns you filter and group by: status, date, category. One definition. Every tool, dashboard, and AI agent that queries total_revenue gets the same number.
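Under the hood, a request for total_revenue grouped by status compiles to ordinary SQL against the cube's sql_table. The generated query is roughly equivalent to the following (exact output varies by warehouse dialect):

```sql
-- Roughly what the semantic layer generates for
-- measures: [total_revenue], dimensions: [status]
SELECT
  status,
  SUM(amount) AS total_revenue
FROM public.orders
GROUP BY status;
```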

You can also define pre_aggregations in the same file to cache expensive computations. For example, a daily rollup of total_revenue by status can cut query times from seconds to single-digit milliseconds. Bonnard rebuilds these rollups on a configurable schedule and invalidates stale caches automatically.
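A daily rollup for the cube above might be declared as follows; treat the field names (time_dimension, granularity) as an assumption about the syntax rather than the documented schema:

```yaml
cubes:
  - name: orders
    # ...measures and dimensions as defined above...
    pre_aggregations:
      - name: revenue_by_status_daily   # hypothetical rollup definition
        measures:
          - total_revenue
        dimensions:
          - status
        time_dimension: created_at
        granularity: day
```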

Schemas are version-controlled alongside your application code. Review metric changes in pull requests. Roll back a bad definition with git revert. Your data contracts get the same CI/CD workflow as your product.

To control what's exposed to specific consumers, define a view:

views:
  - name: order_metrics
    cubes:
      - join_path: orders
        includes:
          - total_revenue
          - count
          - status
          - created_at

Views act as a curated interface. Your agents see order_metrics with four fields instead of navigating the full schema.

Step 3: Deploy and connect your agent

Deploy your schema and start the MCP server:

bon deploy
bon mcp

bon mcp outputs connection configs for Claude Desktop, Cursor, Claude Code, and any MCP-compatible client. The output looks like this:

{
  "mcpServers": {
    "bonnard": {
      "command": "npx",
      "args": ["-y", "@bonnard/cli", "mcp"],
      "env": {
        "BONNARD_API_URL": "http://localhost:4000"
      }
    }
  }
}

Paste this into your MCP client's config file and restart. For Claude Desktop, that's ~/Library/Application Support/Claude/claude_desktop_config.json. For Cursor, it's .cursor/mcp.json in your project root. Claude Code reads from .mcp.json in your project directory.

The MCP server handles tool discovery, schema introspection, and query execution over stdio. Your agent sees available metrics the same way it sees any other MCP tool. No custom integration code required.

For customer-facing use cases, generate publishable keys per tenant with bon keys create. Each key scopes the connection to a specific tenant's data with row-level security enforced automatically. Keys can be rotated or revoked from the CLI or the REST API without redeploying your schema.
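In practice the key lifecycle might look like the session below. The --tenant flag and the subcommands beyond keys create are assumptions for illustration; check the CLI help for the actual interface:

```shell
# Hypothetical flags and subcommands — shown for illustration only
bon keys create --tenant acme-corp   # publishable key scoped to one tenant's rows
bon keys rotate <key-id>             # swap credentials without downtime
bon keys revoke <key-id>             # cut off a tenant immediately
```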

Step 4: Query from your agent

Once connected, your AI agent can discover and query your metrics using natural language. Behind the scenes, the agent calls MCP tools.

Ask: "What's our total revenue this quarter?"

The agent calls explore_schema to discover available metrics, then calls query with the right measures and time filters. The response comes back as structured data, not raw SQL results.

Ask: "Break down order count by status for the last 30 days."

Same flow. The agent uses query with count as the measure, status as the dimension, and a date filter on created_at.
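The tool call behind that second question might look like the payload below. The argument shape is an assumption based on the cube defined earlier; in practice the agent learns the real schema from MCP tool discovery rather than from documentation:

```json
{
  "tool": "query",
  "arguments": {
    "measures": ["orders.count"],
    "dimensions": ["orders.status"],
    "timeDimensions": [
      { "dimension": "orders.created_at", "dateRange": "last 30 days" }
    ]
  }
}
```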

Bonnard exposes four MCP tools:

  • explore_schema: Discover available cubes, measures, and dimensions
  • query: Fetch aggregated data using governed metric definitions
  • sql_query: Run queries for cases that need custom SQL (still governed by access controls)
  • describe_field: Get metadata about a specific measure or dimension

Every query runs through the semantic layer. The agent never touches raw tables. Read more about the architecture in our agentic analytics guide.

What do you get beyond MCP?

Bonnard isn't only an MCP server. The same semantic layer powers multiple surfaces:

  • React SDK: Drop BarChart, LineChart, and BigValue components into your product with @bonnard/react
  • Markdown dashboards: Author dashboards in markdown, deploy with bon deploy, per-tenant access built in
  • REST API and TypeScript SDK: Type-safe queries from @bonnard/sdk for custom integrations
  • Pre-aggregation caching: Define rollups in YAML for sub-second query performance at scale
  • RBAC and audit logging: Track every query by agent, app, or user with full access control

One schema definition. Every consumer gets governed, consistent data. This matters most at scale: teams using Bonnard report cutting metric discrepancies across tools to zero, because there is only one source of truth. Read more about embedded analytics use cases.

From raw SQL to governed metrics

You started with a warehouse and an AI agent that writes its own SQL. Now you have governed metrics defined in YAML, exposed via MCP, queryable from any AI tool your team or customers use. Setup took under 30 minutes.

The shift from raw SQL to governed metrics pays off immediately: consistent numbers across every consumer, row-level access control per tenant, and a full audit trail for every query. As your team adds more agents and surfaces, the semantic layer scales with you. No duplicate logic. No drift between dashboards and AI answers.

Self-host free under Apache 2.0 or use Bonnard Cloud for managed infrastructure. Check the pricing page for details on Cloud plans and self-hosted support tiers. The full source is on GitHub. Read the documentation to go deeper.

Frequently asked questions

Do I need to retrain my AI model?

No. Bonnard works with any MCP-compatible AI agent out of the box. Your agent discovers available metrics through the MCP protocol at runtime. There is no fine-tuning, prompt engineering, or model modification required.

What data warehouses does Bonnard support?

Bonnard connects to BigQuery, Snowflake, Redshift, ClickHouse, DuckDB, and PostgreSQL. You configure the connection in bonnard.yml or via environment variables. Adding a new warehouse takes a single CLI command.

How is this different from text-to-SQL?

Text-to-SQL lets an AI agent generate arbitrary SQL from natural language. It has no governance, no consistency guarantees, and no access control. Bonnard's semantic layer defines metrics once in YAML and exposes them as governed APIs. Every consumer gets the same answer because the calculation is fixed, not interpreted per query.

Is Bonnard open source?

Yes. Bonnard is licensed under Apache 2.0 and the full source is available on GitHub. You can self-host with no license fees. Bonnard Cloud is available for teams that want managed infrastructure.

How long does setup take?

Under 30 minutes for most teams. Install the CLI, connect your warehouse, define a few metrics in YAML, and run bon mcp. The tutorial above walks through each step with working code examples.

From zero to PoC in a week.

Define your metrics, plug in your agent, and ship governed AI analytics to every customer.