https://github.com/dumkydewilde/mcp-memory-layer
A template for building your own BI MCP with dbt, LLMs and multi-user corrections
bi data dbt llm mcp-server
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/dumkydewilde/mcp-memory-layer
- Owner: dumkydewilde
- Created: 2026-02-08T22:26:34.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-03-11T09:37:29.000Z (4 months ago)
- Last Synced: 2026-03-11T16:34:54.105Z (4 months ago)
- Topics: bi, data, dbt, llm, mcp-server
- Language: Python
- Homepage:
- Size: 272 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# MCP Memory Layer for Text-to-SQL
An [MCP](https://modelcontextprotocol.io/) server that wraps a DuckDB/MotherDuck data warehouse with a **memory layer** — corrections, dbt model context, and query popularity tracking — so LLMs write better SQL on the first try.
Comes with a ready-to-run **jaffle_shop** example and an **evaluation framework** for A/B testing memory features.
## The problem
LLMs generating SQL against a data warehouse hit the same mistakes over and over:
- **Naming traps** — `raw_orders.customer` vs `stg_orders.customer_id`, amounts in cents vs dollars
- **Stale tables** — denormalized snapshots that look useful but have incomplete data
- **Missing business context** — column descriptions, upstream lineage, and pre-computed metrics that the raw schema doesn't reveal
- **Reinventing joins** — ignoring common access patterns that dbt already optimized
Each mistake costs a round-trip: the LLM writes bad SQL, gets an error, tries again. With complex schemas (100+ models), these round-trips add up fast.
## How it works
The memory layer sits between the LLM and the database, providing three types of context that raw schema metadata can't:
### 1. Corrections
A version-controlled JSON file of schema "gotchas" — things an LLM can't infer from `DESCRIBE` alone:
```
"raw_orders amounts (subtotal, tax_paid, order_total) are stored in CENTS.
Use stg_orders which converts to dollars."
```
```
"AVOID the daily_revenue table — it is a stale snapshot covering only
3 of 6 locations. Use the orders table instead."
```
Corrections are matched to incoming questions by table name, column name, and keyword overlap. The LLM calls `get_corrections` before writing SQL and gets the top 3 relevant tips.
New corrections can be saved during a session (`save_correction`) when the LLM discovers something non-obvious — building institutional knowledge over time.
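The matching step described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Correction` record shape, the scoring weights, and the helper names `tokenize`, `score`, and `top_corrections` are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Correction:
    # Hypothetical record shape; the real corrections.json schema may differ.
    text: str
    tables: list[str] = field(default_factory=list)
    columns: list[str] = field(default_factory=list)


def tokenize(text: str) -> set[str]:
    """Lowercase identifier-like tokens, so `stg_orders` stays one token."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def score(question: str, correction: Correction) -> int:
    words = tokenize(question)
    table_hits = sum(t.lower() in words for t in correction.tables)
    column_hits = sum(c.lower() in words for c in correction.columns)
    keyword_hits = len(words & tokenize(correction.text))
    # Assumed weights: an exact table or column mention outranks keyword overlap.
    return 3 * table_hits + 2 * column_hits + keyword_hits


def top_corrections(question: str, corrections: list[Correction], k: int = 3) -> list[Correction]:
    """Return up to k corrections with a non-zero score, best match first."""
    scored = [(score(question, c), c) for c in corrections]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for s, c in scored if s > 0][:k]


corrections = [
    Correction(
        "raw_orders amounts (subtotal, tax_paid, order_total) are stored in CENTS. "
        "Use stg_orders which converts to dollars.",
        tables=["raw_orders", "stg_orders"],
        columns=["subtotal", "tax_paid", "order_total"],
    ),
    Correction(
        "AVOID the daily_revenue table. It is a stale snapshot covering only "
        "3 of 6 locations. Use the orders table instead.",
        tables=["daily_revenue", "orders"],
    ),
]

hits = top_corrections("What is the average order_total in raw_orders?", corrections)
print(hits[0].text)
```

Plain keyword overlap also counts common words such as "the" and "is", so a real implementation would likely filter stop words or require a minimum score before returning a tip.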
### 2. dbt model context
Parses the dbt `manifest.json` to serve column descriptions, upstream lineage, tests, and (truncated) SQL — the same curated metadata your analytics engineers wrote, delivered as small token-efficient slices rather than dumping the entire catalog.
Tools: `list_dbt_models`, `get_dbt_context`, `get_model_sql`
### 3. Query popularity tracking
Records which tables, columns, and join patterns are actually used in queries. Over time this builds a picture of common access patterns:
```
orders: queried 47 times
→ commonly joined to customers ON customer_id (INNER, 32x)
Popular columns: order_total (select), ordered_at (where)
```
This steers the LLM toward proven patterns instead of inventing joins from scratch.
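The counting idea can be sketched with the standard library alone. Note the swap: the project parses SQL with `sqlglot`, whereas this example uses a naive regular expression as a stand-in so that it runs without dependencies. The counters and the `record_query` helper are hypothetical names, and the in-memory storage replaces whatever the popularity database actually holds.

```python
import re
from collections import Counter

table_counts: Counter[str] = Counter()
join_counts: Counter[tuple[str, str]] = Counter()

# Naive pattern: a table name directly after FROM or JOIN.
# It does not handle subqueries, CTEs, quoting, or EXTRACT(... FROM ...).
TABLE_REF = re.compile(r"\b(from|join)\s+([a-z_][a-z0-9_.]*)", re.IGNORECASE)


def record_query(sql: str) -> None:
    """Update in-memory usage counters for one executed query."""
    refs = [(kw.lower(), name.lower()) for kw, name in TABLE_REF.findall(sql)]
    base = next((name for kw, name in refs if kw == "from"), None)
    for kw, name in refs:
        table_counts[name] += 1
        if kw == "join" and base is not None:
            join_counts[(base, name)] += 1


record_query(
    "SELECT o.order_total FROM orders o "
    "JOIN customers c ON o.customer_id = c.customer_id"
)
record_query("SELECT count(*) FROM orders WHERE ordered_at >= '2024-01-01'")

print(table_counts.most_common())
print(join_counts.most_common())
```

A parser-based version would also capture the join type and the columns in the `ON` clause, which a regular expression cannot do reliably.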
## Quick start
The repo includes a complete jaffle_shop example — a DuckDB database with dbt models, pre-seeded corrections, and popularity data.
```bash
uv sync
uv run mcp-memory
```
### Claude Desktop configuration
```json
{
"mcpServers": {
"memory-layer": {
"command": "uv",
"args": ["run", "--directory", "/path/to/mcp-memory-layer", "mcp-memory"]
}
}
}
```
### Environment variables
| Variable | Default | Description |
|---|---|---|
| `MCP_MEMORY_DATA_DIR` | `data/` | Base directory for data files |
| `MCP_MEMORY_DUCKDB_PATH` | `data/jaffle_shop/jaffle_shop.duckdb` | DuckDB database path |
| `MCP_MEMORY_MANIFEST_PATH` | `dbt_project/target/manifest.json` | dbt manifest.json path |
| `MCP_MEMORY_CORRECTIONS_PATH` | `data/corrections.json` | Corrections JSON path |
| `MCP_MEMORY_POPULARITY_DB` | `data/popularity.duckdb` | Popularity tracking database |
| `MCP_MEMORY_CORRECTIONS` | `true` | Enable/disable corrections |
| `MCP_MEMORY_DBT` | `true` | Enable/disable dbt context |
| `MCP_MEMORY_POPULARITY` | `true` | Enable/disable popularity tracking |
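As a rough sketch of how the three feature toggles could be read, the snippet below parses boolean environment variables with a default of `true`. The `env_flag` helper and the set of accepted truthy spellings are assumptions; the server may parse these values differently.

```python
import os


def env_flag(name: str, default: bool = True) -> bool:
    """Read a boolean feature flag from the environment."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Assumed truthy spellings; the server may accept a different set.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# Simulate a user who disables dbt context and leaves the other flags unset.
os.environ["MCP_MEMORY_DBT"] = "false"
os.environ.pop("MCP_MEMORY_CORRECTIONS", None)
os.environ.pop("MCP_MEMORY_POPULARITY", None)

features = {
    "corrections": env_flag("MCP_MEMORY_CORRECTIONS"),
    "dbt": env_flag("MCP_MEMORY_DBT"),
    "popularity": env_flag("MCP_MEMORY_POPULARITY"),
}
print(features)
```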
## Evaluation framework
The `eval/` directory contains an A/B testing harness that measures how each memory feature affects SQL quality:
```bash
# Compare baseline (no memory) vs all features
uv run python -m eval.harness --config baseline --api openai
uv run python -m eval.harness --config all_features --api openai
# Generate comparison report
uv run python -m eval.report eval/results/baseline.json eval/results/all_features.json
```
Configurations: `baseline`, `corrections`, `dbt`, `popularity`, `all_features`
Questions include "dead-end traps" — stale tables that look correct but produce wrong results. These specifically test whether corrections can prevent the LLM from falling into schema traps.
## Bring your own project
The memory layer works with any DuckDB/dbt project — not just the bundled jaffle_shop demo.
### Quick setup
```bash
# Initialize config directory (~/.mcp-memory/)
mcp-memory-cli init \
--duckdb-path /path/to/your/database.duckdb \
--manifest-path /path/to/your/dbt/target/manifest.json
```
This creates:
- `~/.mcp-memory/config.toml` — paths and feature flags
- `~/.mcp-memory/corrections.json` — empty corrections store (grows as you use it)
Then run the server:
```bash
mcp-memory # reads config from ~/.mcp-memory/config.toml
```
### Connecting your dbt manifest
The `manifest` setting accepts either a local file path or a URL. The server resolves the manifest on startup:
```toml
[paths]
# Option 1: local file. Point to your dbt project's target directory.
manifest = "/path/to/your/dbt/target/manifest.json"
# Option 2: URL (S3 presigned URL, GCS signed URL, or any HTTP endpoint).
# TOML allows a key only once per table, so use this line instead of the one above.
# manifest = "https://my-bucket.s3.amazonaws.com/dbt/manifest.json"
```
**Local file** is the simplest: run `dbt compile` (or `dbt build`) and point to `target/manifest.json`.
**URL** is useful for teams: upload the manifest to a shared bucket as part of your dbt CI/CD pipeline (e.g. `dbt build && aws s3 cp target/manifest.json s3://...`). The server caches fetched manifests locally and re-fetches when the cache is older than 1 hour.
### Config file
Instead of env vars, you can configure everything in `~/.mcp-memory/config.toml`:
```toml
[paths]
duckdb = "/path/to/your/database.duckdb"
manifest = "/path/to/your/dbt/target/manifest.json"
corrections = "~/.mcp-memory/corrections.json"
# popularity_db = "~/.mcp-memory/popularity.duckdb"
[features]
query = true
corrections = true
dbt = true
popularity = true
```
Precedence: **env vars > config.toml > defaults**. You can mix both — use the config file for stable paths and env vars for overrides.
### What you need
| Component | Required? | Notes |
|-----------|-----------|-------|
| DuckDB database | Yes | Local `.duckdb` file |
| dbt manifest | Recommended | Local file or URL. Without it, `list_dbt_models` and `get_dbt_context` are disabled. |
| corrections.json | No | Starts empty, grows via `save_correction` tool calls |
| popularity seed | No | Popularity tracking auto-populates from real queries |
### Without dbt
If you don't use dbt, set `dbt = false` in config (or `MCP_MEMORY_DBT=false`). The server runs with just the query tool + corrections + popularity tracking. You can still save corrections about your schema and benefit from popularity-based join suggestions.
## Why a semantic/memory layer for MCP?
MCP gives LLMs access to tools. But tools alone aren't enough — an LLM with `execute_query` and `list_tables` will still write bad SQL against an unfamiliar schema because:
1. **Schema metadata is necessary but not sufficient.** Column names and types tell you *what exists*, not *how to use it correctly*. That `amounts are in cents` insight? It's not in the schema. It's in someone's head, a Slack thread, or a dbt description that the LLM never sees.
2. **LLMs don't carry lessons forward.** A naming trap hit in turn 1 can fall out of context by turn 10, and it is gone entirely by the next conversation. Corrections make that learning persistent and shareable.
3. **dbt already solved the documentation problem.** Analytics teams invest heavily in documenting models — column descriptions, tests, lineage. A semantic layer surfaces that work directly to the LLM, rather than having it guess from raw `information_schema`.
4. **Usage patterns encode tribal knowledge.** Which tables do people actually query? What joins work? Popularity tracking captures the implicit knowledge that experienced analysts have but schemas don't express.
The memory layer turns one-shot SQL generation into an iterative, self-improving system. Each session can contribute corrections. Each query refines popularity stats. The more you use it, the better it gets.
## Development
```bash
uv sync # install dependencies
uv run pytest # run tests
uv run ruff check . # lint
```
Requires Python >= 3.11. Uses `sqlglot` for SQL parsing (DuckDB dialect) and `FastMCP` for the MCP server.