https://github.com/cablate/memory-lancedb-mcp
MCP server for LanceDB-backed long-term memory with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, and memory lifecycle management
https://github.com/cablate/memory-lancedb-mcp
ai-memory hybrid-retrieval lancedb long-term-memory mcp-server model-context-protocol semantic-search typescript vector-database
Last synced: 2 months ago
JSON representation
MCP server for LanceDB-backed long-term memory with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, and memory lifecycle management
- Host: GitHub
- URL: https://github.com/cablate/memory-lancedb-mcp
- Owner: cablate
- License: mit
- Created: 2026-03-18T05:34:37.000Z (3 months ago)
- Default Branch: master
- Last Pushed: 2026-03-27T02:50:24.000Z (3 months ago)
- Last Synced: 2026-04-02T04:15:42.808Z (3 months ago)
- Topics: ai-memory, hybrid-retrieval, lancedb, long-term-memory, mcp-server, model-context-protocol, semantic-search, typescript, vector-database
- Language: TypeScript
- Size: 1.09 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 4
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG-v1.1.0.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
README

**Persistent, intelligent long-term memory for any MCP-compatible AI agent.**
[](https://www.npmjs.com/package/@cablate/memory-lancedb-mcp)
[](https://www.npmjs.com/package/@cablate/memory-lancedb-mcp)
[](https://lancedb.com)
[](LICENSE)
**English** | [繁體中文](README_ZH.md)
---
## Before / After
Without memory, every session starts from zero. With memory-lancedb-mcp, your agent accumulates knowledge across sessions — automatically.
**Before** — agent has no context:
```
User: "Use the same animation style as last time"
Agent: "I don't have any context about previous animations. Could you describe what you'd like?"
```
**After** — agent recalls past decisions:
```xml
1. Remotion spring animation: use duration >= 20, damping 12-15 for smooth easing
2. Video export preset: 1080p, 30fps for social, 60fps for demo
#1=6352a7d2 #2=bed148f0
```
Store responses are minimal — no noise, just confirmation:
```
Stored. [topic: remotion]
```
---
## Quick Start
### 1. Install
```bash
npm install -g @cablate/memory-lancedb-mcp
```
### 2. Configure
Add to your MCP client settings (e.g. Claude Desktop `claude_desktop_config.json`):
```json
{
"mcpServers": {
"memory": {
"command": "npx",
"args": ["-y", "@cablate/memory-lancedb-mcp"],
"env": {
"EMBEDDING_API_KEY": "your-api-key",
"EMBEDDING_MODEL": "text-embedding-3-small"
}
}
}
}
```
Advanced: use a config file for full control
```json
{
"mcpServers": {
"memory": {
"command": "npx",
"args": ["-y", "@cablate/memory-lancedb-mcp"],
"env": {
"MEMORY_LANCEDB_CONFIG": "/path/to/config.json"
}
}
}
}
```
See [config.example.json](config.example.json) for all options.
---
## How It Works
```
store recall
│ │
┌────────▼────────┐ ┌────────▼────────┐
│ Filter junk │ │ Search by meaning │
│ Save + embed │ │ AND keywords │
│ Link related │ │ Re-rank results │
│ Flag conflicts │ │ Fade stale ones │
│ Tag topic │ │ Pull in related │
└────────┬────────┘ │ Merge duplicates │
│ └────────┬────────┘
▼ ▼
┌─────────────────────────────────────────────┐
│ LanceDB (local, zero-config) │
└─────────────────────────────────────────────┘
```
Every `memory_store` saves to a local database, automatically links related memories, flags contradictions, and assigns topic labels — no extra API calls needed. Every `memory_recall` searches by both meaning and keywords, pulls in related memories the main search might miss, and includes maintenance hints so the agent can keep its own knowledge base clean.
---
## Features
### Retrieval
- **Finds the right memory even when you use different words** — searches by meaning and exact keywords simultaneously, then combines the best of both
- **More precise results, not just surface matches** — an optional second pass re-ranks results by actual relevance (6 providers supported)
- **Search multiple topics at once** — pass a `queries` array to search several keywords in one call; results are deduplicated and memories that match multiple queries rank higher
- **Finding A automatically surfaces related B** — when a memory is found, its linked neighbors are pulled in too, even if they use completely different words
- **Minimal token overhead** — responses use compact XML tags (``, ``, ``) with short IDs, no category/scope noise
### Storage
- **Related memories link themselves** — when you store something new, it automatically creates bidirectional links to similar existing memories
- **Conflicts get flagged** — if a new memory contradicts an existing one, you get a warning so nothing silently overwrites
- **Topics assigned automatically** — each memory gets a topic label inferred from its content and neighbors; you can also set it explicitly
- **Junk gets filtered out** — greetings, refusals, and meta-questions are rejected before they waste storage
### Lifecycle
- **Frequently used memories stay sharp, stale ones fade** — a decay model balances how recent, how often accessed, and how important each memory is
- **Memories earn their keep** — three tiers (Peripheral → Working → Core); the more a memory gets used, the faster it promotes
- **Full version history** — when you update a memory, the old version is preserved in a chain you can trace with `memory_history`
### Maintenance
- **The agent maintains itself** — recall results include inline hints about duplicates, dormant memories, and contradictions
- **Health checks on demand** — `memory_lint` finds orphaned memories, stale entries, and missing links, then fixes what it can
- **Merge duplicates** — `memory_merge` combines two redundant memories into one; originals are marked as superseded
- **See your memory space** — `memory_visualize` generates an interactive HTML graph you can open in any browser
---
## Visualization
Run `memory_visualize` to generate an interactive knowledge graph of your memory space:
- Automatic clustering — related memories group together visually
- Similarity edges, duplicate detection, importance sizing
- Time filter, growth animation, cluster view
- Self-contained HTML — open in any browser
---
Scoring Pipeline (technical details)
```
Query → embedQuery() ─┐
├─→ RRF Fusion → Rerank → Lifecycle Decay → Length Norm → Filter
Query → BM25 FTS ─────┘
```
| Stage | Effect |
|-------|--------|
| **RRF Fusion** | Combines semantic and exact-match recall |
| **Cross-Encoder Rerank** | Promotes semantically precise hits |
| **Lifecycle Decay** | Weibull freshness + access frequency + importance |
| **Length Normalization** | Prevents long entries from dominating (anchor: 500 chars) |
| **Hard Min Score** | Removes irrelevant results (default: 0.35) |
| **MMR Diversity** | Cosine similarity > 0.85 → demoted |
---
## Configuration
### Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `EMBEDDING_API_KEY` | Yes | API key for embedding provider |
| `EMBEDDING_MODEL` | No | Model name (default: `text-embedding-3-small`) |
| `EMBEDDING_BASE_URL` | No | Custom base URL for non-OpenAI providers |
| `MEMORY_DB_PATH` | No | LanceDB storage directory |
| `MEMORY_LANCEDB_CONFIG` | No | Path to JSON config file |
Full configuration example
```json
{
"embedding": {
"apiKey": "${EMBEDDING_API_KEY}",
"model": "jina-embeddings-v5-text-small",
"baseURL": "https://api.jina.ai/v1",
"dimensions": 1024,
"taskQuery": "retrieval.query",
"taskPassage": "retrieval.passage",
"normalized": true
},
"dbPath": "./memory-data",
"retrieval": {
"mode": "hybrid",
"vectorWeight": 0.7,
"bm25Weight": 0.3,
"minScore": 0.3,
"rerank": "cross-encoder",
"rerankApiKey": "${JINA_API_KEY}",
"rerankModel": "jina-reranker-v3",
"rerankEndpoint": "https://api.jina.ai/v1/rerank",
"rerankProvider": "jina",
"candidatePoolSize": 20,
"hardMinScore": 0.35,
"filterNoise": true
},
"enableManagementTools": true,
"enableSelfImprovementTools": false,
"enableVisualizationTools": true,
"scopes": {
"default": "global",
"definitions": {
"global": { "description": "Shared knowledge" },
"agent:my-bot": { "description": "Private to my-bot" }
},
"agentAccess": {
"my-bot": ["global", "agent:my-bot"]
}
},
"decay": {
"recencyHalfLifeDays": 30,
"frequencyWeight": 0.3,
"intrinsicWeight": 0.3
}
}
```
Embedding providers
Works with **any OpenAI-compatible embedding API**:
| Provider | Model | Base URL | Dimensions |
|----------|-------|----------|------------|
| **OpenAI** | `text-embedding-3-small` | `https://api.openai.com/v1` | 1536 |
| **Jina** | `jina-embeddings-v5-text-small` | `https://api.jina.ai/v1` | 1024 |
| **DeepInfra** | `Qwen/Qwen3-Embedding-8B` | `https://api.deepinfra.com/v1/openai` | 1024 |
| **Google Gemini** | `gemini-embedding-001` | `https://generativelanguage.googleapis.com/v1beta/openai/` | 3072 |
| **Ollama** (local) | `nomic-embed-text` | `http://localhost:11434/v1` | _varies_ |
Rerank providers
| Provider | `rerankProvider` | Endpoint | Example Model |
|----------|-----------------|----------|---------------|
| **Jina** | `jina` | `https://api.jina.ai/v1/rerank` | `jina-reranker-v3` |
| **Hugging Face TEI** | `tei` | `http://host:8081/rerank` | `BAAI/bge-reranker-v2-m3` |
| **SiliconFlow** | `siliconflow` | `https://api.siliconflow.com/v1/rerank` | `BAAI/bge-reranker-v2-m3` |
| **Voyage AI** | `voyage` | `https://api.voyageai.com/v1/rerank` | `rerank-2.5` |
| **Pinecone** | `pinecone` | `https://api.pinecone.io/rerank` | `bge-reranker-v2-m3` |
| **DashScope** | `dashscope` | `https://dashscope.aliyuncs.com/api/v1/services/rerank` | `gte-rerank` |
---
Tools Reference
### Core Tools
| Tool | Description |
|------|-------------|
| `memory_recall` | Search memories — supports batch queries, relation expansion, topic filtering, and inline maintenance hints |
| `memory_store` | Save a memory — auto-links related ones, flags contradictions, infers topic, filters junk |
| `memory_forget` | Delete by ID or search query |
| `memory_update` | Update a memory; the old version is preserved in a version chain |
| `memory_merge` | Merge two memories into one |
| `memory_history` | Trace version history through update/merge chains |
### Management Tools (opt-in)
| Tool | Description |
|------|-------------|
| `memory_stats` | Usage statistics by scope and category |
| `memory_list` | List recent memories with filtering |
| `memory_lint` | Health checks + auto-fix missing relations |
Enable: `"enableManagementTools": true`
### Self-Improvement Tools (opt-in)
| Tool | Description |
|------|-------------|
| `self_improvement_log` | Log structured learning/error entries |
| `self_improvement_extract_skill` | Create skill scaffolds from learnings |
| `self_improvement_review` | Summarize governance backlog |
Enable: `"enableSelfImprovementTools": true`
### Visualization Tools (on by default)
| Tool | Description |
|------|-------------|
| `memory_visualize` | Generate interactive HTML memory graph |
Params: `output_path`, `scope`, `threshold` (default: 0.65), `max_neighbors` (default: 4)
Disable: `"enableVisualizationTools": false`
---
Database Schema
LanceDB table `memories`:
| Field | Type | Description |
|-------|------|-------------|
| `id` | string (UUID) | Primary key |
| `text` | string | Memory text (FTS indexed) |
| `vector` | float[] | Embedding vector |
| `category` | string | `preference` / `fact` / `decision` / `entity` / `skill` / `lesson` / `other` |
| `scope` | string | Scope identifier |
| `importance` | float | Importance score 0-1 |
| `timestamp` | int64 | Creation timestamp (ms) |
| `metadata` | string (JSON) | Extended metadata (tier, access_count, relations, topic, etc.) |
---
## Development
```bash
git clone https://github.com/cablate/memory-lancedb-mcp.git
cd memory-lancedb-mcp
npm install
npm test
```
Run locally:
```bash
EMBEDDING_API_KEY=your-key npx tsx server.ts
```
---
## Credits
Built on [CortexReach/memory-lancedb-pro](https://github.com/CortexReach/memory-lancedb-pro) — original work by [win4r](https://github.com/win4r) and contributors.
## License
MIT — see [LICENSE](LICENSE) for details.