https://github.com/never00miss/allan-mcp-memory-code

🧠 Knowledge Graph Memory for AI Coding Agents - Full offline mode with Docker. Integrates with Claude, Cline, Cursor, Windsurf, and more. Auto-extracts entities & relationships. No API keys required.
Host: GitHub
Owner: never00miss
Created: 2026-05-22T10:14:03.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-05-22T21:14:34.000Z (2 months ago)
Last Synced: 2026-05-22T21:32:00.776Z (2 months ago)
Topics: ai-agents, claude, cline, coding, cursor, docker, graphiti, knowledge-graph, llm, mcp, memory, offline-first, plan, token, token-optimization
Language: JavaScript
Homepage:
Size: 186 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0

# Allan MCP Memory Code

🧠 Persistent knowledge graph memory for AI coding assistants (Claude, Cline, Cursor, Windsurf). Runs 100% offline with Docker. Auto-extracts entities & relationships from conversations.

## Features

- **Full Offline Mode** - No API keys required, runs entirely on local hardware

- **Cloud Mode** - Use OpenRouter/OpenAI for low-resource machines (~$0.50/month)

- Auto-extract entities + relationships from text

- Hybrid search (text + vector) for nodes and facts

- All-in-one Docker setup (FalkorDB + Ollama + LLM + Embedding)

- Integrates with Claude, Cline, Kilo Code, Cursor, Windsurf, and more

---

## Quick Start (Offline)

**No API keys required!** Everything runs locally.

```bash

# Clone the repository

git clone https://github.com/never00miss/allan-mcp-memory-code.git

cd allan-mcp-memory-code

# Start all services (FalkorDB + Ollama + Models)

docker compose up -d

# First run downloads ~5GB of models (5-15 min depending on internet)

# Wait until models are ready:

docker compose logs ollama-init -f

# When you see "All models ready!", the service is available at:

# http://localhost:19089

```

**That's it!** Now integrate with your AI coding assistant below.

---

## Quick Start (Cloud - OpenRouter)

**Lightweight setup!** Only FalkorDB runs locally; the LLM and embedding calls go to the cloud.

```bash

# Clone the repository

git clone https://github.com/never00miss/allan-mcp-memory-code.git

cd allan-mcp-memory-code

# Start only FalkorDB (graph database)

docker compose up falkordb -d

# Configure cloud API

cp .env.example .env

```

Edit `.env` for OpenRouter:

```env

# LLM (OpenRouter)

LLM_API_URL=https://openrouter.ai/api/v1

LLM_API_KEY=sk-or-v1-your-key-here

LLM_MODEL=qwen/qwen-2.5-7b-instruct

# Embedding (OpenRouter or local Ollama)

EMBEDDER_API_URL=https://openrouter.ai/api/v1

EMBEDDER_API_KEY=sk-or-v1-your-key-here

EMBEDDER_MODEL=openai/text-embedding-3-small

# FalkorDB

FALKORDB_URI=redis://localhost:6380

```

```bash

# Install, build, and run (npm start runs the transpiled dist/)

npm install

npm run build

npm start

# Health check

curl http://localhost:19089/v1/health

```

### OpenRouter Cost Estimate (1 Hour Coding Session)

| Activity | Requests | Tokens/Req | Total Tokens | Cost |
|----------|----------|------------|--------------|------|
| Entity extraction | ~20 | ~500 | ~10,000 | ~$0.002 |
| Embedding generation | ~50 | ~100 | ~5,000 | ~$0.0001 |
| Search queries | ~30 | ~200 | ~6,000 | ~$0.001 |
| **Total per hour** | **~100** | - | **~21,000** | **~$0.003** |

**Monthly estimate (8hr/day, 20 days):** ~$0.50

> 💡 **Tip:** Use `qwen/qwen-2.5-7b-instruct` ($0.00018/1K tokens) - extremely cheap and reliable for entity extraction.
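
The monthly figure follows directly from the hourly rows in the table. A small JavaScript sketch of that arithmetic, using only the approximate numbers quoted above (the variable names are illustrative, not part of the project):

```javascript
// Hourly figures copied from the cost table above (all values are approximate).
const activities = [
  { name: "Entity extraction", requests: 20, tokensPerRequest: 500, costUsd: 0.002 },
  { name: "Embedding generation", requests: 50, tokensPerRequest: 100, costUsd: 0.0001 },
  { name: "Search queries", requests: 30, tokensPerRequest: 200, costUsd: 0.001 },
];

const hourlyTokens = activities.reduce((sum, a) => sum + a.requests * a.tokensPerRequest, 0);
const hourlyCostUsd = activities.reduce((sum, a) => sum + a.costUsd, 0);

// 8 hours/day for 20 working days, as in the monthly estimate.
const hoursPerMonth = 8 * 20;
const monthlyCostUsd = hourlyCostUsd * hoursPerMonth;

console.log(hourlyTokens);              // 21000
console.log(hourlyCostUsd.toFixed(4));  // "0.0031"
console.log(monthlyCostUsd.toFixed(2)); // "0.50"
```

Actual spend depends on the model prices in effect and on how often the observe hooks fire, so treat the result as an order-of-magnitude estimate.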

### Recommended Cloud Models

| Provider | Model | Cost/1K tokens | Notes |
|----------|-------|----------------|-------|
| OpenRouter | `qwen/qwen-2.5-7b-instruct` | $0.00018 | **Best value** |
| OpenRouter | `google/gemma-3-4b-it` | $0.00010 | Cheapest |
| OpenRouter | `openai/gpt-4o-mini` | $0.00015 | High quality |
| OpenAI | `gpt-4o-mini` | $0.00015 | Direct API |

---

## Integration

### Claude Code (MCP Server) ⭐ Recommended

Use the MCP tools directly in Claude Code. The server shows up under the `/mcp` command.

> **Important:** Always use `claude mcp add` to configure the MCP server. Do **NOT** manually edit `~/.claude/settings.json` or `~/.claude/mcp.json` for MCP config. Claude Code reads MCP settings from `~/.claude.json` (managed by the CLI), not from those files. Editing the wrong file will cause "Connection closed" errors.

---

#### Step 1: Prerequisites

```bash

# Start FalkorDB

docker compose up falkordb -d

# Install & build allan-memory

cd allan-mcp-memory-code

npm install

npm run build

npm link

```

#### Step 2: Setup Claude Code Config

Copy the ready-made config files from `extensions/`:

```bash

# Copy hooks + CLAUDE.md into ~/.claude (the trailing /. merges into an existing directory)

mkdir -p ~/.claude

cp -r extensions/.claude/. ~/.claude/

chmod +x ~/.claude/hooks/observe-read.sh ~/.claude/hooks/observe-edit.sh

```

Then edit the hook scripts — replace the placeholder env vars with your actual values:

| Cloud (OpenRouter) | Local (Ollama) |
|---|---|
| `LLM_API_URL=https://openrouter.ai/api/v1` | `LLM_API_URL=http://localhost:11435/v1` |
| `LLM_API_KEY=sk-or-v1-your-key-here` | `LLM_API_KEY=ollama` |
| `LLM_MODEL=qwen/qwen-2.5-7b-instruct` | `LLM_MODEL=qwen2.5:7b-instruct` |
| `EMBEDDER_API_URL=https://openrouter.ai/api/v1` | `EMBEDDER_API_URL=http://localhost:11435/v1` |
| `EMBEDDER_API_KEY=sk-or-v1-your-key-here` | `EMBEDDER_API_KEY=ollama` |
| `EMBEDDER_MODEL=openai/text-embedding-3-small` | `EMBEDDER_MODEL=nomic-embed-text` |

Add the `hooks` section to `~/.claude/settings.json`:

```json

{

  "hooks": {

    "UserPromptSubmit": [

      {

        "description": "Inject project-specific context",

        "hooks": [

          {

            "type": "command",

            "command": "echo 'see allan-memory too and always do remember save'"

          }

        ]

      }

    ],

    "PostToolUse": [

      {

        "matcher": "Read",

        "hooks": [

          {

            "type": "command",

            "command": "/Users/YOUR_USER/.claude/hooks/observe-read.sh",

            "timeout": 30

          }

        ]

      },

      {

        "matcher": "Edit|Write|MultiEdit",

        "hooks": [

          {

            "type": "command",

            "command": "/Users/YOUR_USER/.claude/hooks/observe-edit.sh",

            "timeout": 60

          }

        ]

      }

    ]

  }

}

```

> Replace `/Users/YOUR_USER` with your actual home directory path.

The `UserPromptSubmit` hook prepends a short reminder to every prompt so the assistant keeps using `recall` / `remember` instead of falling back to raw file search. It runs locally — no network calls, no secrets.

#### Step 3: Add MCP Server

**Local (Ollama):**

```bash

claude mcp add allan-memory \
  -e FALKORDB_URI=redis://localhost:6380 \
  -e LLM_API_URL=http://localhost:11435/v1 \
  -e LLM_API_KEY=ollama \
  -e LLM_MODEL=qwen2.5:7b-instruct \
  -e EMBEDDER_API_URL=http://localhost:11435/v1 \
  -e EMBEDDER_API_KEY=ollama \
  -e EMBEDDER_MODEL=nomic-embed-text \
  -- node /full/path/to/allan-mcp-memory-code/dist/mcp-server.js

```

**Cloud (OpenRouter):**

```bash

claude mcp add allan-memory \
  -e FALKORDB_URI=redis://localhost:6380 \
  -e FALKORDB_GRAPH_NAME=allan_memory \
  -e LLM_API_URL=https://openrouter.ai/api/v1 \
  -e LLM_API_KEY=sk-or-v1-your-key-here \
  -e LLM_MODEL=qwen/qwen-2.5-7b-instruct \
  -e EMBEDDER_API_URL=https://openrouter.ai/api/v1 \
  -e EMBEDDER_API_KEY=sk-or-v1-your-key-here \
  -e EMBEDDER_MODEL=openai/text-embedding-3-small \
  -- node /full/path/to/allan-mcp-memory-code/dist/mcp-server.js

```

**Hybrid (Local LLM + Cloud Embedding):**

```bash

claude mcp add allan-memory \
  -e FALKORDB_URI=redis://localhost:6380 \
  -e LLM_API_URL=http://localhost:11435/v1 \
  -e LLM_API_KEY=ollama \
  -e LLM_MODEL=qwen2.5:7b-instruct \
  -e EMBEDDER_API_URL=https://openrouter.ai/api/v1 \
  -e EMBEDDER_API_KEY=sk-or-v1-your-key-here \
  -e EMBEDDER_MODEL=openai/text-embedding-3-small \
  -- node /full/path/to/allan-mcp-memory-code/dist/mcp-server.js

```

> Replace `/full/path/to/allan-mcp-memory-code` with your actual path. The path must point to `dist/mcp-server.js`, **not** `lib/mcp-server.js`.

**Remove MCP:**

```bash

claude mcp remove allan-memory

```

---

#### Step 4: Verify

1. **Restart VS Code** completely (Cmd+Q, not just reload window)

2. Type `/mcp` in Claude Code chat

3. You should see `allan-memory` with 6 tools

**Troubleshooting:**

| Issue | Solution |
|-------|----------|
| "Connection closed" error | The MCP config was likely added by hand to `~/.claude/settings.json` or `~/.claude/mcp.json`. Run `claude mcp remove allan-memory`, then re-add it with `claude mcp add` |
| "SyntaxError: Cannot use import" | Path points to `lib/` instead of `dist/`. Use `dist/mcp-server.js` |
| MCP shows but tools don't work | FalkorDB is not running. Start it: `docker compose up falkordb -d` |
| MCP not showing in `/mcp` | Run `claude mcp list` to verify it's registered |

#### Available MCP Tools

| Tool | Description |
|------|-------------|
| `register_project` | Register project root for path resolution and freshness |
| `remember` | Store a memory with structured fields (type, scope, content) |
| `recall` | Search with inline freshness (check `freshness.stale` on results!) |
| `relate` | Find relationships between entities |
| `list` | Enumerate stored entities by type |
| `refresh` | Re-extract entities from a file (use when recall returns stale) |

**Legacy Tools (backward compatibility):**

| Tool | Description |
|------|-------------|
| `add_memory` | Store knowledge (name, content, group_id) |
| `search_nodes` | Search entities by query |
| `search_facts` | Search relationships by query |
| `regenerate_file` | Auto-extract entities from a source file |

---

### Claude Code (HTTP/curl Alternative)

If the MCP server doesn't work, use the HTTP API with `curl` commands instead.

#### Step 1: Allow curl Commands

Add to `~/.claude/settings.json`:

```json

{

  "permissions": {

    "allow": [

      "Bash(curl*)",

      "Bash(curl -s*)",

      "Bash(curl -X POST http://localhost:19089*)",

      "Bash(curl http://localhost:19089*)"

    ],

    "deny": []

  }

}

```

#### Step 2: Add Instructions to CLAUDE.md

Copy the ready-made instructions from `extensions/.claude/CLAUDE.md` to `~/.claude/CLAUDE.md`:

```bash

cp extensions/.claude/CLAUDE.md ~/.claude/CLAUDE.md

```

If you have an existing `~/.claude/CLAUDE.md`, append the content manually.

#### Step 3: Restart Claude Code

After editing settings, **restart Claude Code completely** for changes to take effect.

---

### Prompt Tips for Auto-Triggering

To make Claude use memory tools automatically, use these phrases:

| Phrase in Prompt | Triggers |
|------------------|----------|
| "Check your memory first..." | `search_nodes` before answering |
| "What do you remember about...?" | `search_nodes` |
| "Remember this for later..." | `add_memory` |
| "Search your knowledge of..." | `search_nodes` |
| "What patterns have you seen in...?" | `search_facts` |
| "Store this insight..." | `add_memory` |
| "Before answering, check if..." | `search_nodes` |

#### Example Prompts:

```

"Check your memory first, then explain the authentication flow"

"What do you remember about the database schema?"

"Remember this: the API uses JWT tokens with 24h expiry"

"Search your knowledge of error handling patterns in this codebase"

```

#### Troubleshooting

| Issue | Solution |
|-------|----------|
| MCP not showing in `/mcp` | Verify path is absolute, restart VS Code |
| Popup for `curl` commands | Add `"Bash(curl*)"` to `permissions.allow` |
| Server errors | Ensure Docker is running: `docker compose ps` |

---

### OpenClaw

Allan Memory integrates with [OpenClaw](https://github.com/openclaw/openclaw) as a skill.

#### Option 1: Install Skill from GitHub

```bash

openclaw skills install git:never00miss/allan-mcp-memory-code@main --as allan-memory

```

#### Option 2: Copy Skill Locally

Copy the `openclaw-skill/SKILL.md` file to your OpenClaw skills directory:

```bash

mkdir -p ~/.openclaw/skills/allan-memory

cp openclaw-skill/SKILL.md ~/.openclaw/skills/allan-memory/SKILL.md

```

#### Configure MCP Server

Add to `~/.openclaw/openclaw.json`:

```json
{
  "skills": {
    "entries": {
      "allan-memory": {
        "enabled": true,
        "env": {
          "FALKORDB_URI": "redis://localhost:6380"
        }
      }
    }
  },
  "mcpServers": {
    "allan-memory": {
      "command": "node",
      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],
      "env": {
        "FALKORDB_URI": "redis://localhost:6380",
        "LLM_API_URL": "https://openrouter.ai/api/v1",
        "LLM_API_KEY": "your-openrouter-key",
        "LLM_MODEL": "qwen/qwen-2.5-7b-instruct",
        "EMBEDDER_API_URL": "https://openrouter.ai/api/v1",
        "EMBEDDER_API_KEY": "your-openrouter-key",
        "EMBEDDER_MODEL": "openai/text-embedding-3-small"
      }
    }
  }
}
```

#### Available Tools in OpenClaw

Once configured, you can use these tools in OpenClaw:

- `add_memory` - Store knowledge (files, functions, patterns)

- `search_nodes` - Find entities by semantic search

- `search_facts` - Find relationships between entities

- `check_freshness` - Verify stored knowledge isn't stale

- `regenerate_file` - Update knowledge after file edits

- `get_episodes` - List recent memory entries

- `delete_episode` - Remove stale entries

See `openclaw-skill/SKILL.md` for detailed usage instructions.

---

### Cline (VS Code)

#### Step 1: Add MCP Server

Add to Cline MCP settings (`cline_mcp_settings.json`):

**Local (Ollama):**

```json

{

  "mcpServers": {

    "allan-memory": {

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "http://localhost:11435/v1",

        "LLM_API_KEY": "ollama",

        "LLM_MODEL": "qwen2.5:7b-instruct",

        "EMBEDDER_API_URL": "http://localhost:11435/v1",

        "EMBEDDER_API_KEY": "ollama",

        "EMBEDDER_MODEL": "nomic-embed-text"

      }

    }

  }

}

```

**Cloud (OpenRouter):**

```json

{

  "mcpServers": {

    "allan-memory": {

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "https://openrouter.ai/api/v1",

        "LLM_API_KEY": "sk-or-v1-your-key-here",

        "LLM_MODEL": "qwen/qwen-2.5-7b-instruct",

        "EMBEDDER_API_URL": "https://openrouter.ai/api/v1",

        "EMBEDDER_API_KEY": "sk-or-v1-your-key-here",

        "EMBEDDER_MODEL": "openai/text-embedding-3-small"

      }

    }

  }

}

```

#### Step 2: Add Custom Instructions

Go to **Cline Settings → Custom Instructions** and add the content from `extensions/.claude/CLAUDE.md`.

Or print it to the terminal and copy it from there:

```bash

cat extensions/.claude/CLAUDE.md

```

---

### Kilo Code

#### Step 1: Add MCP Server

Add to Kilo Code MCP settings:

**Local (Ollama):**

```json

{

  "servers": {

    "allan-memory": {

      "type": "stdio",

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "http://localhost:11435/v1",

        "LLM_API_KEY": "ollama",

        "LLM_MODEL": "qwen2.5:7b-instruct",

        "EMBEDDER_API_URL": "http://localhost:11435/v1",

        "EMBEDDER_API_KEY": "ollama",

        "EMBEDDER_MODEL": "nomic-embed-text"

      }

    }

  }

}

```

**Cloud (OpenRouter):**

```json

{

  "servers": {

    "allan-memory": {

      "type": "stdio",

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "https://openrouter.ai/api/v1",

        "LLM_API_KEY": "sk-or-v1-your-key-here",

        "LLM_MODEL": "qwen/qwen-2.5-7b-instruct",

        "EMBEDDER_API_URL": "https://openrouter.ai/api/v1",

        "EMBEDDER_API_KEY": "sk-or-v1-your-key-here",

        "EMBEDDER_MODEL": "openai/text-embedding-3-small"

      }

    }

  }

}

```

#### Step 2: Add Custom Instructions

Go to **Kilo Code Settings → Custom Instructions** and add the content from `extensions/.claude/CLAUDE.md`.

Or print it to the terminal and copy it from there:

```bash

cat extensions/.claude/CLAUDE.md

```

---

### Windsurf (Codeium)

#### Step 1: Add MCP Server

Add to Windsurf MCP settings (`~/.codeium/windsurf/mcp_config.json`):

**Local (Ollama):**

```json

{

  "mcpServers": {

    "allan-memory": {

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "http://localhost:11435/v1",

        "LLM_API_KEY": "ollama",

        "LLM_MODEL": "qwen2.5:7b-instruct",

        "EMBEDDER_API_URL": "http://localhost:11435/v1",

        "EMBEDDER_API_KEY": "ollama",

        "EMBEDDER_MODEL": "nomic-embed-text"

      }

    }

  }

}

```

**Cloud (OpenRouter):**

```json

{

  "mcpServers": {

    "allan-memory": {

      "command": "node",

      "args": ["/path/to/allan-mcp-memory-code/dist/mcp-server.js"],

      "env": {

        "FALKORDB_URI": "redis://localhost:6380",

        "LLM_API_URL": "https://openrouter.ai/api/v1",

        "LLM_API_KEY": "sk-or-v1-your-key-here",

        "LLM_MODEL": "qwen/qwen-2.5-7b-instruct",

        "EMBEDDER_API_URL": "https://openrouter.ai/api/v1",

        "EMBEDDER_API_KEY": "sk-or-v1-your-key-here",

        "EMBEDDER_MODEL": "openai/text-embedding-3-small"

      }

    }

  }

}

```

#### Step 2: Add Global AI Rules

Go to **Windsurf Settings → AI Rules → Global AI Rules** and add the content from `extensions/.claude/CLAUDE.md`.

Or print it to the terminal and copy it from there:

```bash

cat extensions/.claude/CLAUDE.md

```

---

### Cursor

Create `.cursorrules` in your project root with the content from `extensions/.claude/CLAUDE.md`.

Or copy it directly:

```bash

cp extensions/.claude/CLAUDE.md .cursorrules

```

---

### Continue.dev

#### Step 1: Add Context Provider

Add to `~/.continue/config.json`:

```json

{

  "contextProviders": [

    {

      "name": "http",

      "params": {

        "url": "http://localhost:19089/v1/memory/search/nodes",

        "method": "POST",

        "headers": { "Content-Type": "application/json" },

        "body": { "query": "{{{ input }}}", "limit": 5 },

        "title": "Allan Memory"

      }

    }

  ]

}

```

#### Step 2: Add System Message

Add to `~/.continue/config.json` under `models[].systemMessage`:

Use the system instructions from [`extensions/.claude/CLAUDE.md`](extensions/.claude/CLAUDE.md), adapting the tool commands to use the HTTP API (`curl`) instead of MCP tools.

---

### GitHub Copilot

Create `.github/copilot-instructions.md` using the instructions from [`extensions/.claude/CLAUDE.md`](extensions/.claude/CLAUDE.md), adapting the tool commands to use the HTTP API (`curl`) instead of MCP tools.

---

### Generic HTTP API

| Action | Method | Endpoint | Body |
|--------|--------|----------|------|
| Store | POST | `/v1/memory` | `{"name":"...","episode_body":"...","group_id":"..."}` |
| Search Entities | POST | `/v1/memory/search/nodes` | `{"query":"...","limit":10}` |
| Search Relations | POST | `/v1/memory/search/facts` | `{"query":"...","limit":10}` |
| List Episodes | GET | `/v1/memory/episodes?group_id=...` | - |
| Health Check | GET | `/v1/health` | - |
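
The same endpoints can be called from any HTTP client. The following is a minimal JavaScript sketch, assuming Node 18+ (for the built-in `fetch`) and the default port 19089. The helper names `buildPost` and `storeAndSearch` and the example values are hypothetical, and the response shape is not documented in this README, so the parsed JSON is returned untouched:

```javascript
// Base URL from the "Exposed Ports" table; adjust if you changed PORT.
const BASE_URL = "http://localhost:19089";

// Build the fetch() arguments for one of the POST endpoints in the table above.
function buildPost(path, body) {
  return {
    url: `${BASE_URL}${path}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Store an episode, then search entities. fetchImpl is injectable for testing.
async function storeAndSearch(fetchImpl = fetch) {
  const store = buildPost("/v1/memory", {
    name: "auth-notes", // hypothetical example values
    episode_body: "The API uses JWT tokens with 24h expiry",
    group_id: "my-project",
  });
  const stored = await fetchImpl(store.url, store.init);
  if (!stored.ok) throw new Error(`store failed: HTTP ${stored.status}`);

  const search = buildPost("/v1/memory/search/nodes", { query: "JWT expiry", limit: 10 });
  const found = await fetchImpl(search.url, search.init);
  if (!found.ok) throw new Error(`search failed: HTTP ${found.status}`);
  return found.json(); // response shape is not specified here; returned as-is
}
```

With the service running, `storeAndSearch().then(console.log)` prints whatever the search endpoint returns.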

---

## Exposed Ports

| Port | Service |
|------|---------|
| 19089 | Allan Memory API |
| 6380 | FalkorDB (Redis protocol) |
| 3001 | FalkorDB Web UI |
| 11435 | Ollama API |

---

## Hardware Requirements

### Minimum

| Component | Requirement |
|-----------|-------------|
| **RAM** | 16GB |
| **Storage** | 15GB free |
| **CPU** | 4+ cores |

### Model Sizes

| Model | Download | RAM Usage |
|-------|----------|-----------|
| nomic-embed-text | ~270MB | ~500MB |
| qwen2.5:7b-instruct | ~4.7GB | ~6GB |
| **Total** | **~5GB** | **~6.5GB** |

### Tested Platforms

| Platform | Performance |
|----------|-------------|
| MacBook Pro M2 16GB | ✅ Smooth (~10 tok/s) |
| MacBook Pro M1 16GB | ✅ Good (~8 tok/s) |
| Linux + RTX 3060 | ✅ Fast (~25 tok/s) |
| Linux + RTX 4090 | ✅ Very fast (~40 tok/s) |

---

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `PORT` | 19089 | Service port |
| `LLM_API_URL` | http://localhost:11435/v1 | LLM endpoint |
| `LLM_MODEL` | qwen2.5:7b-instruct | LLM model for MCP tools (recall, relate, etc.) |
| `EMBEDDER_API_URL` | http://localhost:11435/v1 | Embedding endpoint |
| `EMBEDDER_MODEL` | nomic-embed-text | Embedding model |
| `FALKORDB_URI` | redis://localhost:6380 | FalkorDB connection |
| `OBSERVE_LLM` | true | Enable LLM extraction for observe hooks (set `false` for regex-only) |
| `OBSERVE_LLM_MODEL` | *(same as LLM_MODEL)* | Cheaper model for observe hooks (e.g. `deepseek/deepseek-v4-flash`) |
| `HASH_CACHE_LIMIT` | 3000 | Max file hashes to track for skip-if-unchanged |
| `ALLAN_MEMORY_DEBOUNCE_SECONDS` | 60 | Observe hook debounce: how long to wait after the last read/edit before extracting |
| `ALLAN_MEMORY_DEBOUNCE_DIR` | /tmp/allan-memory-debounce | Where the debounce state markers are stored |
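
To make the defaults concrete, here is a minimal JavaScript sketch of how a process might resolve these variables. The `loadConfig` helper and its camelCase field names are hypothetical and are not taken from the project's source; only the variable names and defaults come from the table above:

```javascript
// Defaults copied from the table above.
const DEFAULTS = {
  PORT: "19089",
  LLM_API_URL: "http://localhost:11435/v1",
  LLM_MODEL: "qwen2.5:7b-instruct",
  EMBEDDER_API_URL: "http://localhost:11435/v1",
  EMBEDDER_MODEL: "nomic-embed-text",
  FALKORDB_URI: "redis://localhost:6380",
  OBSERVE_LLM: "true",
  HASH_CACHE_LIMIT: "3000",
  ALLAN_MEMORY_DEBOUNCE_SECONDS: "60",
  ALLAN_MEMORY_DEBOUNCE_DIR: "/tmp/allan-memory-debounce",
};

// Hypothetical helper: overlay an environment object on the defaults and
// convert numeric and boolean settings to native types.
function loadConfig(env = process.env) {
  const raw = { ...DEFAULTS };
  for (const key of Object.keys(DEFAULTS)) {
    if (env[key] !== undefined && env[key] !== "") raw[key] = env[key];
  }
  return {
    port: Number(raw.PORT),
    llmApiUrl: raw.LLM_API_URL,
    llmModel: raw.LLM_MODEL,
    embedderApiUrl: raw.EMBEDDER_API_URL,
    embedderModel: raw.EMBEDDER_MODEL,
    falkordbUri: raw.FALKORDB_URI,
    observeLlm: raw.OBSERVE_LLM !== "false",
    // OBSERVE_LLM_MODEL falls back to LLM_MODEL when unset.
    observeLlmModel: env.OBSERVE_LLM_MODEL || raw.LLM_MODEL,
    hashCacheLimit: Number(raw.HASH_CACHE_LIMIT),
    debounceSeconds: Number(raw.ALLAN_MEMORY_DEBOUNCE_SECONDS),
    debounceDir: raw.ALLAN_MEMORY_DEBOUNCE_DIR,
  };
}
```

Calling `loadConfig({})` yields the all-local Ollama setup, while passing an object with OpenRouter URLs reproduces the cloud configuration.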

### Observe Hook Hash Cache

Observe hooks use a **content hash cache** (`~/.allan-memory/hashes.json`) to skip reprocessing unchanged files:

- `observe-read`: computes MD5 of file content, **skips** if hash matches a previous run

- `observe-edit`: **always processes** (file was edited), updates hash so next read skips

- Cache is capped at `HASH_CACHE_LIMIT` entries (default 3000), evicts oldest when full
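
The skip-if-unchanged behaviour can be sketched in a few lines of JavaScript. This is an in-memory illustration under stated assumptions, not the project's implementation: the `HashCache` class and its method names are hypothetical, persistence to `~/.allan-memory/hashes.json` is left out, and "oldest" is interpreted as least recently updated:

```javascript
const { createHash } = require("node:crypto");

// Minimal in-memory sketch; writing the cache to disk is omitted.
class HashCache {
  constructor(limit = 3000) {
    this.limit = limit;
    this.hashes = new Map(); // Map preserves insertion order, oldest first
  }

  static md5(content) {
    return createHash("md5").update(content).digest("hex");
  }

  // Record the current hash and evict the oldest entries beyond the limit.
  update(filePath, content) {
    this.hashes.delete(filePath); // re-insert so this entry becomes the newest
    this.hashes.set(filePath, HashCache.md5(content));
    while (this.hashes.size > this.limit) {
      const oldest = this.hashes.keys().next().value;
      this.hashes.delete(oldest);
    }
  }

  // observe-read: process only when the content differs from the cached hash.
  shouldProcessRead(filePath, content) {
    if (this.hashes.get(filePath) === HashCache.md5(content)) return false;
    this.update(filePath, content);
    return true;
  }

  // observe-edit: always process, but refresh the hash so the next read skips.
  shouldProcessEdit(filePath, content) {
    this.update(filePath, content);
    return true;
  }
}
```

A read of an unchanged file returns `false` from `shouldProcessRead`, which is the point at which the LLM extraction call is avoided.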

### Observe Hook Debouncing (Async)

The shipped `observe-read.sh` / `observe-edit.sh` hooks are **non-blocking** — they record a marker and exit in ~150ms, then a backgrounded worker waits `ALLAN_MEMORY_DEBOUNCE_SECONDS` (default 60s) before invoking the CLI.

If the same file is touched again within the debounce window, the older worker bails and only the newest one runs. Net effect:

```text

edit foo.js → edit foo.js → edit foo.js   (within 60s)

                                       └─► one extraction, 60s after the last edit

```

This avoids piling up LLM calls when the assistant edits the same file repeatedly. Set `ALLAN_MEMORY_DEBOUNCE_SECONDS=0` to disable (still async).
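
The "newest worker wins" rule can be illustrated with a short JavaScript sketch. The shipped hooks are shell scripts that keep their markers on disk; the `Debouncer` class below is a hypothetical in-process stand-in in which a `Map` plays the role of the marker files:

```javascript
// In-process sketch of the debounce rule; not the shipped shell implementation.
class Debouncer {
  constructor(delayMs, schedule = (fn, ms) => setTimeout(fn, ms)) {
    this.delayMs = delayMs;
    this.schedule = schedule; // injectable so the timing can be tested
    this.latest = new Map();  // file path -> id of the most recent touch
    this.nextId = 0;
  }

  // Called on every read/edit: record a marker and return immediately.
  touch(filePath, extract) {
    const id = ++this.nextId;
    this.latest.set(filePath, id);
    this.schedule(() => {
      // An older worker bails if a newer touch replaced its marker.
      if (this.latest.get(filePath) !== id) return;
      this.latest.delete(filePath);
      extract(filePath);
    }, this.delayMs);
  }
}
```

Three calls to `touch("foo.js", extract)` within the window schedule three workers, but only the last one finds its own marker and runs `extract`.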

### Observe Hook LLM Models

The observe hooks (triggered on every file read/edit) can use a **separate, cheaper model** than your main MCP tools:

| Provider | Model | Input/1M | Output/1M | Notes |
| --- | --- | --- | --- | --- |
| **OpenRouter** | `deepseek/deepseek-v4-flash` | $0.10 | $0.20 | Best quality/price |
| **OpenRouter** | `deepseek/deepseek-v4-flash:free` | $0.00 | $0.00 | Free tier |
| **OpenRouter** | `google/gemma-4-31b-it:free` | $0.00 | $0.00 | Free tier, strong code |
| **OpenRouter** | `qwen/qwen3.5-9b` | $0.04 | $0.15 | Very cheap |
| **Ollama** | `qwen2.5:7b-instruct` | Free | Free | Local, ~8 tok/s on M2 |
| **Ollama** | `qwen2.5:3b-instruct` | Free | Free | Fastest local option |
| **Ollama** | `gemma3:4b` | Free | Free | Good for extraction |
| **Disabled** | *(set `OBSERVE_LLM=false`)* | - | - | Regex-only, instant |

> **Tip**: For observe hooks, you only need simple JSON extraction — a small/cheap model works great. Your main `LLM_MODEL` handles the complex MCP tool calls.

---

## Use Cases & Benchmark

### Quick Summary

| Scenario | Queries | Token Savings | Cost Savings |
|----------|---------|---------------|--------------|
| Single Function Query | 1 | -7% | -4.2% |
| Repeated Queries (Same File) | 5 | **74%** | 42.5% |
| Cross-File Queries | 5 | **78%** | 67.7% |
| Full Codebase Exploration | 10 | **88%** | 80.4% |
| Long Coding Session | 20 | **94%** | 85.2% |

> **Break-even: 2 queries** — MCP pays for itself after just 2 queries about the same code.

### Live MCP Performance

Measured on localhost with FalkorDB + OpenRouter (qwen/qwen-2.5-7b-instruct):

| Operation | Time | Notes |
|-----------|------|-------|
| `add_memory` | **17ms** | Episode storage (LLM extraction is async) |
| `search_nodes` | **765ms** | Hybrid search (text + vector + embeddings) |
| `search_facts` | **410ms** | Relationship search |
| `check_freshness` | **358ms** | Staleness detection |

### Detailed Scenario Comparison

#### 1. Single Function Query

*1 query about one function — MCP has slight overhead*

| Metric | Without MCP | With MCP | Savings |
|--------|-------------|----------|---------|
| Input Tokens | 3,400 | 3,650 | -7.4% |
| File Reads | 1 | 1 | 0 saved |
| Est. Cost (Claude 3.5) | $0.0177 | $0.0185 | -4.2% |

#### 2. Repeated Queries (Same File)

*5 queries about the same file — MCP wins big*

| Metric | Without MCP | With MCP | Savings |
|--------|-------------|----------|---------|
| Input Tokens | 17,000 | 4,450 | **73.8%** |
| File Reads | 5 | 1 | 4 saved |
| Memory Cache Hits | 0 | 4 | - |
| Est. Cost (Claude 3.5) | $0.0885 | $0.0508 | **42.5%** |

#### 3. Cross-File Queries

*5 queries across 4 different files*

| Metric | Without MCP | With MCP | Savings |
|--------|-------------|----------|---------|
| Input Tokens | 86,000 | 19,300 | **77.6%** |
| File Reads | 20 | 4 | 16 saved |
| Memory Cache Hits | 0 | 4 | - |
| Est. Cost (Claude 3.5) | $0.2955 | $0.0954 | **67.7%** |

#### 4. Full Codebase Exploration

*10 queries exploring entire codebase (6 files)*

| Metric | Without MCP | With MCP | Savings |
|--------|-------------|----------|---------|
| Input Tokens | 248,000 | 28,600 | **88.5%** |
| File Reads | 60 | 6 | 54 saved |
| Memory Cache Hits | 0 | 9 | - |
| Est. Cost (Claude 3.5) | $0.8190 | $0.1608 | **80.4%** |

#### 5. Long Coding Session

*Simulates 1-hour session with 20 queries*

| Metric | Without MCP | With MCP | Savings |
|--------|-------------|----------|---------|
| Input Tokens | 496,000 | 30,600 | **93.8%** |
| File Reads | 120 | 6 | 114 saved |
| Memory Cache Hits | 0 | 19 | - |
| Est. Cost (Claude 3.5) | $1.6380 | $0.2418 | **85.2%** |

### Break-Even Analysis

```

Query 1: MCP has slight overhead (storing knowledge)

  Without MCP: ~3,200 tokens (read file)

  With MCP:    ~3,350 tokens (read + store)

  Result: MCP slightly more expensive

Query 2: MCP wins

  Without MCP: ~3,200 tokens (read file again)

  With MCP:    ~100 tokens (search_nodes hit)

  Cumulative:  6,400 vs 3,450 tokens

Query 3+: MCP advantage compounds

  Each additional query saves ~3,100 tokens

```
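
The break-even point can be reproduced with a few lines of JavaScript. This sketch uses only the approximate token counts quoted in the analysis; the function and constant names are illustrative:

```javascript
// Approximate per-query token costs from the analysis above.
const FILE_READ_TOKENS = 3200;   // without MCP: re-read the file every time
const FIRST_QUERY_TOKENS = 3350; // with MCP: read + store on the first query
const CACHED_QUERY_TOKENS = 100; // with MCP: search_nodes hit afterwards

function cumulativeTokens(queries) {
  const withoutMcp = FILE_READ_TOKENS * queries;
  const withMcp =
    queries === 0 ? 0 : FIRST_QUERY_TOKENS + CACHED_QUERY_TOKENS * (queries - 1);
  return { withoutMcp, withMcp };
}

// First query count at which MCP uses fewer cumulative tokens.
function breakEvenQuery(maxQueries = 100) {
  for (let q = 1; q <= maxQueries; q++) {
    const { withoutMcp, withMcp } = cumulativeTokens(q);
    if (withMcp < withoutMcp) return q;
  }
  return null;
}

console.log(cumulativeTokens(2)); // { withoutMcp: 6400, withMcp: 3450 }
console.log(breakEvenQuery());    // 2
```

Raising `FIRST_QUERY_TOKENS` or `CACHED_QUERY_TOKENS` shows how sensitive the break-even point is to the storage overhead and the size of a cached answer.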

### When to Use MCP

| Scenario | Recommendation | Expected Savings |
|----------|----------------|------------------|
| Single one-off question | ❌ Skip MCP | - |
| 2-5 related questions | ✅ Use MCP | 50-80% |
| Long coding session (1hr+) | ✅ Use MCP | 80-95% |
| Large codebase (50+ files) | ✅ Use MCP | 90%+ |
| Team with shared memory | ✅ Use MCP | 95%+ |

### Run Benchmark Yourself

```bash

cd allan-mcp-memory-code

node benchmark.js

```

---

## Freshness Checking

When code changes, stored memories become **stale**. Use `check_freshness` to detect this.

### How It Works

```

1. AI calls check_freshness({ query, group_id, max_age_hours })

                              │

                              ▼

2. Generate query embedding via Ollama/OpenRouter

                              │

                              ▼

3. Hybrid search in FalkorDB (text + vector)

                              │

                              ▼

4. For each result, calculate age:

   - age = now - created_at

   - status = age < max_age_hours ? FRESH : STALE

                              │

                              ▼

5. Return formatted results:

   

   ---

   Found 3 memories: 2 FRESH, 1 STALE (threshold: 24h)

   login [FUNCTION] FRESH 2h ago | src/auth.js

   validateToken [FUNCTION] FRESH 5h ago | src/auth.js  

   hashPassword [FUNCTION] STALE 3d ago | src/crypto.js

                              │

                              ▼

6. AI decides:

   - FRESH → Trust memory, use directly

   - STALE → Re-read file, update with add_memory

```
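
Step 4 of the flow is a simple age comparison. A minimal JavaScript sketch, assuming each search result carries a `created_at` timestamp as in the diagram (the `classifyFreshness` and `summarize` helpers and the record shape are hypothetical):

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Step 4: age = now - created_at; FRESH if younger than the threshold.
function classifyFreshness(memories, { maxAgeHours = 24, now = Date.now() } = {}) {
  return memories.map((m) => {
    const ageHours = (now - new Date(m.created_at).getTime()) / HOUR_MS;
    return { ...m, ageHours, status: ageHours < maxAgeHours ? "FRESH" : "STALE" };
  });
}

// Step 5: one-line summary in the style of the example output.
function summarize(results, maxAgeHours = 24) {
  const fresh = results.filter((r) => r.status === "FRESH").length;
  const stale = results.length - fresh;
  return `Found ${results.length} memories: ${fresh} FRESH, ${stale} STALE (threshold: ${maxAgeHours}h)`;
}

// Example mirroring the diagram: two recent entries and one three days old.
const exampleNow = Date.parse("2026-05-22T12:00:00Z");
const exampleResults = classifyFreshness(
  [
    { name: "login", created_at: "2026-05-22T10:00:00Z" },
    { name: "validateToken", created_at: "2026-05-22T07:00:00Z" },
    { name: "hashPassword", created_at: "2026-05-19T12:00:00Z" },
  ],
  { maxAgeHours: 24, now: exampleNow }
);
console.log(summarize(exampleResults, 24));
// Found 3 memories: 2 FRESH, 1 STALE (threshold: 24h)
```

Note that this check is purely time-based: a memory younger than the threshold is reported as FRESH even if the file changed in the meantime.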

### MCP Tool Usage

```javascript

// Check freshness before using cached knowledge

check_freshness({

  query: "auth functions",

  group_id: "my-project",

  max_age_hours: 24  // optional, default 24

})

```

### HTTP API Usage

```bash

curl -X POST http://localhost:19089/v1/memory/check-freshness \
  -H "Content-Type: application/json" \
  -d '{"query":"auth functions","group_id":"my-project","max_age_hours":24}'

```

### Best Practice Workflow

```

1. search_nodes("function X")

2. Results found?

   ├─ Yes → check_freshness("function X")

   │        ├─ FRESH → Use memory ✓

   │        └─ STALE → regenerate_file(file_path) OR re-read → add_memory

   └─ No → Read file → add_memory (create)

```

---

## File Regeneration

After editing a file, use `regenerate_file` to automatically update the knowledge graph.

### How It Works

```

1. AI calls regenerate_file({ file_path, project_root, group_id })

                              │

                              ▼

2. MCP reads file content from disk

                              │

                              ▼

3. Check .gitignore / .dockerignore (skip if matched)

                              │

                              ▼

4. LLM extracts structured entities:

   - File: purpose, exports, dependencies

   - Functions: name, line numbers, signature, description

                              │

                              ▼

5. Sync with knowledge graph:

   - CREATE new entities for new functions

   - UPDATE existing entities if changed

   - DELETE entities for removed functions

                              │

                              ▼

6. Return summary:

   {

     "file_path": "src/auth.js",

     "status": "success",

     "created": ["func:project:src/auth.js:10-45@login"],

     "updated": ["file:project:src/auth.js"],

     "deleted": ["func:project:src/auth.js:80-100@oldFunc"]

   }

```
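
The sync in step 5 amounts to a three-way diff between what the graph already holds and what the LLM just extracted. A minimal JavaScript sketch, assuming entities are keyed by ids like those in the summary and compared by some content fingerprint (the `diffEntities` helper and the fingerprint strings are hypothetical):

```javascript
// Compare entities already in the graph with those freshly extracted from the file.
// Both arguments are Maps of entity id -> content fingerprint (any comparable string).
function diffEntities(existing, extracted) {
  const created = [];
  const updated = [];
  const deleted = [];
  for (const [id, fingerprint] of extracted) {
    if (!existing.has(id)) created.push(id);
    else if (existing.get(id) !== fingerprint) updated.push(id);
  }
  for (const id of existing.keys()) {
    if (!extracted.has(id)) deleted.push(id);
  }
  return { created, updated, deleted };
}

// Example reproducing the summary above.
const inGraph = new Map([
  ["file:project:src/auth.js", "v1"],
  ["func:project:src/auth.js:80-100@oldFunc", "v1"],
]);
const fromFile = new Map([
  ["file:project:src/auth.js", "v2"],
  ["func:project:src/auth.js:10-45@login", "v1"],
]);
console.log(diffEntities(inGraph, fromFile));
```

Because the function ids in the example embed line ranges, a function that merely moves within the file would appear here as one deletion plus one creation. Whether the project handles moves differently is not stated in this README.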

### MCP Tool Usage

```javascript

// After editing a file, regenerate its entities

regenerate_file({

  file_path: "src/auth/login.js",      // relative or absolute

  project_root: "/path/to/project",    // project root directory

  group_id: "my-project"               // required

})

```

### HTTP API Usage

```bash

curl -X POST http://localhost:19089/v1/memory/regenerate-file \
  -H "Content-Type: application/json" \
  -d '{"file_path":"src/auth.js","project_root":"/path/to/project","group_id":"my-project"}'

```

### Ignored Files

Respects `.gitignore` and `.dockerignore` patterns, plus:

- `node_modules/`

- `.git/`

- `dist/`, `build/`

- `*.min.js`, `*.map`

- `package-lock.json`, `yarn.lock`
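
The built-in part of that list can be expressed as a small predicate. This JavaScript sketch covers only the patterns listed above and does not parse `.gitignore` or `.dockerignore`. The `isIgnored` helper is hypothetical and assumes the directory rules apply at any depth:

```javascript
// Built-in ignore rules from the list above.
const IGNORED_DIRS = new Set(["node_modules", ".git", "dist", "build"]);
const IGNORED_SUFFIXES = [".min.js", ".map"];
const IGNORED_FILES = new Set(["package-lock.json", "yarn.lock"]);

function isIgnored(relativePath) {
  // Split on forward or backward slashes so Windows-style paths also work.
  const parts = relativePath.split(/[\\/]+/).filter(Boolean);
  const fileName = parts[parts.length - 1] ?? "";
  const dirs = parts.slice(0, -1);
  return (
    dirs.some((dir) => IGNORED_DIRS.has(dir)) ||
    IGNORED_FILES.has(fileName) ||
    IGNORED_SUFFIXES.some((suffix) => fileName.endsWith(suffix))
  );
}

console.log(isIgnored("src/auth.js"));              // false
console.log(isIgnored("node_modules/pkg/index.js")); // true
console.log(isIgnored("public/app.min.js"));         // true
```

Only directory components are matched against the directory rules, so a source file named `src/build.js` is still processed.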

---

## Local Development

```bash

# Start dependencies only

docker compose up falkordb ollama ollama-init -d

# Run locally

cp .env.example .env

npm install

npm run build   # Compile lib/ → dist/ (Babel transpilation)

npm run dev     # Dev server with nodemon (auto-transpiles via @babel/register)

npm start       # Production (runs transpiled dist/)

```

> **Note:** The `lib/` source uses ES module `import` syntax. Always run `npm run build` before `npm start`. The `dev` script handles transpilation on-the-fly via `@babel/register`.

---

## Architecture

```

lib/

├── index.js                # Entry point

├── domain/                 # Entities + Repository Interfaces

├── application/use_cases/  # Business logic

├── interface/              # Controllers, Routes, Repositories

└── infrastructure/         # Gateways (FalkorDB, LLM, Embedder)

```

## License

[GNU AGPL-3.0](LICENSE) — you may use, modify, and redistribute this software, but any redistribution or network/SaaS deployment must release the complete corresponding source (including your modifications) under the same license.