https://github.com/dfrostar/neuralmind
🧠 Adaptive Neural Knowledge System - 40-70x token reduction for AI code understanding
- Host: GitHub
- URL: https://github.com/dfrostar/neuralmind
- Owner: dfrostar
- License: mit
- Created: 2026-04-16T20:19:29.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-04-22T00:54:00.000Z (29 days ago)
- Last Synced: 2026-04-22T01:03:18.108Z (29 days ago)
- Topics: ai, ai-coding, chromadb, code-understanding, context-window, knowledge-graph, llm, mcp, semantic-search, token-reduction
- Language: Python
- Homepage: https://dfrostar.github.io/neuralmind/
- Size: 1.11 MB
- Stars: 6
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Security: SECURITY.md
# 🧠 NeuralMind
**Semantic code intelligence for AI coding agents: smart context retrieval + tool-output compression in one package.**
> NeuralMind turns a code repository into a queryable neural index. AI agents use it to answer code questions in ~800 tokens instead of loading 50,000+ tokens of raw source.
> **New in v0.6.0:** Obsidian-style graph view with a **live activity feed**. `neuralmind serve` streams synapse + file events to the canvas in real time, so you can *watch the brain learning your codebase*. [Release notes](RELEASE_NOTES_v0.6.0.md) · [Graph view section →](#-graph-view-neuralmind-serve)
> **[Visit the landing page](https://dfrostar.github.io/neuralmind/) • [Read the About page](https://dfrostar.github.io/neuralmind/about.html) • ⚖️ Not affiliated with NeuralMind.ai**
---
## ⚡ 30-second proof
Don't trust the headline number; reproduce it. One command on a freshly cloned checkout:
```bash
git clone https://github.com/dfrostar/neuralmind && cd neuralmind
bash scripts/demo.sh
```
The script creates an isolated venv, installs the deps, builds the index for the bundled fixture (`tests/fixtures/sample_project/`), and runs three real questions. Output looks like:
```
Q: How does authentication work in this codebase?
naive = 4,736 tok neuralmind = 829 tok reduction = 5.7×
Q: What are the main API endpoints?
naive = 4,736 tok neuralmind = 923 tok reduction = 5.1×
Q: Explain the billing flow from a user perspective.
naive = 4,736 tok neuralmind = 826 tok reduction = 5.7×
Average reduction: 5.5× across 3 queries
Avg context size: 859 tokens (vs 4,736 naive)
Est. monthly saved: ~$34.89 @ 100 queries/day on Claude 3.5 Sonnet
Wall time: 0.85s
```
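The summary lines can be re-derived from the per-query token counts. The sketch below does that arithmetic; the $3 per million input tokens price and the 30-day month are assumptions chosen because they reproduce the printed estimate, not values the demo output states.

```python
# Token counts copied from the demo output above.
NAIVE_TOKENS = 4736
NEURALMIND_TOKENS = [829, 923, 826]

# Assumptions (not stated in the demo output): input price and month length.
PRICE_PER_MILLION_USD = 3.00
QUERIES_PER_DAY = 100
DAYS_PER_MONTH = 30

ratios = [NAIVE_TOKENS / t for t in NEURALMIND_TOKENS]
avg_ratio = sum(ratios) / len(ratios)
avg_context = sum(NEURALMIND_TOKENS) / len(NEURALMIND_TOKENS)

tokens_saved_per_month = (NAIVE_TOKENS - avg_context) * QUERIES_PER_DAY * DAYS_PER_MONTH
monthly_saving_usd = tokens_saved_per_month * PRICE_PER_MILLION_USD / 1_000_000

print([round(r, 1) for r in ratios])  # [5.7, 5.1, 5.7]
print(round(avg_ratio, 1))            # 5.5
print(round(avg_context))             # 859
print(round(monthly_saving_usd, 2))   # 34.89
```

Swap in your own token counts, query volume, and model price to get a figure for your workload.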
The fixture is intentionally small (~500 lines) so that it can catch regressions in CI. Reported results on real repos are **40–70×** on the same pipeline ([benchmarks](#-benchmarks) · [community submissions](#community-benchmarks)). Once the demo convinces you, run it on your own code:
```bash
pip install neuralmind graphifyy
cd /path/to/your-repo
graphify update . && neuralmind build .
neuralmind benchmark . --contribute
```
---
## The fact-based case
Two docs you should read before forming an opinion. Both are linked from this section so you can pick what you need:
- **[docs/BUSINESS-CASE.md](docs/BUSINESS-CASE.md)**: the compelling pitch, with provable numbers. Every claim is a single command away from being verified on your own code. ROI math with assumptions you can change. Three concrete scenarios. Read this if you're evaluating whether to bring NeuralMind to your team.
- **[docs/HONEST-ASSESSMENT.md](docs/HONEST-ASSESSMENT.md)**: the skeptic's companion. When NeuralMind isn't worth installing. What "40–70×" actually means (and doesn't). Where the community-benchmark sample is too small to extrapolate. Read this if you want to know what could go wrong before adopting.
The headline you can stand on: **retrieval reduction is measured in CI on every commit** (open any closed PR in the [PR list](https://github.com/dfrostar/neuralmind/pulls?q=is%3Apr+is%3Aclosed); each one has a sticky benchmark comment with current numbers) and **reproduces in 30 seconds on a fresh clone** via `bash scripts/demo.sh`. Real-world repos have submitted **46–66×**, but that is only n=2; your number comes from `neuralmind benchmark . --contribute` on your code.
---
## Security & Compliance
**For enterprises and regulated industries:**
- **100% Local Processing**: Your code never leaves your machine. All embeddings are generated and stored locally using ChromaDB.
- **No External APIs**: NeuralMind runs completely offline. No cloud services, no telemetry, no data exfiltration.
- **Explainable AI**: Every context decision is auditable. Know exactly which code was retrieved (Extracted) vs. inferred by the model.
- **Open-Source & MIT Licensed**: Full transparency. No hidden clauses, no vendor lock-in. Audit the code yourself.
- **GDPR/HIPAA-Friendly**: Process sensitive code without compliance concerns. All data stays under your control.
**For CTOs & Security Teams:**
- ✅ Zero external dependencies for code storage
- ✅ Runs behind your firewall or on-premise
- ✅ No collection of usage metrics or code patterns
- ✅ Predictable, reproducible results (not a black box)
- ✅ Works with proprietary and sensitive code
[See full security policy](SECURITY.md)
---
## For AI Coding Agents
> **If you are Claude Code, Codex, Cursor, Cline, or any AI agent with NeuralMind installed, start here.**
### Check if NeuralMind is active
```bash
neuralmind stats .
```
Expected output when ready:
```
Project: your-project
Built: True
Nodes: 241
```
If `Built: False`, run `graphify update . && neuralmind build .` first.
---
### Startup workflow
**Every new session, do this first:**
```bash
neuralmind wakeup .
```
Or via MCP:
```
neuralmind_wakeup(project_path=".")
```
This returns ~365–600 tokens of structured project context:
- Project name and description (from `CLAUDE.md`, `mempalace.yaml`, or `README.md` first line)
- How many code entities and clusters are indexed
- Architecture overview: top 10 code clusters with their entity types and sample names
- Sections from `graphify-out/GRAPH_REPORT.md` if present
**Use this output as your orientation before writing any code.** It replaces reading the entire repository.
---
### Decision tree: which tool to call
```
Need to understand the project?
  └─► neuralmind wakeup .                (MCP: neuralmind_wakeup)           ~400 tokens
Answering a specific code question?
  └─► neuralmind query . "question"      (MCP: neuralmind_query)            ~800–1,100 tokens
About to open a source file?
  └─► neuralmind skeleton <file_path>    (MCP: neuralmind_skeleton)         ~5–15× cheaper than Read
  │   Only fall back to Read when you need the actual implementation body
  │   Use NEURALMIND_BYPASS=1 when you truly need raw source
Answering a complex, multi-part question?
  └─► neuralmind recursive-query . "q"   (MCP: neuralmind_recursive_query)  decomposes + synthesizes
Question about reference documents (PDFs, legal, clinical)?
  └─► neuralmind query-docs . "q"        (MCP: neuralmind_query_docs)       searches doc index only
Searching for a specific function/class/entity?
  └─► neuralmind search . "term"         (MCP: neuralmind_search)           ranked by semantic similarity
Made code changes and need to update the index?
  └─► neuralmind build .                 (MCP: neuralmind_build)            incremental: only re-embeds changed nodes
```
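For agents that drive the CLI from a script, the decision tree can be sketched as a small routing table. The intent labels and the `build_command` helper are hypothetical names introduced for illustration; only the subcommands and argument order come from the tree above.

```python
import shlex

# Intent -> (subcommand, takes_project_path, takes_argument).
# The intent names are hypothetical labels, not part of NeuralMind.
ROUTES = {
    "orient": ("wakeup", True, False),
    "question": ("query", True, True),
    "open_file": ("skeleton", False, True),
    "multi_part": ("recursive-query", True, True),
    "reference_doc": ("query-docs", True, True),
    "find_entity": ("search", True, True),
    "reindex": ("build", True, False),
}


def build_command(intent, argument=None, project_path="."):
    """Return the argv list for the CLI call matching an intent."""
    subcommand, takes_project, takes_argument = ROUTES[intent]
    argv = ["neuralmind", subcommand]
    if takes_project:
        argv.append(project_path)
    if takes_argument:
        if argument is None:
            raise ValueError(f"{intent!r} needs a question, term, or file path")
        argv.append(argument)
    return argv


# Print a shell-quoted command line for a code question.
print(shlex.join(build_command("question", "How does authentication work?")))
```

The returned list can be passed straight to `subprocess.run` once NeuralMind is installed and the index is built.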
---
### Understanding the output
#### `wakeup` / `query` output format
```
## Project: myapp
Full-stack web app for task management. Uses React 18, Node.js, and PostgreSQL.
Knowledge Graph: 241 entities, 23 clusters
Type: Code repository with semantic indexing
## Architecture Overview
### Code Clusters
- Cluster 5 (45 entities): function - authenticate_user, hash_password, verify_token
- Cluster 12 (23 entities): class - UserController, AuthMiddleware, SessionStore
- Cluster 3 (18 entities): function - createTask, updateTask, deleteTask
...
## Relevant Code Areas (query only; absent from wakeup)
### Cluster 5 (relevance: 1.73)
Contains: function entities
- authenticate_user (code) - auth.py
- verify_token (code) - auth.py
## Search Results (query only)
- AuthMiddleware (score: 0.91) - middleware.py
- jwt_handler (score: 0.85) - auth/jwt.py
---
Tokens: 847 | 59.0x reduction | Layers: L0, L1, L2, L3 | Communities: [5, 12]
```
**Layer meanings:**
| Layer | Name | Always loaded | Content |
|-------|------|--------------|---------|
| L0 | Identity | ✅ yes | Project name, description, graph size |
| L1 | Summary | ✅ yes | Architecture, top clusters, GRAPH_REPORT sections |
| L2 | On-demand | query only | Top 3 clusters most relevant to the query |
| L3 | Search | query only | Semantic search hits (up to 10) |
#### `skeleton` output format
```
# src/auth/handlers.py (community 5, 8 functions)
## Functions
L12 authenticate_user - Validates credentials and issues JWT
L45 verify_token - Checks JWT signature and expiry
L78 refresh_token - Issues new JWT from a valid refresh token
L102 logout - Revokes refresh token in DB
## Call graph (within this file)
authenticate_user → verify_token, hash_password
refresh_token → verify_token
## Cross-file
verify_token imports_from → utils/jwt.py (high 0.95)
authenticate_user shares_data_with → models/user.py (high 0.91)
[Full source available: Read this file with NEURALMIND_BYPASS=1]
```
Use `skeleton` to understand what a file does, how its functions relate, and which other files it depends on, **without consuming tokens on the full source body**.
#### `search` output format
```
1. authenticate_user (function) - score: 0.92
File: auth/handlers.py Community: 5
2. AuthMiddleware (class) - score: 0.87
File: auth/middleware.py Community: 5
3. hash_password (function) - score: 0.81
File: utils/crypto.py Community: 5
```
---
### PostToolUse hooks: what happens automatically
If `neuralmind install-hooks` has been run for this project (check for `.claude/settings.json`), Claude Code automatically compresses tool outputs **before you see them**:
| Tool | What happens | Typical savings |
|------|-------------|----------------|
| **Read** | Raw source → graph skeleton (functions, rationales, call graph) | ~88% |
| **Bash** | Full output → error lines + warning lines + last 3 lines + summary | ~91% |
| **Grep** | Unlimited matches → capped at 25 + "N more hidden" pointer | varies |
**This is fully automatic; you do not need to call any extra tools.**
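To make the Bash and Grep rows concrete, here is a minimal sketch of the two compression rules as the table describes them. It is not NeuralMind's hook code: the function names, the `error|warning` pattern, and the summary wording are assumptions for illustration.

```python
import re

GREP_CAP = 25  # match cap taken from the table above


def compress_bash(output, tail=3):
    """Keep error lines, warning lines, the last `tail` lines, and a summary."""
    lines = output.splitlines()
    keep = set(range(max(0, len(lines) - tail), len(lines)))
    for i, line in enumerate(lines):
        # Assumed heuristic: any line mentioning "error" or "warning".
        if re.search(r"error|warning", line, re.IGNORECASE):
            keep.add(i)
    kept = [lines[i] for i in sorted(keep)]
    kept.append(f"[{len(lines) - len(kept)} of {len(lines)} lines hidden]")
    return "\n".join(kept)


def cap_grep(matches, cap=GREP_CAP):
    """Return at most `cap` matches plus a pointer to the hidden remainder."""
    if len(matches) <= cap:
        return list(matches)
    return list(matches[:cap]) + [f"... {len(matches) - cap} more hidden"]


log = "\n".join(["step ok"] * 40 + ["ERROR: build failed", "cleanup", "exit 1"])
print(compress_bash(log))
```

On the sample log, 43 lines shrink to the error line, the two lines after it, and a one-line summary of what was dropped.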
To bypass compression for a single command (e.g., when you need the full file body):
```bash
NEURALMIND_BYPASS=1 <command>
```
---
### After making code changes
The index does **not** auto-update unless a git post-commit hook was installed with `neuralmind init-hook .`. After significant code changes, rebuild manually:
```bash
neuralmind build . # incremental: only re-embeds changed nodes
neuralmind build . --force # full rebuild: re-embeds everything
```
---
### MCP tool quick reference
| Tool | When to call | Required params | Returns |
|------|-------------|----------------|---------|
| `neuralmind_wakeup` | Session start | `project_path` | L0+L1 context string, token count |
| `neuralmind_query` | Code question | `project_path`, `question` | L0–L3 context string, token count, reduction ratio |
| `neuralmind_search` | Find entity | `project_path`, `query` | List of nodes with scores, file paths |
| `neuralmind_skeleton` | Explore file | `project_path`, `file_path` | Functions + rationales + call graph + cross-file edges |
| `neuralmind_recursive_query` | Complex question | `project_path`, `question` | Synthesized answer, sub-queries, gaps, sources |
| `neuralmind_query_docs` | Reference docs | `project_path`, `question` | Relevant doc chunks with source files and relevance scores |
| `neuralmind_stats` | Check status | `project_path` | Built status, node count, community count |
| `neuralmind_build` | Rebuild index | `project_path` | Build stats dict |
| `neuralmind_benchmark` | Measure savings | `project_path` | Per-query token counts and reduction ratios |
---
## ⚡ Two-phase optimization
```
┌─────────────────────────────────────────────────────────────┐
│ Phase 1: Retrieval (what to fetch)                          │
│   neuralmind wakeup .   → ~365 tokens (vs 50K raw)          │
│   neuralmind query "?"  → ~800 tokens (vs 2,700 raw)        │
│   neuralmind_skeleton   → graph-backed file view            │
├─────────────────────────────────────────────────────────────┤
│ Phase 2: Consumption (what the agent actually sees)         │
│   PostToolUse hooks compress Read/Bash/Grep output          │
│   File reads     → graph skeleton (~88% reduction)          │
│   Bash output    → errors + summary (~91% reduction)        │
│   Search results → capped at 25 matches                     │
└─────────────────────────────────────────────────────────────┘
```
**Combined effect: 5–10× total reduction vs baseline Claude Code.**
---
## 🎯 The Problem
```
You: "How does authentication work in my codebase?"
❌ Traditional: Load entire codebase → 50,000 tokens → $0.15–$3.75/query
✅ NeuralMind: Smart context → 766 tokens → $0.002–$0.06/query
```
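The per-query dollar ranges follow from token count times price. The sketch below reproduces them under an assumed price band of $3 to $75 per million tokens; that band is inferred from the figures above rather than stated in them, so treat it as a placeholder for your model's actual pricing.

```python
def cost_per_query(tokens, usd_per_million_tokens):
    """Cost in USD of sending `tokens` tokens at a per-million-token price."""
    return tokens * usd_per_million_tokens / 1_000_000


# Assumed price range; inferred from the figures above, not stated there.
LOW_PRICE, HIGH_PRICE = 3.00, 75.00

for label, tokens in [("traditional", 50_000), ("neuralmind", 766)]:
    low = cost_per_query(tokens, LOW_PRICE)
    high = cost_per_query(tokens, HIGH_PRICE)
    print(f"{label}: ${low:.3f} to ${high:.2f} per query")
```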
## 💰 Realistic savings
The dollar figures depend on **your** workload. Run `neuralmind benchmark . --contribute` to get numbers for your codebase and query volume. Order-of-magnitude expectations:
| You today | NeuralMind likely saves | Setup pays back in |
|---|---|---|
| <$50/mo on LLM, small repo | $5–15/mo | months; probably skip |
| $50–500/mo, 10K+ line repo | $20–200/mo | days |
| $500–5,000/mo team workload | hundreds–thousands/mo | hours |
| Already using prompt caching + long context | smaller marginal win | measure first |
These are **directional**. The [Honest Assessment](docs/HONEST-ASSESSMENT.md) explains why retrieval-token reduction (40–70×) ≠ end-to-end cost reduction (3–10× typical), and when NeuralMind is and isn't worth installing.
---
## ✅ Does it work on *your* code? Prove it in 5 minutes.
> NeuralMind [benchmarks itself in CI](#-benchmarks) on every PR. But your codebase isn't our fixture. The only way to know what it does for **you** is to measure it on **your code**.
```bash
pip install neuralmind graphifyy
cd /path/to/your-project
graphify update . && neuralmind build .
neuralmind benchmark .
```
You'll get back your actual reduction ratio and per-query token count, typically **30–80× on real repos**. No telemetry, nothing uploaded, nothing committed. If the numbers don't justify it, `pip uninstall neuralmind` and move on with only 5 minutes lost.
**Want the dollar figure for your team?**
```bash
neuralmind benchmark . --contribute
```
That flag produces a ready-to-share JSON blob with your project's numbers, the exact command that produced them, and an estimated monthly savings at your query volume. Paste it into Slack, a design doc, or a PR, or optionally [contribute it to the public leaderboard](#community-benchmarks).
**Full walkthrough:** [Does NeuralMind work on *your* codebase?](docs/use-cases/benchmark-your-repo.md)
---
## 🚨 When do I reach for NeuralMind?
Two ways to decide: start with what's annoying you (**symptoms**), or start with what you're trying to achieve (**goals**).
### Symptoms: "This is happening to me"
| What you notice | Reach for | Why it fixes it |
|---|---|---|
| Claude Code hits context limits mid-task | `neuralmind install-hooks .` | Auto-compresses Read/Bash/Grep **before** the agent sees them (~88–91%) |
| My monthly LLM bill is climbing | `neuralmind query` + hooks | 40–70× fewer tokens per code question |
| I start every session re-pasting project structure | `neuralmind wakeup .` | ~400 tokens of orientation; pipe into any chat |
| Agent reads a 2,000-line file to answer about one function | `neuralmind skeleton <file_path>` | Functions + call graph, no body; ~88% cheaper than `Read` |
| `grep` floods the agent with hundreds of matches | `neuralmind install-hooks .` | Caps at 25 matches with "N more hidden" pointer |
| The agent is confidently wrong about what my code does | Start session with `wakeup`; ask with `query` | Grounds the model in real structure instead of guessing |
| I want to query my codebase from ChatGPT / Gemini | `neuralmind wakeup . \| pbcopy` | Model-agnostic output; paste into any chat |
| Retrieval feels random across similar questions | `neuralmind learn .` | Cooccurrence-based reranking adapts to your patterns |
| Index feels out of date after a refactor | `neuralmind build .` (or `init-hook` once) | Incremental: only re-embeds changed nodes |
### Goals: "What am I trying to solve for?"
| If your goal is… | Do this | Expected outcome |
|---|---|---|
| **Cut LLM spend** on code Q&A | `install-hooks` + use `query` for questions | 5–10× total reduction vs baseline agent |
| **Faster, more grounded** agent responses | `wakeup` at session start → `query` / `skeleton` during | Fewer hallucinations; less re-exploration |
| **Keep all code local** (no SaaS, no telemetry) | Default install; no extra config | 100% offline; nothing leaves the machine |
| **Work across Claude + GPT + Gemini** with one index | Build once, pipe output into any model | Same context quality, model-agnostic |
| **Make retrieval adapt** to how your team queries | Enable memory (TTY prompt) + `neuralmind learn .` | Relevance improves on repeat patterns |
| **Measure savings** for a manager or stakeholder | `neuralmind benchmark . --json` | Per-query tokens, reduction ratios, dollar estimate |
| **Auto-refresh** the index as code changes | `neuralmind init-hook .` (git post-commit) | Every commit rebuilds incrementally |
### Still not sure?
You **probably don't need NeuralMind** if:
- Your codebase is under ~5K tokens total (just paste the whole thing in).
- You don't use an AI coding agent.
- You only want inline completions; use [Copilot](docs/comparisons/vs-github-copilot.md) or [Cursor](docs/comparisons/vs-cursor-codebase.md) directly.
You **almost certainly want NeuralMind** if any row above describes a recurring frustration, or if your LLM bill has crossed the point where a 40–70× reduction is worth 5 minutes of setup.
See the [use-case walkthroughs](docs/use-cases/README.md) for step-by-step guides matched to your situation.
---
## 🏢 For organizations evaluating NeuralMind
If you're building a pitch for your team (finance, healthcare, legal, government, internal-platform, or just a large engineering org with a climbing LLM bill), start with **[docs/BUSINESS-CASE.md](docs/BUSINESS-CASE.md)** for the fact-based ROI argument and **[docs/ENTERPRISE.md](docs/ENTERPRISE.md)** for the regulated/on-premise/multi-team scenarios.
Both docs ground every claim in something you can verify with one command on your own code.
---
## ⚖️ NeuralMind vs. Alternatives
Short answers to "why not just use X?". Each row links to a deeper page.
| Compared against | Short verdict |
|---|---|
| [Cursor `@codebase`](docs/comparisons/vs-cursor-codebase.md) | Works *only* in Cursor; NeuralMind works in any agent and adds tool-output compression |
| [Aider repo-map](docs/comparisons/vs-aider-repomap.md) | Aider is syntactic only; NeuralMind adds semantic retrieval and compression |
| [Sourcegraph Cody](docs/comparisons/vs-cody.md) | Cody is server-hosted and org-wide; NeuralMind is local and per-project |
| [Continue / Cline](docs/comparisons/vs-continue-cline.md) | Those are agent runtimes; NeuralMind is the context/compression layer underneath |
| [GitHub Copilot](docs/comparisons/vs-github-copilot.md) | Copilot is hosted completions; NeuralMind is local context for any agent |
| [Windsurf / Codeium](docs/comparisons/vs-windsurf-codeium.md) | Vertically integrated IDE; NeuralMind is editor- and model-agnostic |
| [Claude Projects](docs/comparisons/vs-claude-projects.md) | Projects reload all files every turn; NeuralMind retrieves only what the query needs |
| [Prompt caching](docs/comparisons/vs-prompt-caching.md) | Caching amortizes a big prompt; NeuralMind makes the prompt small. Combine both |
| [LangChain / LlamaIndex for code](docs/comparisons/vs-langchain-llamaindex.md) | Frameworks you assemble; NeuralMind is the assembled default for code agents |
| [Long context windows (1M/2M)](docs/comparisons/vs-long-context.md) | Possible ≠ cheap; NeuralMind gives ~60× cost reduction on the same model |
| [Generic RAG over a codebase](docs/comparisons/vs-rag.md) | Text chunking loses structure; NeuralMind keeps the call graph |
| [Tree-sitter / ctags / grep](docs/comparisons/vs-treesitter-ctags.md) | Deterministic but syntactic; use alongside NeuralMind, not instead of |
Full comparison index: [docs/comparisons/](docs/comparisons/README.md).
---
## Quick Start (humans)
```bash
# Install (includes the CLI, semantic indexing, and the MCP server
# for Claude Code, Cursor, Cline, Continue, and any MCP client)
pip install neuralmind graphifyy
# Go to your project
cd your-project
# Generate knowledge graph (requires graphify)
graphify update .
# Build neural index
neuralmind build .
# (Optional) Install Claude Code PostToolUse compression hooks
neuralmind install-hooks .
# (Optional) Auto-rebuild on every git commit
neuralmind init-hook .
# Start using
neuralmind wakeup .
neuralmind query . "How does authentication work?"
neuralmind skeleton src/auth/handlers.py
# Or browse it: Obsidian-style graph view of your codebase + learned synapses
neuralmind serve .
```
---
## 🕸️ Graph view (`neuralmind serve`)
`neuralmind serve` opens a local web UI that makes the same index your AI agent
queries inspectable by a human. Same ChromaDB index, same `synapses.db`, just
made navigable.
- **Force-directed graph** of code nodes coloured by community.
- **Structural edges** (calls / imports) layered with the **Hebbian synapse overlay**:
edges thicken as the brain learns which nodes co-activate.
- **Backlinks, outgoing links, and synaptic neighbours** for any node you click,
Obsidian-style.
- **Semantic quick-switcher**: type a phrase, jump to the node.
- **Open in editor**: click a node, opens `$EDITOR` (or `--editor code`/`cursor`/
`vim`/`subl`/`idea`) at the right file and line.
- **Local-first**: stdlib HTTP server, vanilla-JS canvas, no CDN, per-session
access token bound to 127.0.0.1 by default.
```bash
neuralmind serve . # opens http://127.0.0.1:8765/?token=…
neuralmind serve . --editor "code -n" # override the editor
neuralmind serve . --no-auth # skip the token (trusted hosts only)
```
Why it matters: the agent-facing brain has always been a black box; you couldn't
see what NeuralMind retrieved, whether the graph was reasonable, or what the
synapse layer had actually learned. The graph view exposes all three.
**Coming next (graph-view Phase B):** a
[replay-last-query overlay](https://github.com/dfrostar/neuralmind/pull/105)
that highlights the L3 hits the agent received,
[edge tooltips + a min-weight synapse slider](https://github.com/dfrostar/neuralmind/pull/106)
answering "why are these two nodes related?", pin UX, and a
`Cmd/Ctrl-K` quick-switch. Then Phase C: a live activity feed of
synapse co-activations. Full plan in [ROADMAP.md](ROADMAP.md).
---
## 🧠 How It Works
NeuralMind wraps a graphify knowledge graph (`graphify-out/graph.json`) in a ChromaDB vector store.
When you query it, a 4-layer progressive disclosure system loads only the context relevant to
your question.
```
┌─────────────────────────────────────────────────────────────┐
│ Layer 0: Project Identity (~100 tokens)      ALWAYS LOADED  │
│   Source: CLAUDE.md / mempalace.yaml / README first line    │
├─────────────────────────────────────────────────────────────┤
│ Layer 1: Architecture Summary (~500 tokens)  ALWAYS LOADED  │
│   Source: Community distribution + GRAPH_REPORT.md          │
├─────────────────────────────────────────────────────────────┤
│ Layer 2: Relevant Modules (~300–500 tokens)  QUERY-AWARE    │
│   Source: Top 3 clusters semantically matching the query    │
├─────────────────────────────────────────────────────────────┤
│ Layer 3: Semantic Search (~300–500 tokens)   QUERY-AWARE    │
│   Source: ChromaDB similarity search over all graph nodes   │
└─────────────────────────────────────────────────────────────┘
Total: ~800–1,100 tokens vs 50,000+ for the full codebase
```
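The progressive-disclosure idea can be sketched in a few lines: L0 and L1 are always included, while L2 and L3 are added only when there is a query. The `Layer` class and `assemble_context` function are hypothetical, and the layer texts are static placeholders; in the real system L2 and L3 would be retrieved per query.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    always_loaded: bool
    text: str


def assemble_context(layers, query=None):
    """Join always-loaded layers; add query-aware layers only when a query is given."""
    selected = [layer for layer in layers if layer.always_loaded or query is not None]
    return "\n\n".join(f"## {layer.name}\n{layer.text}" for layer in selected)


# Placeholder content standing in for real index output.
layers = [
    Layer("L0 Identity", True, "Project: myapp"),
    Layer("L1 Summary", True, "Top clusters: auth, tasks"),
    Layer("L2 On-demand", False, "Cluster 5: authenticate_user, verify_token"),
    Layer("L3 Search", False, "AuthMiddleware (score: 0.91)"),
]

wakeup_context = assemble_context(layers)  # L0 + L1 only
query_context = assemble_context(layers, "How does auth work?")  # L0 through L3
```

This mirrors the split between `wakeup` (two layers) and `query` (all four) described above.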
**Prerequisites:** NeuralMind requires `graphify update .` to have been run first. This produces:
- `graphify-out/graph.json`: the knowledge graph (required)
- `graphify-out/GRAPH_REPORT.md`: architecture summary (enriches L1, optional)
- `graphify-out/neuralmind_db/`: ChromaDB vector store (created by `neuralmind build`)
---
## 🖥️ Complete CLI Reference
### `neuralmind build`
Build or incrementally update the neural index from `graphify-out/graph.json`.
```bash
neuralmind build [project_path] [--force]
```
| Argument/Option | Default | Description |
|----------------|---------|-------------|
| `project_path` | `.` | Project root containing `graphify-out/graph.json` |
| `--force`, `-f` | off | Re-embed every node even if unchanged |
```bash
neuralmind build .
neuralmind build /path/to/project --force
```
Output: nodes processed, added, updated, skipped, communities indexed, build duration.
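One common way to get the added / updated / skipped split is to compare a content fingerprint per node against the previous build. The sketch below shows that technique; it is an assumption about how such a build could work, not NeuralMind's implementation, and the node fields (`id`, `label`, `source_file`) are hypothetical.

```python
import hashlib


def node_fingerprint(node):
    """Stable hash of the fields that would affect a node's embedding."""
    # Field names are hypothetical; the graph.json schema is not shown here.
    payload = "\x1f".join(str(node.get(k, "")) for k in ("id", "label", "source_file"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def plan_incremental_build(nodes, previous, force=False):
    """Classify nodes as added, updated, or skipped against stored fingerprints."""
    stats = {"added": [], "updated": [], "skipped": []}
    for node in nodes:
        fingerprint = node_fingerprint(node)
        old = previous.get(node["id"])
        if old is None:
            stats["added"].append(node["id"])
        elif force or old != fingerprint:
            stats["updated"].append(node["id"])
        else:
            stats["skipped"].append(node["id"])
    return stats
```

With `force=True` nothing is skipped, which matches the documented behaviour of `--force` re-embedding every node.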
---
### `neuralmind wakeup`
Get minimal project context for starting a session (~400–600 tokens, L0 + L1 only).
```bash
neuralmind wakeup <project_path> [--json]
```
```bash
neuralmind wakeup .
neuralmind wakeup . --json
neuralmind wakeup . > CONTEXT.md
```
---
### `neuralmind query`
Query the codebase with natural language (~800–1,100 tokens, all 4 layers).
```bash
neuralmind query <project_path> "<question>" [--json]
```
```bash
neuralmind query . "How does authentication work?"
neuralmind query . "What are the main API endpoints?" --json
neuralmind query /path/to/project "Explain the database schema"
```
On first run from a TTY, you will be prompted once to enable local query memory logging.
Disable with `NEURALMIND_MEMORY=0`.
---
### `neuralmind search`
Direct semantic search: returns code entities ranked by similarity to the query.
```bash
neuralmind search <project_path> "<query>" [--n N] [--json]
```
| Option | Default | Description |
|--------|---------|-------------|
| `--n` | 10 | Maximum number of results |
| `--json`, `-j` | off | Machine-readable JSON output |
```bash
neuralmind search . "authentication"
neuralmind search . "database connection" --n 5
neuralmind search . "PaymentController" --json
```
---
### `neuralmind skeleton`
Print a compact graph-backed view of a file without loading full source (~88% cheaper than Read).
```bash
neuralmind skeleton <file_path> [--project-path .] [--json]
```
| Option | Default | Description |
|--------|---------|-------------|
| `--project-path` | `.` | Project root (where the index lives) |
| `--json`, `-j` | off | Machine-readable JSON output |
```bash
neuralmind skeleton src/auth/handlers.py
neuralmind skeleton src/auth/handlers.py --project-path /my/project
neuralmind skeleton src/auth/handlers.py --json
```
Output: function list with line numbers and rationales, internal call graph, cross-file edges
(imports, data sharing), and a pointer to the full source for when you need it.
---
### `neuralmind benchmark`
Measure token reduction using a set of sample queries.
```bash
neuralmind benchmark <project_path> [--json]
```
```bash
neuralmind benchmark .
neuralmind benchmark . --json
```
---
### `neuralmind stats`
Show index status and statistics.
```bash
neuralmind stats <project_path> [--json]
```
```bash
neuralmind stats .
neuralmind stats . --json # {"built": true, "total_nodes": 241, "communities": 23, ...}
```
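A script can gate on the JSON output before calling other tools. This is a minimal sketch that relies only on the `built` and `total_nodes` keys shown in the comment above; the `index_is_ready` helper is a hypothetical name, and a real payload may carry more keys.

```python
import json


def index_is_ready(stats_json):
    """Return True when the stats payload reports a built, non-empty index."""
    stats = json.loads(stats_json)
    return bool(stats.get("built")) and stats.get("total_nodes", 0) > 0


# Sample payload shaped like the comment above; a real run may include more keys.
sample = '{"built": true, "total_nodes": 241, "communities": 23}'
print(index_is_ready(sample))  # True
```

In practice the string would come from running `neuralmind stats . --json` and capturing its standard output.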
---
### `neuralmind learn`
Analyze logged query history to discover module cooccurrence patterns. Improves future query
relevance automatically.
```bash
neuralmind learn <project_path>
```
```bash
neuralmind learn .
```
Reads `.neuralmind/memory/query_events.jsonl`, writes `.neuralmind/learned_patterns.json`.
The next `neuralmind query` applies boosted reranking automatically.
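As an illustration of cooccurrence-based reranking, the sketch below counts which modules show up together in logged queries and uses those counts to boost related candidates. The event field name `communities`, the boost weight, and both function names are assumptions; the actual log schema and scoring are not documented here.

```python
import json
from collections import Counter
from itertools import combinations


def learn_cooccurrence(jsonl_text, field="communities"):
    """Count how often pairs of modules appear in the same logged query."""
    pairs = Counter()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        modules = sorted(set(json.loads(line).get(field, [])))
        pairs.update(combinations(modules, 2))
    return pairs


def boost(scores, seed, pairs, weight=0.1):
    """Raise the score of modules that have co-occurred with `seed`."""
    boosted = dict(scores)
    for module in boosted:
        key = tuple(sorted((seed, module)))
        boosted[module] += weight * pairs.get(key, 0)
    return boosted


# Hypothetical log lines, one JSON event per query.
log = '{"communities": [5, 12]}\n{"communities": [5, 12, 3]}\n{"communities": [3]}\n'
pairs = learn_cooccurrence(log)
print(boost({12: 1.0, 3: 1.0}, seed=5, pairs=pairs))
```

Here community 12 ends up ranked above community 3 because it appeared alongside community 5 in more queries.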
---
### `neuralmind install-hooks`
Install or remove Claude Code PostToolUse compression hooks.
```bash
neuralmind install-hooks [project_path] [--global] [--uninstall]
```
| Option | Description |
|--------|-------------|
| `--global` | Install in `~/.claude/settings.json` (affects all projects) |
| `--uninstall` | Remove NeuralMind hooks only; preserves other tools' hooks |
```bash
neuralmind install-hooks . # project-scoped
neuralmind install-hooks --global # all projects
neuralmind install-hooks --uninstall # remove project hooks
neuralmind install-hooks --uninstall --global # remove global hooks
```
---
### `neuralmind init-hook`
Install a Git `post-commit` hook that auto-rebuilds the index after every commit.
Safe and idempotent; coexists with other tools' hook contributions.
```bash
neuralmind init-hook [project_path]
```
```bash
neuralmind init-hook .
neuralmind init-hook /path/to/project
```
---
## MCP Server
NeuralMind ships a Model Context Protocol server (`neuralmind-mcp`) that exposes all tools
to MCP-compatible agents.
### Starting the server
```bash
neuralmind-mcp
# or
python -m neuralmind.mcp_server
```
### Claude Desktop configuration
```json
{
  "mcpServers": {
    "neuralmind": {
      "command": "neuralmind-mcp",
      "args": ["/absolute/path/to/project"]
    }
  }
}
```
Config file locations:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
- **Linux**: `~/.config/Claude/claude_desktop_config.json`
### Claude Code / Cursor project-scoped auto-registration
Drop a `.mcp.json` at your project root:
```json
{
  "mcpServers": {
    "neuralmind": {
      "command": "neuralmind-mcp",
      "args": ["."]
    }
  }
}
```
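If you set up many projects, the same file can be written from a script. This is a sketch under the assumption that merging into an existing `.mcp.json` is what you want; `write_mcp_config` is a hypothetical helper, not part of NeuralMind.

```python
import json
from pathlib import Path


def write_mcp_config(project_root, server_args=(".",)):
    """Add the neuralmind-mcp server to a project's .mcp.json, keeping other entries."""
    path = Path(project_root) / ".mcp.json"
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})["neuralmind"] = {
        "command": "neuralmind-mcp",
        "args": list(server_args),
    }
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path
```

Calling `write_mcp_config("/path/to/your-project")` produces the same structure as the JSON above and leaves any other registered servers untouched.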
### Hermes-Agent (Nous Research)
[Hermes-Agent](https://github.com/nousresearch/hermes-agent) is a self-improving
agent framework that supports MCP servers. NeuralMind has been verified
end-to-end against **Hermes-Agent v0.12.0 (build 2026.4.30)**: the agent
discovered all 11 NeuralMind tools (4-second handshake) when registered as
shown below.
**Prerequisite:** install NeuralMind. The MCP server (`neuralmind-mcp`)
ships with the default install:
```bash
pip install neuralmind
```
> Older `pip install "neuralmind[mcp]"` commands still work; the `mcp`
> extra is preserved as a no-op for backwards compatibility.
**Two ways to register the server.** Both end up in `~/.hermes/config.yaml`:
*Option A: CLI (recommended for first-time setup):*
```bash
hermes mcp add
```
*Option B: edit the config directly* (`~/.hermes/config.yaml`, add under
the `mcp_servers` top-level key):
```yaml
mcp_servers:
  neuralmind:
    command: "neuralmind-mcp"
    args: ["/absolute/path/to/project"]
```
**Verify** the server is registered and reachable:
```bash
hermes mcp list # neuralmind should appear, status ✓
hermes mcp test neuralmind # ✓ Connected, ✓ Tools discovered: 11
```
If you haven't installed Hermes-Agent yet, the upstream installer is:
```bash
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc
```
After editing the YAML directly, run `/reload-mcp` from the running `hermes`
CLI to pick up the change without restarting (the `hermes mcp add` flow does
this automatically). Both stdio (shown above) and HTTP transports are
supported; see the upstream
[MCP integration docs](https://hermes-agent.nousresearch.com/docs/user-guide/features/mcp)
for the full schema (`command`, `args`, `env`, `url`, `headers`, `enabled`,
per-server `tools` filtering, `timeout`, `connect_timeout`).
**v0.6.0 graph view works identically here.** Run `neuralmind serve` in
the same project and any tool call from Hermes-Agent will pulse the
corresponding nodes on the canvas. The synapse store is shared with
Claude Code, Cursor, OpenClaw, and any other agent pointed at this
project; see [docs/use-cases/multi-agent.md](docs/use-cases/multi-agent.md).
### OpenClaw
[OpenClaw](https://github.com/openclaw/openclaw) is a personal AI assistant
that registers MCP servers via its CLI. Verified against **OpenClaw 2026.5.2**:
`mcp set` / `mcp list` / `mcp show` round-trip the documented JSON schema
into `~/.openclaw/openclaw.json` exactly as expected.
**Prerequisite:** install NeuralMind (the MCP server ships with the
default install):
```bash
pip install neuralmind
```
**Register** NeuralMind:
```bash
openclaw mcp set neuralmind '{"command":"neuralmind-mcp","args":["/absolute/path/to/project"]}'
```
**Verify** it landed:
```bash
openclaw mcp list # neuralmind should appear
openclaw mcp show neuralmind # echoes the JSON you stored
```
Remove with `openclaw mcp unset neuralmind`. Definitions are stored under
the `mcp.servers` key in `~/.openclaw/openclaw.json`.
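If you register several projects, hand-quoting the JSON argument gets error-prone. The sketch below builds the same `openclaw mcp set` command line shown above from Python; the helper name `openclaw_set_command` is hypothetical, and only the command shape already documented in this section is assumed.

```python
import json
import shlex


def openclaw_set_command(name: str, project_path: str) -> str:
    """Build the `openclaw mcp set` command line for a stdio MCP server."""
    # Same shape as the JSON shown above: a command plus its argument list.
    definition = {"command": "neuralmind-mcp", "args": [project_path]}
    payload = json.dumps(definition, separators=(",", ":"))
    # shlex.join quotes the JSON so the shell passes it as a single argument.
    return shlex.join(["openclaw", "mcp", "set", name, payload])


print(openclaw_set_command("neuralmind", "/absolute/path/to/project"))
```

Paste the printed line into a shell, or pass the unquoted argument list to `subprocess.run` if you prefer to skip the shell entirely.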
**v0.6.0 graph view works identically here.** Run `neuralmind serve` in
the same project and any tool call from OpenClaw will pulse the
corresponding nodes on the canvas. OpenClaw and Claude Code talking
to the same project reinforce the same synapse store; see
[docs/use-cases/multi-agent.md](docs/use-cases/multi-agent.md).
If you haven't installed OpenClaw yet:
```bash
npm install -g openclaw@latest # or: pnpm add -g openclaw@latest
openclaw onboard --install-daemon
```
OpenClaw's MCP support covers stdio (shown above), SSE, HTTP, and
`streamable-http` transports; see the upstream
[MCP CLI reference](https://docs.openclaw.ai/cli/mcp) for details on
`url`/`transport` config and the inverse direction (`openclaw mcp serve`,
which exposes OpenClaw's own channels as an MCP server to other clients).
### Skill (OpenClaw, Agent Zero, Hermes, any SKILL.md host)
The MCP server gives an agent the **actions**. The skill at
[`skills/neuralmind/SKILL.md`](skills/neuralmind/SKILL.md) gives it the
**playbook**: when to call `neuralmind_query` vs. `neuralmind_skeleton`
vs. `neuralmind_search`, what the outputs look like, and which env-var
escape hatches exist. It is a portable Anthropic-style SKILL.md
(frontmatter + markdown body) so the same file works in any host that
implements the spec.
**OpenClaw.** Drop the directory into your ClawHub local skills path, or
ship it as part of an OpenClaw plugin by listing `skills/` in
`openclaw.plugin.json`:
```bash
cp -r skills/neuralmind ~/.openclaw/skills/
openclaw skills list # neuralmind should appear
```
The skill is description-matched on triggers like "how does X work" or
"find function Y", so you don't need to load it explicitly.
**Agent Zero.** Drop the same directory into the Agent Zero skills
folder:
```bash
cp -r skills/neuralmind /path/to/agent-zero/skills/
```
Agent Zero auto-discovers SKILL.md files by description and tag, then
uses its `code_execution_tool` to call the MCP tools the skill names in
its `allowed_tools` frontmatter.
**Hermes-Agent.** Hermes has a first-class
[skills system](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills)
that reads the same SKILL.md spec. Drop the directory into the
category-organised tree:
```bash
mkdir -p ~/.hermes/skills/code-intelligence
cp -r skills/neuralmind ~/.hermes/skills/code-intelligence/
```
Hermes loads skills on demand based on the frontmatter description, so
no further wiring is needed. You can also publish the directory as a
[Hermes tap](https://hermes-agent.nousresearch.com/docs/user-guide/features/skills)
(a GitHub repo of skill directories) for one-command install across
machines. This layers on top of the MCP integration documented in the
[Hermes-Agent section above](#hermes-agent-nous-research): the MCP
server still does the work; the skill teaches Hermes when to call it.
**Claude Code, Cursor.** These already have richer integrations
(lifecycle hooks for Claude Code, MCP wiring for Cursor), so the
skill is optional. It still works as a portable "agent operating
manual" if you want a single file that travels with the project.
The skill duplicates **none** of NeuralMind's logic; it points the
agent at MCP tools that already exist. Edit it like documentation.
### Troubleshooting
**"Connection closed" / "Connection failed" right after register.** Almost
always means an old NeuralMind install (≤ 0.4.x) where the MCP server was
gated behind the `[mcp]` extra. From 0.5.0 onward the MCP SDK is bundled.
Fix:
```bash
pip install --upgrade neuralmind
```
Then re-run the host's verify step (`hermes mcp test neuralmind` or
`openclaw mcp list`).
**`neuralmind-mcp: command not found`.** The package installed but the
console script wasn't put on PATH, usually because pip installed into a
user site-packages dir that isn't on PATH. Add `~/.local/bin` to PATH or
reinstall in a venv where the entry point is on PATH.
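A quick way to confirm the diagnosis is to ask Python where (or whether) the script resolves. This is a minimal sketch using only the standard library; the helper name `find_console_script` is hypothetical.

```python
import shutil
import sysconfig
from typing import Optional


def find_console_script(name: str = "neuralmind-mcp") -> Optional[str]:
    """Return the full path of a console script if it is on PATH, else None."""
    return shutil.which(name)


location = find_console_script()
if location:
    print(f"found: {location}")
else:
    # Scripts directory for this interpreter's default install scheme;
    # a `pip install --user` may use a different directory such as ~/.local/bin.
    print("not on PATH; default scripts dir:", sysconfig.get_path("scripts"))
```

If the script is missing, compare the printed directory with your `PATH` before reinstalling.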
**The host shows neuralmind in `mcp list` but no tools when you query.**
Run `neuralmind build /path/to/project` first: the index has to exist
before the MCP tools can answer queries. The hooks (`SessionStart`,
`UserPromptSubmit`, `PreCompact` from `neuralmind install-hooks`) need a
built index too.
### MCP tool schemas
#### `neuralmind_wakeup`
```json
{
  "project_path": "string (required) - absolute path to project root"
}
```
Returns:
```json
{
"context": "string",
"tokens": 412,
"reduction_ratio": 121.4,
"layers": ["L0", "L1"]
}
```
#### `neuralmind_query`
```json
{
  "project_path": "string (required)",
  "question": "string (required) - natural language question"
}
```
Returns:
```json
{
"context": "string",
"tokens": 847,
"reduction_ratio": 59.0,
"layers": ["L0", "L1", "L2", "L3"],
"communities_loaded": [5, 12],
"search_hits": 8
}
```
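MCP clients normally build the wire request for you, but it can help to see how the input schema maps onto it. The sketch below serializes a `tools/call` request (the method MCP uses to invoke a tool) for `neuralmind_query`; the helper name and the sample question are illustrative assumptions.

```python
import json


def tools_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # `arguments` carries the fields from the tool's input schema.
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


request = tools_call_request(
    1,
    "neuralmind_query",
    {
        "project_path": "/absolute/path/to/project",
        "question": "How does authentication work?",
    },
)
print(request)
```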
#### `neuralmind_search`
```json
{
"project_path": "string (required)",
"query": "string (required)",
"n": 10
}
```
Returns array of:
```json
{ "id": "node_id", "label": "authenticate_user", "file_type": "code",
"source_file": "auth/handlers.py", "score": 0.92 }
```
#### `neuralmind_skeleton`
```json
{
  "project_path": "string (required)",
  "file_path": "string (required) - absolute or project-relative path"
}
```
Returns:
```json
{ "file": "src/auth/handlers.py", "skeleton": "# src/auth/handlers.py ...", "chars": 620, "indexed": true }
```
#### `neuralmind_recursive_query`
Recursively decompose and explore complex questions. Breaks multi-part questions into focused sub-queries, executes them, identifies gaps, and synthesizes results. Searches both code and document indexes.
```json
{
  "project_path": "string (required)",
  "question": "string (required) - compound question to decompose",
  "max_depth": 3,
  "include_docs": true
}
```
Returns:
```json
{
  "question": "string",
  "answer": "string - synthesized answer",
  "sub_queries": [{"query": "string", "results": [...], "source": "string"}],
  "depth_reached": 2,
  "gaps_identified": ["string"],
  "total_queries": 6,
  "token_estimate": 4156,
  "sources": ["file1.ts", "file2.ts", "doc.md"]
}
```
**When to use:** Multi-faceted questions spanning multiple files or concepts, like "How does auth work and what security measures are in place?" or "What is the deployment architecture and how do Cloudflare and Render interact?"
**Benchmark:** 6x more tokens than standard query, but decomposes compound questions and achieves full term coverage on 3/5 test questions. See `graphify-out/RECURSIVE_QUERY_BENCHMARK.md` after running `benchmark_report.py`.
#### `neuralmind_query_docs`
Search reference documents (legal, clinical, strategic PDFs/DOCX converted to markdown). NOT for code: use `neuralmind_query` for code questions.
```json
{
  "project_path": "string (required)",
  "question": "string (required) - question about reference documents",
  "n": 5
}
```
Returns:
```json
{
  "results": [
    {
      "content": "string - relevant text chunk",
      "source_file": "docs/reference/filename.md",
      "file_name": "filename.md",
      "chunk": "3/12",
      "relevance": 0.719
    }
  ],
  "total_doc_chunks": 241,
  "query": "string"
}
```
**Setup:** Documents must be converted to markdown and indexed first:
```bash
# Convert documents (PDF, DOCX, TXT, HTML → .md)
pip install pypdf mammoth
python doc_indexer.py build /path/to/project
# Or use the doc-ingest skill for batch conversion
```
**Auto-rebuild:** A git post-commit hook can rebuild the doc index when files in `docs/reference/` change.
**Search reference docs via CLI:**
```bash
python doc_indexer.py query /path/to/project "HIPAA compliance"
python doc_indexer.py stats /path/to/project
```
#### `neuralmind_build`
```json
{
"project_path": "string (required)",
"force": false
}
```
Returns:
```json
{
"success": true,
"nodes_total": 241,
"nodes_added": 5,
"nodes_updated": 2,
"nodes_skipped": 234,
"communities": 23,
"duration_seconds": 3.1
}
```
#### `neuralmind_stats`
```json
{ "project_path": "string (required)" }
```
Returns:
```json
{ "built": true, "total_nodes": 241, "communities": 23, "db_path": "..." }
```
#### `neuralmind_benchmark`
```json
{ "project_path": "string (required)" }
```
Returns:
```json
{
"project": "myapp",
"wakeup_tokens": 341,
"avg_query_tokens": 739,
"avg_reduction_ratio": 65.6,
"results": [...]
}
```
---
## 🪝 PostToolUse Compression
When `neuralmind install-hooks` has been run, Claude Code automatically applies these transforms
to every tool output before the agent sees it.
### Read → skeleton
Raw source files are replaced with the graph skeleton (functions + rationales + call graph +
cross-file edges). This is ~88% smaller and contains the structural information agents need most.
To get the full source anyway:
```bash
export NEURALMIND_BYPASS=1   # export so hook subprocesses inherit it
```
### Bash → filtered output
Long bash output is reduced to:
- All `error`/`ERROR`/`FAIL`/`traceback`/`warning` lines
- All summary lines (`=====`, `passed`, `failed`, `Finished`, `Done in`, etc.)
- Last 3 lines verbatim
- Header: `[neuralmind: bash compressed, exit=N]`
All errors and failures are **always** preserved. Routine pip/npm/build chatter is dropped.
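The filtering rules above can be sketched in a few lines. This is not NeuralMind's actual hook code: the function name, the regexes, and the case-insensitive matching are assumptions made for illustration, built only from the rules listed in this section.

```python
import re

# Patterns taken from the list above; the real hook may match more.
ERROR_RE = re.compile(r"error|FAIL|traceback|warning", re.IGNORECASE)
SUMMARY_RE = re.compile(r"=====|passed|failed|Finished|Done in")


def compress_bash(output: str, exit_code: int, tail: int = 3,
                  max_chars: int = 3000) -> str:
    """Keep error, summary, and trailing lines; drop routine chatter."""
    if len(output) < max_chars:
        return output  # short output passes through untouched
    lines = output.splitlines()
    tail_start = max(len(lines) - tail, 0)
    kept = [
        line for i, line in enumerate(lines)
        if i >= tail_start or ERROR_RE.search(line) or SUMMARY_RE.search(line)
    ]
    header = f"[neuralmind: bash compressed, exit={exit_code}]"
    return "\n".join([header, *kept])
```

Lines keep their original order, so an error still appears before the summary that reports it.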
### Grep → capped results
Search results are capped at 25 matches with a `[N more hidden]` note appended.
Prevents context flooding from repository-wide searches.
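As an illustration of the capping rule (a sketch only; `cap_matches` is a hypothetical name, not a NeuralMind function):

```python
def cap_matches(matches: list[str], limit: int = 25) -> list[str]:
    """Return at most `limit` matches plus a note counting the hidden rest."""
    if len(matches) <= limit:
        return list(matches)
    hidden = len(matches) - limit
    return matches[:limit] + [f"[{hidden} more hidden]"]


results = cap_matches([f"src/file_{i}.py:1: TODO" for i in range(40)])
print(results[-1])
```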
### Tunable thresholds
| Variable | Default | Description |
|----------|---------|-------------|
| `NEURALMIND_BYPASS` | unset | Set to `1` to disable all compression |
| `NEURALMIND_BASH_TAIL` | `3` | Lines to keep verbatim from end of bash output |
| `NEURALMIND_BASH_MAX_CHARS` | `3000` | Below this size, bash output is not compressed |
| `NEURALMIND_SEARCH_MAX` | `25` | Max grep/search matches before capping |
| `NEURALMIND_OFFLOAD_THRESHOLD` | `15000` | Chars above which content is written to a temp file |
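If you wrap the hooks in your own tooling, the table translates into environment lookups with defaults. The sketch below assumes nothing beyond the variable names and defaults listed above; `CompressionSettings` and `load_settings` are hypothetical names, not part of the NeuralMind API.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CompressionSettings:
    bypass: bool
    bash_tail: int
    bash_max_chars: int
    search_max: int
    offload_threshold: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> CompressionSettings:
    """Read the thresholds from the environment, falling back to the defaults."""
    env = os.environ if env is None else env
    return CompressionSettings(
        bypass=env.get("NEURALMIND_BYPASS") == "1",
        bash_tail=int(env.get("NEURALMIND_BASH_TAIL", "3")),
        bash_max_chars=int(env.get("NEURALMIND_BASH_MAX_CHARS", "3000")),
        search_max=int(env.get("NEURALMIND_SEARCH_MAX", "25")),
        offload_threshold=int(env.get("NEURALMIND_OFFLOAD_THRESHOLD", "15000")),
    )


print(load_settings({"NEURALMIND_SEARCH_MAX": "50"}))
```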
---
## 🧠 Continual Learning
NeuralMind optionally learns from your query patterns to improve future relevance.
### How it works
1. **Collect**: Each `neuralmind query` logs which modules appeared in the result to
   `.neuralmind/memory/query_events.jsonl` (opt-in, local only, zero overhead)
2. **Learn**: `neuralmind learn .` analyzes cooccurrence (which clusters appear together across queries)
3. **Improve**: The next `neuralmind query` applies a `+0.3` reranking boost to modules that
   co-occur with the current query's top matches
4. **Repeat**: The system gets smarter as you use it
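The loop above can be sketched as pair counting followed by a boosted re-sort. This is an illustration under stated assumptions, not the shipped implementation: the event shape (a list of module names per query), the function names, and the `min_count` threshold are all hypothetical; only the `+0.3` boost comes from the steps above.

```python
from collections import Counter
from itertools import combinations

BOOST = 0.3  # the reranking boost quoted above


def learn_cooccurrence(events):
    """Count how often each pair of modules appears in the same query result."""
    pairs = Counter()
    for modules in events:
        for a, b in combinations(sorted(set(modules)), 2):
            pairs[(a, b)] += 1
    return pairs


def rerank(scores, pairs, top_n=1, min_count=2):
    """Add BOOST to modules that co-occurred with the current top matches."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    top = set(ranked[:top_n])
    boosted = dict(scores)
    for module in scores:
        if module in top:
            continue
        if any(pairs[tuple(sorted((module, t)))] >= min_count for t in top):
            boosted[module] += BOOST
    return sorted(boosted, key=boosted.get, reverse=True)


events = [["auth", "session"], ["auth", "session", "db"], ["billing", "db"]]
pairs = learn_cooccurrence(events)
print(rerank({"auth": 0.9, "billing": 0.6, "session": 0.5}, pairs))
```

Here `session` starts below `billing` but moves ahead of it, because it appeared alongside `auth` in two earlier queries.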
### Opt-in / consent
On first TTY query:
```
NeuralMind can keep local query memory (project + global JSONL) to improve future retrieval.
Enable? [y/N]:
```
Consent saved to `~/.neuralmind/memory_consent.json`. Disable at any time:
```bash
export NEURALMIND_MEMORY=0 # disable query logging
export NEURALMIND_LEARNING=0 # disable pattern application
```
### File locations
```
~/.neuralmind/
├── memory_consent.json        # consent flag
└── memory/
    └── query_events.jsonl     # global event log
<project>/.neuralmind/
├── memory/
│   └── query_events.jsonl     # project-specific events
└── learned_patterns.json      # created by: neuralmind learn .
```
### Privacy
100% local: nothing is sent to any server. Delete `~/.neuralmind/` and `<project>/.neuralmind/`
at any time to remove all learning data.
---
## ⏰ Keeping the Index Fresh
### Automatic: Git post-commit hook (recommended)
```bash
neuralmind init-hook .
```
After every commit, the hook runs:
```bash
neuralmind build . 2>/dev/null && echo "[neuralmind] OK"
```
### Manual
```bash
graphify update .
neuralmind build .
```
### Scheduled: cron
```bash
0 6 * * * cd /path/to/project && graphify update . && neuralmind build .
```
### CI/CD: GitHub Actions
```yaml
- run: pip install neuralmind graphifyy
- run: graphify update . && neuralmind build .
- run: neuralmind wakeup . > AI_CONTEXT.md
```
---
## 🔌 Compatibility
| Component | Works With | Notes |
|-----------|-----------|-------|
| **CLI** | Any environment | Pure Python, no daemon required |
| **MCP Server** | Claude Code, Claude Desktop, Cursor, Cline, Continue, any MCP client | Bundled with `pip install neuralmind` |
| **SKILL.md** | OpenClaw (ClawHub), Agent Zero, Hermes-Agent, any SKILL.md host | Portable agent playbook at [`skills/neuralmind/SKILL.md`](skills/neuralmind/SKILL.md); pairs with the MCP server |
| **PostToolUse Hooks** | Claude Code only | Uses Claude Code's `PostToolUse` hook system |
| **Git hook** | Any git workflow | Appends to existing `post-commit`, idempotent |
| **Copy-paste** | ChatGPT, Gemini, any LLM | `neuralmind wakeup . \| pbcopy` |
### Quick-start by tool
**Claude Code: full two-phase optimization**
```bash
pip install neuralmind graphifyy
cd your-project
graphify update .
neuralmind build .
neuralmind install-hooks . # PostToolUse compression
neuralmind init-hook . # auto-rebuild on commit (optional)
```
Then use MCP tools in sessions: `neuralmind_wakeup`, `neuralmind_query`, `neuralmind_skeleton`.
**Cursor / Cline / Continue: MCP server**
```bash
pip install neuralmind graphifyy
graphify update .
neuralmind build .
```
Add to your MCP config:
```json
{ "mcpServers": { "neuralmind": { "command": "neuralmind-mcp" } } }
```
**ChatGPT / Gemini / any LLM: CLI + copy-paste**
```bash
neuralmind wakeup . | pbcopy    # macOS: paste into chat
neuralmind query . "question" # get context for a specific question
```
---
## ✨ What's New in v0.5.4: Graph view
`neuralmind serve` ships in v0.5.4; see the
[Graph view section above](#-graph-view-neuralmind-serve). The
**next patch release (v0.5.5)** lands graph-view Phase B: the
replay-last-query overlay
([#105](https://github.com/dfrostar/neuralmind/pull/105)), edge
tooltips + min-weight synapse slider
([#106](https://github.com/dfrostar/neuralmind/pull/106)), pin UX,
and a `Cmd/Ctrl-K` quick-switch. Phase C after that: a live activity
feed of synapse co-activations. Full plan in
[ROADMAP.md](ROADMAP.md).
## ✨ What's New in v0.4.0: Brain-like Synapse Layer
NeuralMind now runs as a **second brain alongside the LLM**: a
persistent associative memory that learns continuously from how the
agent and the codebase actually interact. See the [release notes](RELEASE_NOTES_v0.4.0.md) for the full story.
| Feature | Details |
|---------|---------|
| **Synapse store** | SQLite-backed weighted graph; Hebbian reinforce, decay, long-term potentiation |
| **Spreading activation** | `mind.synaptic_neighbors(query)`: usage-based recall complementing vector search |
| **`neuralmind watch` daemon** | File edits become co-activation signals; brain learns even when no query runs |
| **Three new Claude Code hooks** | SessionStart (decay+export), UserPromptSubmit (recall injection), PreCompact (hub normalization) |
| **Auto-memory export** | Writes `SYNAPSE_MEMORY.md` to Claude Code's auto-memory dir so associations surface natively |
| **Four new MCP tools** | `synaptic_neighbors`, `synapse_stats`, `synapse_decay`, `export_synapse_memory` |
| **3× fewer embedder calls per query** | Selector caches one search per query and slices for L2/L3/synapses |
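For readers new to the terms, here is a toy, in-memory sketch of what "Hebbian reinforce" and "decay" can mean for a weighted graph. The class name, the learning and decay rates, and the update rule are illustrative assumptions; the actual synapse store is SQLite-backed and may use different formulas.

```python
from collections import defaultdict
from itertools import combinations


class ToySynapses:
    """Hypothetical in-memory stand-in for a weighted co-activation graph."""

    def __init__(self, learning_rate: float = 0.1, decay_rate: float = 0.05):
        self.learning_rate = learning_rate
        self.decay_rate = decay_rate
        self.weights = defaultdict(float)  # (node_a, node_b) -> weight in [0, 1)

    def reinforce(self, nodes):
        """Strengthen every pair of nodes that fired together."""
        for edge in combinations(sorted(set(nodes)), 2):
            w = self.weights[edge]
            # Move the weight a fraction of the way toward 1.0.
            self.weights[edge] = w + self.learning_rate * (1.0 - w)

    def decay(self):
        """Weaken all edges so unused associations fade over time."""
        for edge in self.weights:
            self.weights[edge] *= 1.0 - self.decay_rate

    def neighbors(self, node, n=5):
        """Return up to n nodes most strongly linked to `node`."""
        linked = [
            (b if a == node else a, w)
            for (a, b), w in self.weights.items()
            if node in (a, b)
        ]
        return [name for name, _ in sorted(linked, key=lambda x: -x[1])[:n]]


brain = ToySynapses()
brain.reinforce(["auth.py", "session.py"])
brain.reinforce(["auth.py", "session.py", "db.py"])
print(brain.neighbors("auth.py"))
```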
### Earlier (v0.3.x)
| Feature | Version | Details |
|---------|---------|---------|
| **Memory Collection** | v0.3.0 | Local JSONL storage for query events |
| **Opt-in Consent** | v0.3.0 | One-time TTY prompt, env var overrides |
| **EmbeddingBackend abstraction** | v0.3.1 | Pluggable vector backend (Pinecone/Weaviate ready) |
| **Pattern Learning** | v0.3.2 | `neuralmind learn .` analyzes cooccurrence |
| **Smart Reranking** | v0.3.2 | L3 results boosted by learned patterns |
| **Accurate Build Stats** | v0.3.3 | Correctly distinguishes added vs updated nodes |
| **Documentation polish** | v0.3.4 | CLI flags sync, Setup Guide, agent guidance in README |
---
## 📊 Benchmarks
NeuralMind benchmarks itself on every pull request. A hermetic fixture (`tests/fixtures/sample_project/`) plus a committed query set (`tests/fixtures/benchmark_queries.json`) runs through the full retrieval pipeline, and CI fails if aggregate reduction drops below a conservative floor (currently **4×** on the small fixture; the fixture is intentionally tiny, and real repos consistently hit 40–70× as shown below).

### What CI measures on every PR
- **Phase 1: Reduction.** Naive baseline (every `.py` file in the fixture concatenated) vs `NeuralMind.query()` output, per query. All tokens counted with `tiktoken`.
- **Phase 2: Learning uplift.** Same queries run cold, then after seeding memory and running `neuralmind learn`. Reports the delta in reduction ratio and top-k retrieval hit rate. On a 500-line fixture the numerical uplift is modest by design; the test proves the mechanism persists, not that it's magic.
- **Per-model breakdown.** GPT-4o and GPT-4/3.5 counts are *measured* via real tiktoken encodings. Claude uses the Anthropic SDK tokenizer when available, else a clearly-labeled *estimate* derived from published vocab ratios. Llama is always estimated. **No fabricated numbers anywhere.**
- **Memory persistence.** `tests/test_memory_persistence.py` asserts events are logged, `neuralmind learn` produces a patterns file, and subsequent queries load it without error.
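The Phase 1 measurement boils down to one division. The sketch below mirrors it under loud assumptions: `count_tokens` is a crude whitespace counter standing in for `tiktoken`, and all three function names are hypothetical rather than taken from `tests/benchmark/`.

```python
from pathlib import Path


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def naive_baseline(project_dir: str) -> str:
    """Concatenate every .py file under project_dir, as the naive baseline does."""
    files = sorted(Path(project_dir).rglob("*.py"))
    return "\n".join(f.read_text(encoding="utf-8") for f in files)


def reduction_ratio(baseline: str, context: str) -> float:
    """Baseline tokens divided by retrieved-context tokens."""
    context_tokens = count_tokens(context)
    if context_tokens == 0:
        raise ValueError("context is empty")
    return count_tokens(baseline) / context_tokens
```

To try it, pass `naive_baseline("tests/fixtures/sample_project")` and the text printed by a `neuralmind query` run. Absolute numbers will differ from the CI report because the tokenizer here is only an approximation.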
### Community benchmarks
Real-world numbers submitted by users. Your code never leaves your machine: you submit a PR (or an issue, which a maintainer converts to a PR) with only the numbers. CI validates every entry against the schema and re-renders this table automatically.
| Project | Lang | Nodes | Wakeup | Avg Query | Reduction | Model | Submitted |
|---------|------|------:|-------:|----------:|----------:|-------|-----------|
| cmmc20 | JavaScript | 241 | 341 | 739 | **65.6×** | Claude 3.5 Sonnet | [@dfrostar](https://github.com/dfrostar) · 2025-10-01 |
| mempalace | Python | 1,626 | 412 | 891 | **46.0×** | Claude 3.5 Sonnet | [@dfrostar](https://github.com/dfrostar) · 2025-10-01 |
_2 submission(s). See the [JSON data](docs/community-benchmarks.json) for notes and verification commands._
**Submit yours:**
- **Easy path:** [open a benchmark submission issue](https://github.com/dfrostar/neuralmind/issues/new?template=community-benchmark.yml): fill out a form, and a maintainer converts it to a PR.
- **PR directly:** add an entry to [`docs/community-benchmarks.json`](docs/community-benchmarks.json) and run `python scripts/render_community_table.py --inject README.md` to regenerate the table. Schema: [`community-benchmarks.schema.json`](docs/community-benchmarks.schema.json).
All entries include the exact `neuralmind` command that produced them, so reviewers (and any reader) can audit the numbers.
### Reproduce locally (on our fixture)
```bash
pip install . tiktoken matplotlib graphifyy
graphify update tests/fixtures/sample_project
neuralmind build tests/fixtures/sample_project --force
python -m tests.benchmark.run # phase 1 + phase 2
python -m tests.benchmark.multi_model # per-model breakdown
python scripts/generate_chart.py # refreshes the PNG above
```
Full machine-readable results land in `tests/benchmark/results.json`, human-readable report in `tests/benchmark/report.md`.
### Reproduce on *your* code
Don't just trust numbers from our fixture; run it on your repo:
```bash
pip install neuralmind graphifyy
graphify update . && neuralmind build .
neuralmind benchmark . --contribute
```
Output shows your reduction ratio, tokens per query, and estimated monthly savings at Claude 3.5 Sonnet pricing. Full walkthrough: [**Does NeuralMind work on *your* codebase?**](docs/use-cases/benchmark-your-repo.md)
### Retrieval quality baseline
- Heuristic-only baseline (community-reported): **70–80% top-5 retrieval accuracy**
- NeuralMind target on the same query set: exceed that baseline via semantic retrieval + learned cooccurrence reranking
The pytest regression gate (`tests/test_benchmark_regression.py`) currently enforces **≥50% top-k hit rate** on the fixture plus **≥4× reduction** (low because the fixture is tiny; real repos measure roughly 10× higher).
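For reference, a top-k hit rate can be computed as below. This is a sketch of the metric only: the function name and the data shapes (one expected node per query) are assumptions, not the contents of `tests/test_benchmark_regression.py`.

```python
def top_k_hit_rate(ranked_results, expected, k=5):
    """Fraction of queries whose expected node appears in the top k results."""
    if not expected:
        raise ValueError("no queries to score")
    hits = sum(
        1 for query, target in expected.items()
        if target in ranked_results.get(query, [])[:k]
    )
    return hits / len(expected)


ranked = {
    "how does login work": ["authenticate_user", "hash_password", "render_page"],
    "where are invoices built": ["render_page", "send_email", "build_invoice"],
}
expected = {
    "how does login work": "authenticate_user",
    "where are invoices built": "build_invoice",
}
print(top_k_hit_rate(ranked, expected, k=2))
```

With `k=2` only the first query counts as a hit, so the example sits exactly at the 50% floor.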
---
## ❓ FAQ
### How much does NeuralMind reduce Claude / GPT token costs?
Measured on real repos: **40–70× reduction per query** (see [Benchmarks](#-benchmarks)). For a team running 100 queries/day on Claude Sonnet, that is roughly **$450/month → $7/month**. Exact savings depend on codebase size and model pricing.
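The arithmetic behind those figures is easy to rerun with your own numbers. The sketch below assumes an input price of $3 per million tokens, a 30-day month, 50,000 tokens per query without NeuralMind, and about 800 with it; all four values are assumptions you should replace with your model's current pricing and your own measurements.

```python
def monthly_cost(queries_per_day: int, tokens_per_query: int,
                 usd_per_million_tokens: float = 3.0, days: int = 30) -> float:
    """Input-token cost per month in USD."""
    total_tokens = queries_per_day * days * tokens_per_query
    return total_tokens * usd_per_million_tokens / 1_000_000


raw = monthly_cost(100, 50_000)     # loading raw source on every query
reduced = monthly_cost(100, 800)    # NeuralMind-sized context
print(f"${raw:.2f} vs ${reduced:.2f} per month")
```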
### Does NeuralMind work outside Claude Code?
Yes. The CLI works anywhere Python runs; the MCP server works with Cursor, Cline, Continue, Claude Desktop, and any MCP-compatible agent. For non-MCP tools like ChatGPT or Gemini, `neuralmind wakeup . | pbcopy` pipes context into a regular chat window. Only the PostToolUse compression hooks are Claude-Code-specific.
### Does my code leave my machine?
No. NeuralMind is fully offline: no API calls, no cloud services. Embeddings run locally via ChromaDB, and the knowledge graph is stored in `graphify-out/` in your project. Query memory (optional, opt-in) is written to `.neuralmind/` on disk.
### Is this RAG? How is it different from LangChain or LlamaIndex?
It is a form of RAG, but specialized for code. Instead of chunking text, NeuralMind retrieves over a **knowledge graph of code entities** (functions, classes, clusters) with a fixed 4-layer structure. That keeps the call graph intact and produces a token-budgeted output instead of a flat list of chunks. See [vs. LangChain/LlamaIndex](docs/comparisons/vs-langchain-llamaindex.md).
### I have a 1M context window now β do I still need this?
Long context makes it *possible* to stuff a whole repo in; it does not make it *cheap*. You still pay per input token, so a 50K-token repo at Claude Sonnet rates costs ~$0.15 **every turn**. NeuralMind drops that to ~$0.002. See [vs. long context](docs/comparisons/vs-long-context.md).
### Does it support my language?
Any language [graphify](https://github.com/dfrostar/graphify) supports (Python, JavaScript/TypeScript, and others via tree-sitter). NeuralMind consumes `graphify-out/graph.json`; if graphify can index it, NeuralMind can query it.
### What is the difference between `wakeup`, `query`, and `skeleton`?
- **`wakeup`**: ~400 tokens of project orientation (L0 + L1). Run it at session start.
- **`query`**: ~800–1,100 tokens for a specific natural-language question (L0–L3).
- **`skeleton`**: compact view of a single file (functions + call graph + cross-file edges). Use before `Read`.
### How does the PostToolUse compression work?
When `neuralmind install-hooks .` has been run, Claude Code invokes NeuralMind **after** every `Read`/`Bash`/`Grep` tool call but **before** the agent sees the output. `Read` becomes a skeleton (~88% smaller), `Bash` keeps errors + last 3 lines (~91% smaller), `Grep` caps at 25 matches. Set `NEURALMIND_BYPASS=1` on any command to opt out.
### Can I use NeuralMind without a knowledge graph?
No. The knowledge graph (`graphify-out/graph.json`) is the source of truth. Run `graphify update .` first, then `neuralmind build .`.
### Does it auto-update when I change code?
Only if you install the git post-commit hook with `neuralmind init-hook .`. Otherwise run `neuralmind build .` manually; it is incremental and only re-embeds changed nodes.
### What do I do if retrieval quality is poor on my repo?
1. Check that `neuralmind stats .` reports all your nodes indexed.
2. Run `neuralmind benchmark .` to see reduction ratios.
3. Enable query memory (it prompts on first TTY run) and periodically run `neuralmind learn .`; cooccurrence-based reranking improves relevance on your actual queries.
4. Open an issue with the query and expected result; retrieval quality is the thing we most want to improve.
---
## 📚 Documentation
| Resource | Contents |
|----------|---------|
| **[Business Case](docs/BUSINESS-CASE.md)** | Fact-based ROI argument with provable claims, math you can plug your numbers into, and three concrete scenarios |
| **[Honest Assessment](docs/HONEST-ASSESSMENT.md)** | Skeptic's companion: when NeuralMind isn't worth installing, what the headline numbers don't measure |
| **[Enterprise Use Cases](docs/ENTERPRISE.md)** | Regulated industries, on-premise, multi-team: what to know before pitching internally |
| **[Setup Guide](https://github.com/dfrostar/neuralmind/wiki/Setup-Guide)** | First-time setup for Claude Code, Claude Desktop, Cursor, any LLM |
| **[CLI Reference](https://github.com/dfrostar/neuralmind/wiki/CLI-Reference)** | All commands and options |
| **[Scheduling Guide](https://github.com/dfrostar/neuralmind/wiki/Scheduling-Guide)** | Automate audits with Windows Task Scheduler, GitHub Actions, or cron |
| **[Version Strategy](docs/VERSION-STRATEGY.md)** | Versioning policy, breaking changes, support timeline, upgrade path |
| **[Compatibility Matrix](docs/COMPATIBILITY.md)** | Version compatibility, Python/platform support, known issues, migration guides |
| **[Learning Guide](https://github.com/dfrostar/neuralmind/wiki/Learning-Guide)** | Continual learning details |
| **[API Reference](https://github.com/dfrostar/neuralmind/wiki/API-Reference)** | Python API (`NeuralMind`, `ContextResult`, `TokenBudget`) |
| **[Architecture](https://github.com/dfrostar/neuralmind/wiki/Architecture)** | 4-layer progressive disclosure design |
| **[Integration Guide](https://github.com/dfrostar/neuralmind/wiki/Integration-Guide)** | MCP, CI/CD, VS Code, JetBrains |
| **[Troubleshooting](https://github.com/dfrostar/neuralmind/wiki/Troubleshooting)** | Common issues and fixes |
| **[Roadmap](ROADMAP.md)** | What's shipping next, where we want help, what's out of scope |
| **[Future-Proofing Plan](docs/FUTURE-PROOFING-PLAN.md)** | 8-initiative engineering plan for sustainability and scale |
| **[Brain-like Learning](docs/brain_like_learning.md)** | Design rationale for the learning system |
| **[Use Cases](docs/use-cases/README.md)** | Step-by-step walkthroughs: Claude Code, cost optimization, any-LLM, offline/regulated, growing monorepo, multi-agent (new in v0.6.0) |
| **[Release Notes v0.6.0](RELEASE_NOTES_v0.6.0.md)** | Live activity feed, cross-process JSONL bridge, pin UX, depth slider, replay overlay |
| **[Comparisons](docs/comparisons/README.md)** | NeuralMind vs. Cursor, Copilot, Cody, Aider, Claude Projects, LangChain, long context, prompt caching, RAG, tree-sitter |
| **[USAGE.md](USAGE.md)** | Extended usage examples |
---
## 🤝 Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines and [ROADMAP.md](ROADMAP.md) for what we're working on next and where help is most welcome.
## 📄 License
MIT License; see [LICENSE](LICENSE) for details.
---
⭐ Star this repo if NeuralMind saves you money!