https://github.com/basidiocarp/hyphae
Persistent memory and RAG for AI coding agents. Episodic memories with decay + durable concept graphs. MCP server + CLI. Rust.
https://github.com/basidiocarp/hyphae
ai-agents claude-code developer-tools llm mcp memory rag rust
Last synced: 8 days ago
JSON representation
Persistent memory and RAG for AI coding agents. Episodic memories with decay + durable concept graphs. MCP server + CLI. Rust.
- Host: GitHub
- URL: https://github.com/basidiocarp/hyphae
- Owner: basidiocarp
- License: mit
- Created: 2026-03-10T18:55:12.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2026-06-05T21:16:48.000Z (19 days ago)
- Last Synced: 2026-06-05T23:08:00.493Z (18 days ago)
- Topics: ai-agents, claude-code, developer-tools, llm, mcp, memory, rag, rust
- Language: Rust
- Size: 2.17 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Roadmap: docs/roadmap.md
- Agents: AGENTS.md
Awesome Lists containing this project
README
# Hyphae
Persistent memory for AI coding agents. Single binary, zero runtime
dependencies, MCP-native, and designed to keep useful context alive after the
window compacts.
Named after fungal hyphae, the branching filaments that connect and distribute
nutrients through the organism.
Part of the [Basidiocarp ecosystem](https://github.com/basidiocarp).
---
## The Problem
AI agents forget everything between sessions. Architecture decisions, resolved
bugs, project conventions, and prior corrections vanish when the transcript
compacts or the session ends.
## The Solution
Hyphae gives agents two memory models that do different jobs. Memories handle
the day-to-day flow of decisions, errors, and notes with decay; memoirs keep
the durable concept graph that should not disappear. On top of that, Hyphae
adds hybrid retrieval, document indexing, and session tracking.
---
## The Ecosystem
| Tool | Purpose |
|------|---------|
| **[hyphae](https://github.com/basidiocarp/hyphae)** | Persistent agent memory |
| **[canopy](https://github.com/basidiocarp/canopy)** | Multi-agent coordination runtime |
| **[cap](https://github.com/basidiocarp/cap)** | Web dashboard for the ecosystem |
| **[cortina](https://github.com/basidiocarp/cortina)** | Lifecycle signal capture and session attribution |
| **[lamella](https://github.com/basidiocarp/lamella)** | Skills, hooks, and plugins for coding agents |
| **[mycelium](https://github.com/basidiocarp/mycelium)** | Token-optimized command output |
| **[rhizome](https://github.com/basidiocarp/rhizome)** | Code intelligence via tree-sitter and LSP |
| **[spore](https://github.com/basidiocarp/spore)** | Shared transport and editor primitives |
| **[stipe](https://github.com/basidiocarp/stipe)** | Ecosystem installer and manager |
| **[volva](https://github.com/basidiocarp/volva)** | Execution-host runtime layer |
> **Boundary:** `hyphae` owns memory, retrieval, and session records. It does
> not own shell filtering, code intelligence, hook capture, UI, or installation.
> `hyphae-core` stays domain-only: types, traits, and embedder abstractions,
> with transport, operator, and persistence concerns living in sibling crates.
---
## Quick Start
```bash
# Quick install: smaller binary without local embeddings
curl -fsSL https://raw.githubusercontent.com/basidiocarp/hyphae/main/install.sh | sh
# Quick install: prebuilt binary with embeddings enabled
curl -fsSL https://raw.githubusercontent.com/basidiocarp/hyphae/main/install.sh | sh -s -- --embeddings
# Recommended: full ecosystem setup
stipe init
# Alternative: hyphae-only setup
hyphae init
```
```bash
# Build from source
cargo install --path crates/hyphae-cli
# Smaller build without embeddings
cargo build --release --no-default-features
# Full build with embeddings
cargo build --release
```
Prebuilt release archives now ship both variants:
- `hyphae-.tar.gz` or `.zip`: slim build without local embeddings
- `hyphae--embeddings.tar.gz` or `.zip`: default build with embeddings
---
## How It Works
```text
Agent Hyphae Stored state
───── ────── ────────────
store memory ─► episodic memory ─► decaying memories
store concept ─► memoir graph ─► permanent concepts
query context ─► hybrid retrieval ─► ranked recall
end session ─► session lifecycle ─► outcomes and lessons
```
1. Store episodic memories: capture decisions, errors, preferences, and session notes with importance-aware decay.
2. Build memoirs: link durable concepts into permanent knowledge graphs.
3. Index documents: chunk files, embed them, and store them for hybrid RAG retrieval.
4. Track sessions: record task context, outcomes, files changed, and feedback signals.
5. Recall useful context: blend BM25 and vector search into ranked results for agents and UIs.
---
## Memory Models
| Model | Behavior | Best for |
|-------|----------|----------|
| Memories | Decay-based episodic storage | Decisions, errors, preferences, work notes |
| Memoirs | Permanent semantic graph | Concepts, relationships, architecture, domain knowledge |
## Hybrid Search Stack
| Layer | Technology | Purpose |
|-------|------------|---------|
| Storage | SQLite | Memories, memoirs, embeddings, session state |
| Full-text | FTS5 | Keyword recall with BM25 scoring |
| Vector | sqlite-vec | Semantic recall over embeddings |
| Blend | 30% FTS plus 70% vector | Keyword precision plus semantic similarity |
---
## What Hyphae Owns
- Episodic memory storage and decay
- Permanent knowledge memoirs
- Hybrid document and memory retrieval
- Session lifecycle records and outcome signals
- Training-data export and lesson extraction
## What Hyphae Does Not Own
- Shell output filtering: handled by `mycelium`
- Code intelligence and symbol graphs: handled by `rhizome`
- Hook capture and session intake: handled by `cortina`
- UI and operator dashboards: handled by `cap`
---
## Key Features
- Dual memory model: combines decay-based episodic memory with permanent semantic memoirs.
- RAG pipeline: ingests files, chunks them, embeds them, and serves them back through hybrid search.
- Structured sessions: records session start, end, context, and feedback signals.
- Lesson extraction: mines corrections and resolutions into reusable patterns.
- Local-first storage: runs from a single SQLite database with no cloud dependency.
---
## Architecture
```text
hyphae (single binary)
├── hyphae-core domain types, traits, embedder logic only
├── hyphae-ingest file readers and chunking
├── hyphae-store SQLite, FTS5, sqlite-vec
├── hyphae-mcp MCP server and tool handlers
└── hyphae-cli CLI commands and operator surfaces
```
Versioned payloads stay explicit at the boundary. MCP tools that cross repo or
host boundaries use schema/version fields instead of ad hoc shapes, and shared
contract updates should land with their schema or fixture changes.
```bash
hyphae session start --project demo --task "refactor auth flow"
hyphae session end --id --summary "completed refactor"
hyphae feedback signal --session-id --type correction --value -1
hyphae session context --project demo
hyphae bench-retrieval # Benchmark retrieval quality using fixture-driven tests
```
---
## Performance
| Operation | Latency |
|-----------|---------|
| Store | 34 us |
| FTS search | 47 us |
| Hybrid search | 951 us |
| Batch decay (1000) | 5.8 ms |
## Logging
Hyphae reads `HYPHAE_LOG` first, then falls back to `RUST_LOG`. If neither is
set, it defaults to `warn`.
```bash
HYPHAE_LOG=debug hyphae doctor
HYPHAE_LOG=debug hyphae serve
```
`hyphae serve` keeps stdout reserved for newline-delimited MCP JSON-RPC
responses. Logs go to stderr so they do not corrupt the MCP transport.
---
## Documentation
- [docs/README.md](docs/README.md): docs index and reading order
- [docs/guide.md](docs/guide.md): quickstart, concepts, and configuration
- [docs/features.md](docs/features.md): feature overview and behavior
- [docs/cli-reference.md](docs/cli-reference.md): CLI commands and examples
- [docs/mcp-tools.md](docs/mcp-tools.md): MCP tool reference
- [docs/architecture.md](docs/architecture.md): internals, schema, and search pipeline
- [docs/setup-by-tool.md](docs/setup-by-tool.md): per-editor setup instructions
- [docs/troubleshooting.md](docs/troubleshooting.md): common issues and fixes
- [docs/feedback-loop-design.md](docs/feedback-loop-design.md): closed-loop learning design notes
- [docs/training-data.md](docs/training-data.md): export formats and training data guidance
## Supply Chain
Embedding support pulls an ML dependency chain. The table below documents what
each component does and when its artifacts arrive.
| Component | Version | What it does | When artifacts are fetched |
|-----------|---------|--------------|---------------------------|
| `fastembed` | 4.x | Runs embedding models locally | Build time |
| `ort` | 2.0.0-rc.9 | Rust bindings for ONNX Runtime | Build time |
| `ort-sys` | 2.0.0-rc.9 | Native FFI layer for ORT | Build time |
| ORT binary | matches ort-sys | Prebuilt ONNX Runtime native library | Build time (via `ort-download-binaries` feature) |
| Embedding model | all-MiniLM-L6-v2 | Sentence embedding weights | First use (downloaded from Hugging Face) |
The `ort-download-binaries` feature (enabled in `[workspace.dependencies]`) causes
`ort-sys` to download a prebuilt ORT native library from the `ort` GitHub releases
page during the Rust build. The embedding model weights are downloaded from Hugging
Face on first use by `fastembed`.
**To disable all binary downloads**, build without default features:
```bash
cargo build --release --no-default-features
```
This produces a slim binary that uses FTS5 search only; no native library or model
weights are downloaded. Hybrid vector search is unavailable in this mode.
**Pinning**: the `Cargo.lock` fixes all crate versions including `ort-sys`. The ORT
native binary version is determined by `ort-sys`; update `ort-sys` deliberately and
verify the new binary hash in the published `ort-sys` source before accepting the
upgrade.
---
## Development
```bash
cargo build --release
cargo nextest run
cargo test
cargo clippy
cargo fmt
```
- Prefer `cargo nextest run` for the normal test loop.
- The workspace `profile.dev` is tuned for faster iteration with line-tables-only
debug info.
- If you add `criterion` here, start with the retrieval hot paths in
`hyphae-store` rather than broad repo-wide benches.
- Use whole-command timing for end-to-end investigation, for example
`time cargo run -p hyphae-cli -- doctor`.
## License
MIT