https://github.com/mage0535/hermes-memory-installer
Agent-agnostic memory sidecar for AI coding agents. One install, multi-agent shared recall, production-grade. Persistent memory system for AI agents: a shared multi-agent memory layer, deployable at production grade.
- Host: GitHub
- URL: https://github.com/mage0535/hermes-memory-installer
- Owner: mage0535
- Created: 2026-04-25T06:19:46.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-06-02T08:53:43.000Z (14 days ago)
- Last Synced: 2026-06-02T09:22:53.449Z (14 days ago)
- Topics: agent-agnostic, agent-memory, ai-agent, ai-infrastructure, ai-memory, bilingual, claude-code, coding-agent, gbrain, hermes-agent, hindsight, knowledge-graph, llm-memory, memory-sidecar, multi-agent, open-source, persistent-memory, rag, semantic-search, sidecar
- Language: Python
- Homepage: https://github.com/mage0535/hermes-memory-installer
- Size: 388 KB
- Stars: 111
- Watchers: 0
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-hermes - Hermes Memory Installer - click persistent memory stack with SQLite FTS5, lifecycle guards, and gbrain knowledge-graph sync. (Deployment and Operations / Plugins and add-ons)
README
# Memory Sidecar v3.1.1
**A production memory system for any AI agent. Keep knowledge across sessions, without touching agent internals.**
[**中文文档**](README_CN.md) | [**Architecture**](ARCHITECTURE.md)
---
## What This Is
AI agents forget things. Every new session starts blank.
Memory Sidecar runs alongside your agent — Hermes, Claude Code, Cursor, Codex, whatever — and gives it a real memory. It saves important conversations, builds long-term knowledge, and feeds relevant context back when needed.
It doesn't patch the agent. It's a sidecar: separate process, shared data directory.
**Three things it actually does:**
1. **Archives sessions to permanent knowledge** — conversations aren't lost when you restart
2. **Recalls what matters** — layered retrieval: recent context → semantic search → knowledge graph
3. **Tracks important topics** — people, projects, recurring problems get their own "dossier"
## Architecture at a Glance
```
Agent writes sessions → state.db + session files
                 ↓
Sidecar reads checkpoint, processes new sessions
                 ↓
     ┌───────────┼───────────┐
     │           │           │
     ▼           ▼           ▼
 Hot Layer   Warm Layer   Cold Layer
 (memory     (Hindsight   (gbrain graph
  tool,      PostgreSQL)  + FTS5 search)
  5KB cap)
                 ↓
Tiered context injection → agent's system prompt
```
The full stack is documented in [ARCHITECTURE.md](ARCHITECTURE.md). Short version:
| Layer | What | Technology | Speed |
|-------|------|-----------|-------|
| Hot | Current user + system facts | memory tool injection | 0ms |
| Warm | Extracted facts, recurring patterns | Hindsight (PostgreSQL 16) | ~50ms |
| Cold | Permanent archive, knowledge graph | gbrain + FTS5 session search | ~500ms–2s |
We dropped the intermediate agentmemory bridge layer from earlier versions. It added Docker overhead with barely any data. The current three layers are simpler, faster, and more reliable.
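To make the layering concrete, here is a minimal sketch of tiered recall under a context budget. The three lookup functions, their return values, and the `budget_chars` parameter are hypothetical stand-ins, not the sidecar's actual API; the real logic lives in `tiered_context_injector.py` and may differ.

```python
from typing import Callable, List

# Hypothetical layer lookups. Real ones would call the memory tool,
# Hindsight, and gbrain/FTS5 respectively.
def hot_lookup(query: str) -> List[str]:
    return ["user prefers concise answers"]

def warm_lookup(query: str) -> List[str]:
    return ["project X migrated to PostgreSQL 16 in May"]

def cold_lookup(query: str) -> List[str]:
    return ["archived session: debugging the project X migration"]

def tiered_recall(query: str,
                  layers: List[Callable[[str], List[str]]],
                  budget_chars: int = 5000) -> List[str]:
    """Query layers fastest-first and stop once the context budget is spent."""
    context: List[str] = []
    used = 0
    for lookup in layers:
        for snippet in lookup(query):
            if used + len(snippet) > budget_chars:
                return context  # budget exhausted: skip the slower layers
            context.append(snippet)
            used += len(snippet)
    return context

result = tiered_recall("project X", [hot_lookup, warm_lookup, cold_lookup])
tight = tiered_recall("project X", [hot_lookup, warm_lookup, cold_lookup],
                      budget_chars=30)
```

Ordering the layers from fastest to slowest means a tight budget is filled by the cheapest lookups first, which is one plausible reading of the Hot → Warm → Cold flow above.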
## Quick Start
### What you need
- Python 3.9+
- [gbrain](https://github.com/hi-ogawa/gbrain) (knowledge graph, running on port 8787)
- [Hindsight](https://github.com/HindsightTechnologySolutions/hindsight) (fact store, port 8890)
- PostgreSQL 16 (backing store for both of the above)
- An AI agent producing sessions (Hermes, Claude Code, etc.)
### Install
```bash
git clone https://github.com/mage0535/hermes-memory-installer.git
cd hermes-memory-installer
# Set AGENT_HOME to point to your agent's data directory
export AGENT_HOME="$HOME/.hermes" # or ~/.claude, ~/.cursor, etc.
./install.sh
```
The installer will:
1. **Check your environment** — Python, PostgreSQL, Hindsight, gbrain reachability
2. **Let you pick an embedding model** — for semantic search (optional but recommended)
3. **Deploy sidecar scripts** — to `$AGENT_HOME/scripts/`
4. **Patch agent config** — adds memory provider settings if a config file is found
Non-interactive mode:
```bash
./install.sh --noninteractive --agent-home "$HOME/.my-agent"
```
### After Installing
```bash
# Run one archive pass
python3 $AGENT_HOME/scripts/session_to_gbrain.py --resume
# Run the full maintenance cycle
python3 $AGENT_HOME/scripts/memory_maintenance_cycle.py
# Verify everything works
python3 $AGENT_HOME/scripts/sidecar_acceptance_check.py
```
For ongoing operation, schedule the maintenance cycle via cron (or your agent's built-in scheduler). See [ARCHITECTURE.md](ARCHITECTURE.md) for recommended schedules.
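The architecture diagram mentions a checkpoint, and `session_to_gbrain.py` takes `--resume`, which suggests incremental processing. The sketch below shows that general pattern only; the JSON checkpoint file, its `last_session_id` key, and the integer session ids are assumptions for illustration, not the script's real format.

```python
import json
import tempfile
from pathlib import Path
from typing import List

def load_checkpoint(path: Path) -> int:
    """Return the last processed session id, or 0 if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())["last_session_id"]
    return 0

def process_new_sessions(session_ids: List[int], checkpoint: Path) -> List[int]:
    """Process only sessions newer than the checkpoint, then advance it."""
    last = load_checkpoint(checkpoint)
    pending = sorted(s for s in session_ids if s > last)
    for sid in pending:
        pass  # archive session `sid` here (hypothetical step)
    if pending:
        checkpoint.write_text(json.dumps({"last_session_id": pending[-1]}))
    return pending

ckpt = Path(tempfile.mkdtemp()) / "checkpoint.json"
first = process_new_sessions([1, 2, 3], ckpt)      # processes all three
second = process_new_sessions([1, 2, 3, 4], ckpt)  # only session 4 is new
```

Because the checkpoint only moves forward after a pass completes, rerunning the job on a schedule is safe: sessions already archived are skipped.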
## The Scripts
The scripts below run the sidecar. All live in `$AGENT_HOME/scripts/` after install:
| Script | Role |
|--------|------|
| `session_to_gbrain.py` | Incremental session → gbrain archive with MCP API bridge |
| `memory_governance_rebuild.py` | Rebuild session index, hubs, canonical objects, vector index |
| `memory_guardian.py` | Capacity monitoring, backlog detection, stuck operation recovery |
| `memory_family_registry.py` | Query intent classification + Focused Dossier routing |
| `tiered_context_injector.py` | Layered recall: Hot → Warm → Cold → RRF fusion |
| `memory_maintenance_cycle.py` | Orchestrator: archive → rebuild → drain → recall → health |
| `sidecar_acceptance_check.py` | Production validation suite |
| `archive_sessions.py` | Bulk session archival to gbrain (run by cron at 2am) |
| `auto_session_summary.py` | Session digest generation, runs every 6 hours |
**Running in production (cron):** `session_to_gbrain.py`, `archive_sessions.py`, `auto_session_summary.py`
**Available on-demand:** `memory_governance_rebuild.py`, `memory_guardian.py`, `memory_family_registry.py`, `tiered_context_injector.py`, `memory_maintenance_cycle.py`, `sidecar_acceptance_check.py`
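The table lists "RRF fusion" as the last step of `tiered_context_injector.py`. For readers unfamiliar with it, this is a minimal sketch of standard Reciprocal Rank Fusion, which merges several ranked result lists by summing `1 / (k + rank)` per document. The function name, the document ids, and the conventional constant `k = 60` are assumptions here; the script's own implementation may differ.

```python
from collections import defaultdict
from typing import Dict, List

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from two layers, best match first
warm_hits = ["fact-7", "fact-2", "fact-9"]
cold_hits = ["fact-2", "page-4", "fact-7"]
fused = rrf_fuse([warm_hits, cold_hits])
```

RRF uses only ranks, never raw scores, so results from layers with incomparable scoring (for example BM25 and cosine similarity) can be merged without normalization.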
## Focused Dossiers
Some things matter more than others. A key person. A long-running project. A recurring incident.
v3.1.0 lets you declare **Focused Dossiers** — high-priority memory profiles that get special treatment in recall. A dossier has:
- **aliases** — all the names it's referred to by
- **topic markers** — keywords that trigger dossier-first retrieval
- **retention priority** — don't let this get pruned
- **timeline tracking** — chronological entries for major events
The first production dossier is `kiki` — a relationship memory profile that demonstrated the pattern works at scale (hundreds of sessions, thousands of extracted facts, timeline-aware recall).
To add your own, edit `memory_family_registry.py` and add a new profile entry. The format is self-documenting in the file.
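As a rough mental model of the four dossier attributes listed above, the sketch below uses a dataclass and a simple substring router. Every name in it (`DossierProfile`, `route`, the field names, the `atlas` example) is hypothetical; the authoritative format is whatever `memory_family_registry.py` defines.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class DossierProfile:
    """Hypothetical shape of a Focused Dossier; field names are illustrative."""
    name: str
    aliases: List[str] = field(default_factory=list)
    topic_markers: List[str] = field(default_factory=list)
    retention_priority: int = 0  # higher = less likely to be pruned
    timeline: List[Tuple[str, str]] = field(default_factory=list)  # (date, event)

    def matches(self, query: str) -> bool:
        """True if the query mentions the name, an alias, or a topic marker."""
        q = query.lower()
        terms = [self.name, *self.aliases, *self.topic_markers]
        return any(term.lower() in q for term in terms)

def route(query: str, profiles: List[DossierProfile]) -> Optional[DossierProfile]:
    """Pick the highest-priority dossier that the query triggers, if any."""
    hits = [p for p in profiles if p.matches(query)]
    return max(hits, key=lambda p: p.retention_priority, default=None)

profiles = [
    DossierProfile("atlas", aliases=["project atlas"],
                   topic_markers=["migration"], retention_priority=5,
                   timeline=[("2026-05-01", "kickoff")]),
]
chosen = route("How did the Atlas migration go?", profiles)
missed = route("unrelated question", profiles)
```

A query that triggers a dossier would be answered from that profile first; a query that triggers none falls through to ordinary layered recall.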
## Embedding Model Selection
Semantic search needs embeddings. The sidecar supports pluggable models via sentence-transformers.
During install, you pick one. The installer records your choice but doesn't deploy the model — you run the embedding service separately.
**How it affects retrieval:**
- Semantic matching catches meaning, not just keywords
- Cross-lingual: Chinese queries find English content
- Better clustering of related facts even when wording differs
**Supported models:**
| Model | Langs | Dim | Size | Best For |
|---|---|---|---|---|
| `intfloat/multilingual-e5-small` ★ | 100+ | 384d | ~470MB | Default. Balanced multilingual |
| `BAAI/bge-small-zh-v1.5` | Chinese | 512d | ~96MB | Tiny Chinese-first deployment |
| `paraphrase-multilingual-MiniLM-L12-v2` | 50+ | 384d | ~471MB | Mature ST ecosystem |
| `Alibaba-NLP/gte-multilingual-base` | 75+ | 768d | ~610MB | Higher recall, more RAM |
| `sentence-transformers/LaBSE` | 109 | 768d | ~1.8GB | Strong cross-lingual alignment |
| `BAAI/bge-m3` | 100+ | 1024d | ~2GB | Maximum precision, needs resources |
### Deploying the Embedding Service
```bash
pip install sentence-transformers
```
Minimal server:
```python
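import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from sentence_transformers import SentenceTransformer

# Load the model once at startup; any model from the table above works here.
model = SentenceTransformer("intfloat/multilingual-e5-small")


class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Expect a JSON body like {"input": ["text one", "text two"]}
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        texts = body.get("input", [])
        if isinstance(texts, str):  # accept a single string as well
            texts = [texts]
        # Normalized vectors make dot product equal to cosine similarity
        emb = model.encode(texts, normalize_embeddings=True).tolist()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps(
            {"data": [{"embedding": e} for e in emb]}
        ).encode())


# Bind to localhost only; the handler answers POST requests on any path.
HTTPServer(("127.0.0.1", 8766), Handler).serve_forever()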
```
Set the URL and rebuild governance:
```bash
export EMBEDDING_API_URL=http://127.0.0.1:8766/v1/embeddings
python3 $AGENT_HOME/scripts/memory_maintenance_cycle.py
```
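To check the service by hand, a small client can post text and read back the vectors. This is a sketch that assumes the request and response shapes of the minimal server above (`{"input": [...]}` in, `{"data": [{"embedding": [...]}]}` out); the helper names are illustrative and not part of the sidecar.

```python
import json
import os
import urllib.request
from typing import List

EMBEDDING_API_URL = os.environ.get(
    "EMBEDDING_API_URL", "http://127.0.0.1:8766/v1/embeddings")

def build_request(texts: List[str]) -> urllib.request.Request:
    """Build the POST request the minimal server expects: {"input": [...]}."""
    body = json.dumps({"input": texts}).encode()
    return urllib.request.Request(
        EMBEDDING_API_URL, data=body,
        headers={"Content-Type": "application/json"}, method="POST")

def parse_embeddings(raw: bytes) -> List[List[float]]:
    """Extract vectors from a {"data": [{"embedding": [...]}, ...]} reply."""
    return [item["embedding"] for item in json.loads(raw)["data"]]

def embed(texts: List[str]) -> List[List[float]]:
    """Send texts to the embedding service and return one vector per text."""
    with urllib.request.urlopen(build_request(texts), timeout=30) as resp:
        return parse_embeddings(resp.read())

# Offline check of the parsing step with a canned reply (no server needed)
sample = b'{"data": [{"embedding": [0.6, 0.8]}]}'
vectors = parse_embeddings(sample)
```

With the server running, `embed(["hello", "你好"])` should return two vectors whose length matches the model's dimension in the table above.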
No embedding service? No problem — text-based retrieval (FTS5, LIKE, Hindsight, gbrain) works without it.
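For a sense of what the FTS5 path looks like, here is a self-contained sketch using Python's built-in `sqlite3` module. The `messages` table and its columns are a made-up schema for illustration, not the layout of `state.db`, and the example assumes your Python's SQLite build includes FTS5 (most do).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one FTS5 table holding session messages
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, content)")
conn.executemany(
    "INSERT INTO messages (session_id, content) VALUES (?, ?)",
    [
        ("s1", "Fixed the PostgreSQL connection pool timeout"),
        ("s2", "Discussed embedding model selection"),
        ("s3", "PostgreSQL upgrade to version 16 completed"),
    ],
)

def search(query: str, limit: int = 5):
    """Full-text search; ORDER BY rank returns the best BM25 match first."""
    rows = conn.execute(
        "SELECT session_id FROM messages WHERE messages MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [r[0] for r in rows]

hits = search("postgresql")
```

The default tokenizer is case-insensitive, so `postgresql` matches both sessions that mention PostgreSQL. Keyword search like this finds exact terms only; matching by meaning is what the optional embedding service adds.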
## Works With Any Agent
Memory Sidecar is agent-agnostic. It reads from `$AGENT_HOME/state.db` and session files, and operates entirely outside the agent process.
Tested with:
- **Hermes Agent** — original companion, 2+ months production
- **Claude Code** — via `AGENT_HOME=~/.claude`
- **Cursor / Codex** — shared data directory pattern
The installer respects `AGENT_HOME` (falls back to `HERMES_HOME` for backward compatibility). If your agent stores data somewhere non-standard, point `--agent-home` at it.
## Production Track Record
This isn't a prototype. The current stack has been running continuously on a production Hermes installation since April 2026:
- **10,885 gbrain pages** — full knowledge graph with timeline tracking
- **42,481 Hindsight nodes** — extracted facts with auto-retain/recall/reflect
- **105,601 indexed messages** — FTS5 searchable session archive
- **100% embedding coverage** — vector search across all content
- **brain score 73** — gbrain content quality metric
## Changelog
### v3.1.1 (2026-06-08)
- **New scripts**: `memory_watermark.py` (auto-detect memory capacity and archive stale entries) + `memory_snapshot_backup.py` (periodic snapshot backup)
- **Updated**: `hindsight-service.py` — simplified standalone daemon using existing PG on port 5432
- **Updated**: `hindsight_mcp_bridge.py` — clean line endings and improved MCP stdio bridge
- **Updated**: `session_to_gbrain.py` — env-var-based token config (no more hardcoded secrets)
- **Docs**: repo layout now reflects 9 scripts instead of 7
### v3.1.0 (2026-06-02)
- Simplified 4-tier → 3-tier architecture (removed agentmemory Docker bridge)
- Removed `memory_index.db` (semi-finished layer)
- Agent-agnostic via `AGENT_HOME` (not hardcoded to Hermes)
- Interactive embedding model selection during install
- Dry-run mode (`--dry-run`)
- Bilingual README (EN/CN) with language switching
- ARCHITECTURE_CN.md — Chinese architecture translation
- Focused Dossier model (first production instance: Kiki)
- Embedding Model selection guide (6 models)
- Production track record: 10.8K pages, 42K nodes, 100% embed coverage
### v3.0.0 (2026-05-29)
- Complete documentation overhaul
- Added HERMES_AUDIT_REPORT.md — comprehensive agent capability audit
- Optimized Chinese README for SoV/SEO
- 4-tier architecture: Hot → Warm → Cold → Archive
## Repository Layout
```
installer/   Entry point, config patching, environment checks
scripts/     9 supported sidecar scripts (incl. memory_watermark, memory_snapshot_backup)
skills/      Agent-side memory skills (starter-kit, proactive, archivist)
templates/   Memory templates
```
## Acknowledgements
### Core Projects
- [Hermes Agent](https://github.com/NousResearch/hermes-agent) — the agent this sidecar was built alongside
- [Hindsight](https://github.com/HindsightTechnologySolutions/hindsight) — short/medium-term fact graph
- [gbrain](https://github.com/hi-ogawa/gbrain) — personal knowledge graph engine
- [sentence-transformers](https://www.sbert.net/) — embedding model framework
- [PostgreSQL](https://www.postgresql.org/) + [pgvector](https://github.com/pgvector/pgvector) — vector storage backbone
- [OpenCode](https://opencode.ai) — guided architecture and production iteration
### Embedding Models
- [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small)
- [BAAI/bge-small-zh-v1.5](https://huggingface.co/BAAI/bge-small-zh-v1.5)
- [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2)
- [Alibaba-NLP/gte-multilingual-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-base)
- [sentence-transformers/LaBSE](https://huggingface.co/sentence-transformers/LaBSE)
- [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
### Community
Shoutout to everyone who filed issues, surfaced recall gaps, and pushed the design forward. GitHub Issues, Discussions, Reddit (r/LocalLLaMA, r/MachineLearning), V2EX, and direct production feedback all shaped v3.1.0.
---
If this project helps you, [drop a star ⭐](https://github.com/mage0535/hermes-memory-installer) — it helps others find it too.
## License
MIT. See bundled dependencies for their respective licenses.