{"id":51016875,"url":"https://github.com/eleboucher/memini","last_synced_at":"2026-06-21T11:31:58.614Z","repository":{"id":363621159,"uuid":"1263797030","full_name":"eleboucher/memini","owner":"eleboucher","description":"Give any MCP-capable agent persistent memory: remember/recall over a tiered store with hybrid vector + keyword retrieval. Single Go binary, SQLite or Postgres, embedded admin UI.","archived":false,"fork":false,"pushed_at":"2026-06-16T18:53:37.000Z","size":1377,"stargazers_count":5,"open_issues_count":5,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-16T20:24:10.040Z","etag":null,"topics":["agent-memory","ai-agents","bm25","golang","hybrid-search","mcp","model-context-protocol","rag","sqlite-vec","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eleboucher.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"eleboucher","ko_fi":"eleboucher","buy_me_a_coffee":"eleboucher"}},"created_at":"2026-06-09T09:20:36.000Z","updated_at":"2026-06-16T18:38:17.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/eleboucher/memini","commit_stats":null,"previous_names":["eleboucher/memini"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/eleboucher/memini","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eleboucher%2Fmemini","
tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eleboucher%2Fmemini/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eleboucher%2Fmemini/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eleboucher%2Fmemini/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eleboucher","download_url":"https://codeload.github.com/eleboucher/memini/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eleboucher%2Fmemini/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34608892,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-21T02:00:05.568Z","response_time":54,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-memory","ai-agents","bm25","golang","hybrid-search","mcp","model-context-protocol","rag","sqlite-vec","vector-search"],"created_at":"2026-06-21T11:31:56.795Z","updated_at":"2026-06-21T11:31:58.602Z","avatar_url":"https://github.com/eleboucher.png","language":"Go","funding_links":["https://github.com/sponsors/eleboucher","https://ko-fi.com/eleboucher","https://buymeacoffee.com/eleboucher"],"categories":[],"sub_categories":[],"readme":"# memini\n\n\u003e A shared, persistent memory service for AI agents.\n\n`memini` gives any 
[MCP](https://modelcontextprotocol.io)-capable agent (Claude Code,\nopencode, Codex, Hermes, OpenClaw, Open WebUI) one place to `remember` and `recall`,\nwith retrieval quality that compounds over time. It runs as a single Go binary, boots\nwith zero configuration, and scales from an embedded SQLite file on a laptop to Postgres\nin Kubernetes.\n\n## Contents\n\n- [How it works](#how-it-works)\n- [Quick start](#quick-start)\n- [Agent plugin](#agent-plugin)\n- [Running in Docker](#running-in-docker)\n- [Using it as an MCP server](#using-it-as-an-mcp-server)\n- [Configuration](#configuration)\n- [Web UI](#web-ui)\n- [Answering](#answering)\n- [Reranking](#reranking)\n- [Importing existing memories](#importing-existing-memories)\n- [Benchmarks](#benchmarks)\n- [License](#license)\n\n## How it works\n\nmemini draws on three earlier projects:\n\n- A curated, deduplicated artifact rather than a pile of chunks (after Karpathy's\n  \"LLM wiki\").\n- Tiered memory (working → episodic → semantic → procedural) with decay and hybrid\n  (vector + keyword) retrieval fused with Reciprocal Rank Fusion (after `agentmemory`).\n  See [docs/tiers.md](docs/tiers.md) for what each tier means and how memories move\n  between them.\n- A stateless, K8s-native HTTP service with an opt-in LLM consolidation pipeline,\n  per-memory TTLs, per-tenant isolation, Prometheus metrics, and an `fsck` consistency\n  checker (after `mnemory`).\n\nHybrid results are re-ranked by a composite of relevance, access recency, and importance\n(not similarity alone), and near-duplicates are collapsed at recall time.\n\nWhen an LLM is configured, writes are stored immediately and then deduplicated and\ncontradiction-resolved in the background (a similarity gate skips the LLM when nothing\nclose exists), and frequently-recalled episodic memories are periodically distilled into\ndurable semantic facts.\n\n### Design\n\n| Concern    | Choice                                                                              
                                                                        |\n| ---------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| Language   | Go: single static binary, tiny image, low memory                                                                                                            |\n| Storage    | Pluggable: **sqlite-vec** (embedded, default) or **Postgres + VectorChord** (scale)                                                                         |\n| Embeddings | External OpenAI-compatible endpoint (you deploy the model)                                                                                                  |\n| LLM        | **Opt-in**: runs headless without one; enables background dedup, consolidation, and episodic→semantic promotion when configured                             |\n| Ranking    | Hybrid (vector + keyword) RRF, re-ranked by relevance + recency + importance, deduplicated                                                                  |\n| Interfaces | REST (server + UI types generated from [`api/openapi.yaml`](api/openapi.yaml)) + MCP (stdio \u0026 Streamable HTTP) + embedded web UI, sharing one service layer |\n\n## Quick start\n\nmemini boots with zero configuration in its embedded (SQLite) mode. Vector search needs\nan embeddings endpoint, so point it at any OpenAI-compatible embeddings API:\n\n```sh\nexport MEMINI_EMBED_BASE_URL=http://localhost:8081/v1\nexport MEMINI_EMBED_MODEL=bge-m3\nexport MEMINI_EMBED_DIMS=1024\nmise run run\ncurl -s localhost:8080/healthz\n```\n\n## Agent plugin\n\nAll plugins need a running memini (embeddings configured). To connect, set the\nbase URL and token (if your server requires auth). 
Default URL is always\n`http://localhost:8080`.\n\nEvery integration reads the same canonical env vars, so one setup works\neverywhere: **`MEMINI_BASE_URL`** for the server and **`MEMINI_API_KEY`** for the\ntoken. The legacy names **`MEMINI_URL`** and **`MEMINI_TOKEN`** are still accepted\nas aliases. Where a plugin has its own config (opencode options, Open WebUI\nValves, `openclaw.json`), that config wins over the env.\n\n| Agent       | Base URL config                                       | Token (if auth)                |\n| ----------- | ----------------------------------------------------- | ------------------------------ |\n| Claude Code | `MEMINI_BASE_URL` (MCP endpoint: `MEMINI_MCP_URL`)    | `MEMINI_API_KEY`               |\n| Codex CLI   | MCP config                                            | MCP config                     |\n| opencode    | `MEMINI_BASE_URL` or inline `base_url`                | `MEMINI_API_KEY`               |\n| Hermes      | `MEMINI_BASE_URL`                                     | `MEMINI_API_KEY`               |\n| Open WebUI  | `base_url` Valve (defaults from `MEMINI_BASE_URL`)    | `MEMINI_API_KEY` (process env) |\n| OpenClaw    | `base_url` in `openclaw.json`, else `MEMINI_BASE_URL` | `MEMINI_API_KEY` (gateway env) |\n\nFull details and edge cases live in [`integrations/`](integrations/).\n\n**Claude Code:**\n\n```\n/plugin marketplace add eleboucher/memini\n/plugin install memini\n```\n\n**opencode:** add the plugin to `opencode.json` (or `~/.config/opencode/opencode.json`):\n\n```json\n{\n  \"plugin\": [\"@eleboucher/opencode-memini\"]\n}\n```\n\n**Hermes:**\n\n```sh\nhermes plugins install eleboucher/memini-hermes\n```\n\n**Open WebUI:** paste [`filter/memini_memory.py`](integrations/openwebui/filter/memini_memory.py) into Admin Panel → Functions → `+`, and optionally [`tools/memini_tools.py`](integrations/openwebui/tools/memini_tools.py) into Workspace → Tools for on-demand access.\n\n**OpenClaw:**\n\n```sh\nopenclaw 
plugins install clawhub:@eleboucher/memini\n```\n\n**Codex CLI:** MCP only — no plugin; wire the `memini mcp` server directly: see\n[`integrations/codex/`](integrations/codex/).\n\nOr wire any agent to the MCP server without a plugin: see [`integrations/`](integrations/).\n\n## Running in Docker\n\n### Full local stack with Compose\n\n[`compose.yaml`](compose.yaml) brings up everything you need to try memini on a laptop:\nPostgres + VectorChord, a CPU embeddings server (`text-embeddings-inference` serving\n`bge-small-en-v1.5`, 384-d), and memini itself wired to both.\n\n```sh\ndocker compose up --build      # builds the image, starts db + embeddings + memini\ncurl -s localhost:8080/healthz # -\u003e ok, once the db healthcheck passes\nopen http://localhost:8080/    # embedded admin UI\n```\n\nmemini is reachable at `http://localhost:8080` (REST + MCP + UI). To enable the opt-in\nLLM pipeline (background dedup/consolidation, `/v1/answer`, `llm` rerank), uncomment\n`MEMINI_LLM_BASE_URL` / `MEMINI_LLM_MODEL` in the `memini` service and point them at any\nOpenAI-compatible chat endpoint. `docker compose down -v` tears it down and drops the\nPostgres volume.\n\n### Single container (SQLite mode)\n\nFor a self-contained server with no Postgres, run the image in its default embedded\n(SQLite) mode. Just give it a volume for the database and an embeddings endpoint to talk\nto:\n\n```sh\ndocker build -t memini .       # or use a prebuilt image if you publish one\ndocker run --rm -p 8080:8080 \\\n  -v memini-data:/data \\\n  -e MEMINI_SQLITE_PATH=/data/memini.db \\\n  -e MEMINI_EMBED_BASE_URL=http://host.docker.internal:8081/v1 \\\n  -e MEMINI_EMBED_MODEL=bge-small-en-v1.5 \\\n  -e MEMINI_EMBED_DIMS=384 \\\n  memini\n```\n\nThe image runs as a non-root user (`65532`); the named volume keeps memories across\nrestarts. 
On Linux, swap `host.docker.internal` for the host IP (or add\n`--add-host=host.docker.internal:host-gateway`) to reach an embeddings server running on\nthe host.\n\n## Using it as an MCP server\n\nmemini speaks the Model Context Protocol so agents can `remember` / `recall` / `answer`:\n\n- **Remote (Streamable HTTP):** `http://\u003chost\u003e:8080/mcp`\n- **Local (stdio):** `memini mcp`\n\nFor a **shared, always-on** server, run it over HTTP (the Compose or single-container\nsetups above already expose `/mcp` at `http://localhost:8080/mcp`) and point agents at\nthat URL.\n\nFor a **stdio** MCP server the agent spawns per session, run `memini mcp` in the container\nwith `-i` (keep stdin open) and no published port:\n\n```sh\ndocker run -i --rm \\\n  -v memini-data:/data \\\n  -e MEMINI_SQLITE_PATH=/data/memini.db \\\n  -e MEMINI_EMBED_BASE_URL=http://host.docker.internal:8081/v1 \\\n  -e MEMINI_EMBED_MODEL=bge-small-en-v1.5 -e MEMINI_EMBED_DIMS=384 \\\n  memini mcp\n```\n\nWire that into any MCP client as the launch command, e.g. for Claude Code / opencode:\n\n```json\n{\n  \"mcpServers\": {\n    \"memini\": {\n      \"command\": \"docker\",\n      \"args\": [\n        \"run\",\n        \"-i\",\n        \"--rm\",\n        \"-v\",\n        \"memini-data:/data\",\n        \"-e\",\n        \"MEMINI_SQLITE_PATH=/data/memini.db\",\n        \"-e\",\n        \"MEMINI_EMBED_BASE_URL=http://host.docker.internal:8081/v1\",\n        \"-e\",\n        \"MEMINI_EMBED_MODEL=bge-small-en-v1.5\",\n        \"-e\",\n        \"MEMINI_EMBED_DIMS=384\",\n        \"memini\",\n        \"mcp\"\n      ]\n    }\n  }\n}\n```\n\nThis works as-is: memory lands in the `default` namespace. 
A detached container can't\nauto-detect the agent's repo the way the [plugin](plugin/) does, so for per-project\nisolation set `MEMINI_DEFAULT_NAMESPACE` (or pass a `namespace` argument per tool call).\n\nReady-to-paste configs for Claude Code, opencode, Codex, Hermes, OpenClaw, and Open WebUI\n(plus the shared cross-agent namespace trick) live in [`integrations/`](integrations/).\nFor Claude Code and Codex, prefer the [plugin/](plugin/), which auto-captures tool calls\nand injects prior context at session start.\n\n## Configuration\n\nmemini is configured entirely through environment variables (12-factor).\n\n| Env var                          | Default                  | Description                                                                                                                                                                                                                                                                                                                  |\n| -------------------------------- | ------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `MEMINI_HTTP_ADDR`               | `:8080`                  | HTTP listen address                                                                                                                                                                                                                                                                                                          |\n| `MEMINI_SHUTDOWN_TIMEOUT`        | `15s`                    | graceful HTTP shutdown budget on SIGTERM                                                                                                                              
                                                                                                                                                       |\n| `MEMINI_BACKEND`                 | `sqlite`                 | `sqlite` or `postgres`                                                                                                                                                                                                                                                                                                       |\n| `MEMINI_SQLITE_PATH`             | `memini.db`              | sqlite database path                                                                                                                                                                                                                                                                                                         |\n| `MEMINI_POSTGRES_DSN`            | —                        | required when `MEMINI_BACKEND=postgres`                                                                                                                                                                                                                                                                                      |\n| `MEMINI_EMBED_BASE_URL`          | —                        | OpenAI-compatible embeddings endpoint                                                                                                                                                                                                                                                                                        |\n| `MEMINI_EMBED_MODEL`             | `text-embedding-3-small` | embedding model name                                                                                                                                                                                                                                  
                                                                       |\n| `MEMINI_EMBED_API_KEY`           | —                        | bearer token for the embeddings endpoint (optional)                                                                                                                                                                                                                                                                          |\n| `MEMINI_EMBED_DIMS`              | `1536`                   | embedding dimensions (must match model)                                                                                                                                                                                                                                                                                      |\n| `MEMINI_EMBED_QUERY_PREFIX`      | —                        | instruction prepended to recall queries for asymmetric embedders (documents stay bare), e.g. Qwen3-Embedding's `Instruct: Given a user query, retrieve relevant memories that answer it\\nQuery:`                                                                                                                             |\n| `MEMINI_EMBED_MAX_BATCH`         | `20`                     | max items per `/embeddings` request (match your server's max client batch; TEI defaults to 32)                                                                                                                                                                                                                               |\n| `MEMINI_EMBED_MAX_BATCH_CHARS`   | `24000`                  | max total characters per `/embeddings` request (`0` disables)                                                                                                                                                                                                                                                                |\n| 
`MEMINI_EMBED_MAX_ITEM_CHARS`    | `8000`                   | truncate each text to this many characters before embedding (`0` disables)                                                                                                                                                                                                                                                   |\n| `MEMINI_REEMBED_ON_MODEL_CHANGE` | `false`                  | when `MEMINI_EMBED_MODEL` differs from the model the stored vectors were produced with, re-embed every memory at startup instead of refusing to start (blocks startup; one embeddings call per memory). Off by default — use the `memini reembed` command for an explicit, observable pass. Dims still can't change this way |\n| `MEMINI_FUSION_ALPHA`            | `0.5`                    | hybrid score-fusion weight on the vector leg (`0.5` balanced, higher favors vector); negative falls back to RRF                                                                                                                                                                                                              |\n| `MEMINI_WRITE_DEDUP_MIN_SCORE`   | `0`                      | coalesce a write into a same-tier memory at or above this vector similarity instead of storing a near-duplicate (`0` disables; ~`0.9` collapses near-identical restatements)                                                                                                                                                 |\n| `MEMINI_WRITE_DEDUP_FINGERPRINT` | `true`                   | reinforce a same-tier memory when a write's normalized content matches it exactly, before embedding (`false` stores every write verbatim)                                                                                                                                                                                    |\n| `MEMINI_TEMPORAL_BOOST`          | `0.40`                   | boost candidates 
dated near a relative time named in the query (e.g. \"3 weeks ago\") by up to this much; `0` disables                                                                                                                                                                                                         |\n| `MEMINI_LLM_BASE_URL`            | —                        | opt-in LLM endpoint; empty disables it                                                                                                                                                                                                                                                                                       |\n| `MEMINI_LLM_API_KEY`             | —                        | bearer token for the LLM endpoint (optional)                                                                                                                                                                                                                                                                                 |\n| `MEMINI_LLM_API`                 | `openai`                 | chat backend: `openai` or `anthropic` (e.g. 
MiniMax)                                                                                                                                                                                                                                                                         |\n| `MEMINI_LLM_MODEL`               | `gpt-4o-mini`            | consolidation model name                                                                                                                                                                                                                                                                                                     |\n| `MEMINI_RERANK`                  | `off`                    | recall reranking: `off`, `llm`, or a cross-encoder `/rerank` URL (Infinity, vLLM, or `llama-server --rerank`); failures fall back to the composite order                                                                                                                                                                     |\n| `MEMINI_RERANK_MODEL`            | —                        | cross-encoder model name (when `MEMINI_RERANK` is a URL)                                                                                                                                                                                                                                                                     |\n| `MEMINI_RERANK_API_KEY`          | —                        | cross-encoder endpoint auth (when `MEMINI_RERANK` is a URL; optional)                                                                                                                                                                                                                                                        |\n| `MEMINI_RERANK_TIMEOUT`          | `10s`                    | per-recall timeout on the reranker call; on timeout recall falls back to the composite order                                
                                                                                                                                                                                                 |\n| `MEMINI_RERANK_MAX_DOC_CHARS`    | `2048`                   | truncate each document to this many characters before reranking, so one oversized memory can't exceed the server's batch (`0` disables). `2048` covers a typical memory in full; the longest query+doc is ≈800 tokens, so the reranker server needs `--ubatch-size` ≥ ~1024                                                  |\n| `MEMINI_RERANK_MAX_BATCH_CHARS`  | `6000`                   | cap the total query+documents characters per `/rerank` request; the pool is split across multiple requests when it would exceed this (`6000` keeps ~2 max-size docs per request). `0` disables                                                                                                                               |\n| `MEMINI_CONSOLIDATE_MODE`        | `async`                  | `async` (store now, dedup in background), `sync`, or `off`                                                                                                                                                                                                                                                                   |\n| `MEMINI_CONSOLIDATE_MIN_SCORE`   | `0.6`                    | similarity gate: skip the LLM when the nearest candidate scores below it (`0` disables)                                                                                                                                                                                                                                      |\n| `MEMINI_CONSOLIDATE_QUEUE_CAP`   | `1024`                   | bound on the async consolidation queue; writes never block (jobs dropped when full)                                                                                                                         
                                                                                                                 |\n| `MEMINI_PROMOTE_INTERVAL`        | `24h`                    | how often frequently-used episodic memories are distilled into semantic facts (`0` disables; needs LLM)                                                                                                                                                                                                                      |\n| `MEMINI_PROMOTE_MIN_ACCESS`      | `3`                      | minimum recall count before an episodic memory is eligible for promotion                                                                                                                                                                                                                                                     |\n| `MEMINI_SWEEP_INTERVAL`          | `1h`                     | how often the decay sweeper purges expired memories                                                                                                                                                                                                                                                                          |\n| `MEMINI_SHORT_TERM_CAP`          | `1000`                   | per-namespace cap on short-term (working+episodic) memories; the sweeper evicts the lowest-retention over it (`0` disables)                                                                                                                                                                                                  |\n| `MEMINI_TOMBSTONE_TTL`           | `0`                      | sweeper hard-deletes tombstoned memories older than this TTL (`0` keeps them indefinitely); the one irreversible maintenance action                                                                                                                                                         
                                 |\n| `MEMINI_DEMOTE_AFTER`            | `0`                      | sweeper demotes never-recalled, low-importance durable memories older than this back to episodic (`0` disables)                                                                                                                                                                                                              |\n| `MEMINI_DEDUP_INTERVAL`          | `24h`                    | how often the store-wide dedup pass collapses near-duplicate clusters to one representative (rest tombstoned reversibly); `0` disables. Also on-demand via `POST /v1/dedup`                                                                                                                                                  |\n| `MEMINI_DEDUP_SIMILARITY`        | `0.85`                   | cosine-like threshold for cluster membership; higher is stricter                                                                                                                                                                                                                                                             |\n| `MEMINI_DEDUP_MIN_CLUSTER_SIZE`  | `2`                      | smallest cluster acted on                                                                                                                                                                                                                                                                                                    |\n| `MEMINI_DEDUP_NEIGHBOURS`        | `20`                     | per-anchor vector-search fan-out bounding the cluster width                                                                                                                                                                                                                                                                  |\n| `MEMINI_DEDUP_TIERS`             | —      
                  | comma-separated tiers to restrict the periodic pass to (`working,episodic,semantic,procedural`); empty means all                                                                                                                                                                                                             |\n| `MEMINI_API_KEY`                 | —                        | if set, required as a bearer token (also gates `/metrics`)                                                                                                                                                                                                                                                                   |\n| `MEMINI_UI_ENABLED`              | `true`                   | mount the embedded admin UI at `/` (`false` for a headless API/MCP-only service)                                                                                                                                                                                                                                             |\n| `MEMINI_NAMESPACE_HEADER`        | `X-Memini-Namespace`     | header used to scope tenants                                                                                                                                                                                                                                                                                                 |\n| `MEMINI_DEFAULT_NAMESPACE`       | auto                     | fallback namespace (see [Namespace resolution](#namespace-resolution))                                                                                                                                                                                                                                                       |\n| `MEMINI_LOG_LEVEL`               | `info`                   | `debug` / `info` / `warn` / `error`                         
                                                                                                                                                                                                                                                                 |\n| `MEMINI_LOG_FORMAT`              | `json`                   | `json` or `text`                                                                                                                                                                                                                                                                                                             |\n\n### Namespace resolution\n\nA request's namespace is taken from `X-Memini-Namespace` (configurable via\n`MEMINI_NAMESPACE_HEADER`). The authoritative source of that header is the\n[plugin/](plugin/): each hook script resolves the namespace from the agent's working\ndirectory via `git rev-parse --show-toplevel` and sends it on every call. That is what\nmakes HTTP mode \"just work\" across projects without per-project config.\n\nWhen the header is absent (for example a stdio MCP launch without the plugin, or an HTTP\ncall that forgot to set it), the server falls back to the same resolver at startup time,\nin this order:\n\n1. `MEMINI_DEFAULT_NAMESPACE` (or `MEMINI_NAMESPACE`) env var, if non-empty.\n2. `git rev-parse --show-toplevel` in the server's cwd, using the repo basename, e.g.\n   `memini` for `/home/dev/memini`.\n3. `basename(cwd)` if the cwd is not inside a git worktree.\n4. 
Literal `default` as a last resort.\n\nThe resolved value and its source (`env` / `git` / `cwd` / `fallback`) are logged at\nstartup, e.g.:\n\n```json\n{\"level\":\"INFO\",\"msg\":\"starting memini\",\"default_namespace\":\"memini\",\"namespace_source\":\"git\",...}\n```\n\nIn **HTTP mode**, the server-side auto-resolve is misleading: the server runs detached\nfrom the agent's cwd, so the resolved basename reflects _the server's_ project, not the\nagent's. Install the plugin (or send the header explicitly per request) to get the right\nnamespace. In **stdio mode** the server inherits the agent's cwd, so the fallback is\ncorrect.\n\n## Web UI\n\nmemini ships an embedded admin UI (Preact + Vite, compiled into the binary) served at `/`.\nIt needs no separate process; open `http://localhost:8080/`.\n\n- **Overview** — per-namespace stats and a tier \"strata\" bar (working → episodic →\n  semantic → procedural).\n- **Browser** — paginated, tier/expired/superseded-filterable list with a detail drawer\n  and delete.\n- **Search** — hybrid recall with relevance scores.\n- **Graph** — D3 force-directed view; edges are supersession (directed) and shared-tag\n  affinity.\n- **Health** — runs `fsck` and surfaces duplicate clusters.\n\nUse the namespace switcher (top bar) to change tenant, and **Settings** to set a bearer\ntoken (sent as `Authorization: Bearer …`) or point the UI at a remote `memini`. The static\nshell is unauthenticated so you can enter a token; the `/v1` API it calls still enforces\n`MEMINI_API_KEY`. Disable the whole thing with `MEMINI_UI_ENABLED=false`.\n\n\u003e [!WARNING]\n\u003e When `MEMINI_API_KEY` is set, the server embeds the key in the UI shell so the\n\u003e same-origin UI authenticates without pasting it, which means anyone who can load `/` can\n\u003e read the key. 
Only expose the UI where reaching it already implies trust, or set\n\u003e `MEMINI_UI_ENABLED=false` on untrusted networks.\n\nThe UI is backed by three read-only endpoints alongside the core API: `GET /v1/memories`\n(list with `tier`/`include_expired`/`include_superseded`/`limit` filters), `GET\n/v1/stats`, and `GET /v1/namespaces`.\n\nThe UI sources live in [`ui/`](ui/); build the embedded bundle with `mise run ui` (or\niterate with HMR via `mise run ui-dev`, which proxies `/v1` to a local server on `:8080`).\nThe built bundle under `internal/api/ui/dist/` is a gitignored build artifact: the Docker\nimage builds it, while a plain `go build` without it still works and serves a placeholder\npage.\n\n## Answering\n\nBeyond raw recall, `POST /v1/answer` `{query, limit}` retrieves memories and has the LLM\ngenerate a grounded answer from them, returning the answer plus the supporting `sources`\n(requires an LLM; also exposed as the `memory_answer` MCP tool).\n\n## Reranking\n\n`MEMINI_RERANK` adds an optional read-side rerank over the hybrid candidates (`off`, a\ncross-encoder `/rerank` URL served by Infinity / vLLM / `llama-server --rerank`, or\n`llm`). See the [benchmark table](#benchmarks) for measured numbers across every config\nand dataset. Two things worth knowing:\n\n- Reranking only helps where base recall has headroom. On session-level sets hybrid is\n  already at ~98–99%, so reranking is a no-op. On turn-level LoCoMo (gold = exact turns) it\n  pays off: +11pp R@5 / +17pp MRR (cross-encoder) or +15pp / +25pp (LLM).\n- The cross-encoder is the better default when you need it: most of the LLM's lift at a\n  fraction of the latency, a tiny 0.6B model, and no chat dependency. 
Use `llm` only if you\n  already run a chat model and want the last few points.\n\n## Importing existing memories\n\n`memini import` loads an export from `agentmemory`, `mem0`, `mnemory`, memini's own\nformat, or your **Claude Code session history**, into the local store or a running server.\n\n```sh\n# Local store (embeds + preserves source IDs, timestamps, tiers):\nmemini import --source agentmemory ./agentmemory-export.json\n\n# Remote server over REST:\nmemini import --source mem0 --remote https://memini.example.com \\\n  --token \"$MEMINI_API_KEY\" --namespace my-project ./mem0-export.json\n\n# Backfill Claude Code history: each user→assistant exchange becomes one\n# episodic memory, scoped to the project namespace (the transcript's cwd\n# basename). Accepts a single transcript, a project dir, or all projects:\nmemini import --source claude-code ~/.claude/projects\n```\n\nThe `claude-code` source reconstructs verbatim exchanges from session transcripts\n(`~/.claude/projects/\u003cproject\u003e/\u003csession\u003e.jsonl`), skipping tool-result noise, sidechains,\nand slash-command wrappers. IDs are deterministic, so re-importing is idempotent.\nBackfilled memories get a fresh 90-day episodic TTL (so old history isn't swept on\narrival) while keeping the original timestamp for recency ranking. This pairs with the\n[plugin](plugin/)'s auto-capture: backfill once, then the hooks keep it current.\n\nEach source's fields map onto memini's tiers (e.g. agentmemory `workflow`→procedural, mem0\nfacts→semantic) and namespace (`project`/`user_id`). Records whose source carries no\nrecognized tier default to **episodic** (90-day TTL), so a bulk import of unknown quality\nages out unless recall reinforces it rather than living forever as durable facts. Empty\nrecords are skipped; per-record failures don't abort the run. Over `--remote` the server\nsets its own timestamps, so the source's created-at is kept in\n`metadata.imported_created_at`. 
Reads stdin when the path is `-`.\n\nFor low-quality bulk exports, two optional gates drop weak records before they're written\n(both off by default):\n\n```sh\n# Skip stubs shorter than 40 bytes and anything below importance 0.3:\nmemini import --source mem0 --min-length 40 --min-importance 0.3 ./export.json\n```\n\nNote `--min-importance` skips records whose source reported no importance (they arrive as\n`0`); leave it off unless your export carries real importance scores.\n\n## Switching embedding models\n\nVectors from different embedding models aren't comparable, so memini records which model\nproduced a store's vectors and **refuses to start** when `MEMINI_EMBED_MODEL` later differs\n— otherwise a same-dimension model swap would silently degrade recall with no error. To\nmigrate a store to a new model in place:\n\n```sh\n# dry-run: report how many memories would be re-embedded\nMEMINI_EMBED_MODEL=new-model memini reembed\n\n# apply (re-embeds every memory, then records the new model)\nMEMINI_EMBED_MODEL=new-model memini reembed --yes\n```\n\nRe-embedding keeps the store's dimensionality — switching dims (e.g. `1536` → `1024`) still\nrequires a fresh store (`memini export`, then `memini import` into a new one). Set\n`MEMINI_REEMBED_ON_MODEL_CHANGE=true` to re-embed automatically at startup instead of\nrefusing; it's off by default because re-embedding blocks startup and calls the embeddings\nendpoint once per memory.\n\n## Benchmarks\n\n```sh\nmise run bench   # offline retrieval benchmark (hybrid vs vector vs keyword)\n```\n\nFull results from a `bench/results/` run (written locally; gitignored), all on the same\nall-MiniLM-L6-v2 (384-d) endpoint, the model agentmemory benchmarks with. 
Cells are\n`recall_any@5 / @10 / MRR` (%); `p50` is in-process recall latency (rerank rows show the\ncost they add on top):\n\n| Strategy                                | LongMemEval · session  | LoCoMo · turn-level    | LoCoMo · session-level | p50         |\n| --------------------------------------- | ---------------------- | ---------------------- | ---------------------- | ----------- |\n| vector                                  | 92.6 / 95.4 / 80.7     | 41.3 / 51.8 / 28.1     | 64.1 / 79.8 / 45.2     | \u003c1 ms       |\n| keyword (Porter BM25)                   | 97.6 / 99.0 / 92.2     | 58.7 / 67.1 / 44.8     | 92.6 / 96.8 / 79.4     | ~3 ms       |\n| **hybrid** (default)                    | **98.4 / 99.2 / 93.0** | **59.7 / 69.9 / 42.4** | **90.9 / 96.6 / 74.3** | ~5 ms       |\n| + cross-encoder (`MEMINI_RERANK=\u003curl\u003e`) | 98.4 / 99.2 / 93.1     | **70.9 / 75.0 / 59.8** | 90.9 / 96.6 / 74.3     | +20–230 ms  |\n| + LLM rerank (`MEMINI_RERANK=llm`)      | 98.4 / 99.2 / 93.0     | **74.4 / 76.5 / 67.4** | —                      | +350–420 ms |\n\nQuestions: LongMemEval 500, LoCoMo turn 1,982, LoCoMo session 1,981 (rerank =\nQwen3-Reranker-0.6B cross-encoder, Qwen3.5-9B LLM). 
Hybrid leads both single legs\non LongMemEval but trails the keyword leg on session-level LoCoMo (90.9 vs 92.6 R@5); on\nturn-level LoCoMo (gold = exact evidence turns) base recall has headroom, so reranking\npays off (cross-encoder +11pp R@5 / +17pp MRR, LLM +15pp / +25pp) while being a no-op once\nrecall is already at ceiling.\n\nOn the same model, dataset, and metric, memini hybrid beats agentmemory's published\nLongMemEval-S numbers, and goes higher with a premium embedder:\n\n| System                    | Embedding          |       R@5 |      R@10 |\n| ------------------------- | ------------------ | --------: | --------: |\n| memini — hybrid           | all-MiniLM-L6-v2   | **98.4%** | **99.2%** |\n| memini — hybrid           | Qwen3-Embedding-8B | **98.8%** | **99.6%** |\n| agentmemory — BM25+Vector | all-MiniLM-L6-v2   |     95.2% |     98.6% |\n| agentmemory — BM25-only   | —                  |     86.2% |     94.6% |\n\nmemini's Porter-stemming keyword leg is +11pp R@5 over their BM25-only.\n\nThese numbers are on the full 500-question set, which is also where parameters were swept,\nso to check they aren't tuned-to-test, the harness splits LongMemEval deterministically\ninto a 450-question tune set and a never-swept 50-question held set (`-holdout`). Hybrid\nscores 98.2% R@5 on tune and does not regress on held (100% R@5, 50q), so the tuning\nchoices generalize. The per-category headroom is concentrated in `single-session-preference`\n(88.9% R@5 on tune).\n\nFull per-leg/per-category tables, the split breakdown, parameter sweeps, methodology,\ncaveats, and the LoCoMo QA comparison (vs mem0/Letta) are in [`bench/`](bench/README.md).\n\n## License\n\n[AGPL-3.0](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feleboucher%2Fmemini","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feleboucher%2Fmemini","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feleboucher%2Fmemini/lists"}