{"id":48667293,"url":"https://github.com/memtomem/memtomem-stm","last_synced_at":"2026-05-02T03:09:37.550Z","repository":{"id":350349119,"uuid":"1205834997","full_name":"memtomem/memtomem-stm","owner":"memtomem","description":"Short-term memory proxy gateway with proactive memory surfacing for AI agents","archived":false,"fork":false,"pushed_at":"2026-04-24T15:21:31.000Z","size":1998,"stargazers_count":0,"open_issues_count":6,"forks_count":5,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-24T16:46:08.479Z","etag":null,"topics":["ai-agents","caching","claude","compression","llm","mcp","mcp-proxy","mcp-server","memory","proxy","python","short-term-memory"],"latest_commit_sha":null,"homepage":"https://memtomem.com","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/memtomem.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":"CLA.md"}},"created_at":"2026-04-09T10:21:59.000Z","updated_at":"2026-04-24T14:53:59.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/memtomem/memtomem-stm","commit_stats":null,"previous_names":["memtomem/memtomem-stm"],"tags_count":20,"template":false,"template_full_name":null,"purl":"pkg:github/memtomem/memtomem-stm","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memtomem%2Fmemtomem-stm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memtomem%2Fmemtomem-stm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memtomem%2Fmemtomem-stm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memtomem%2Fmemtomem-stm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/memtomem","download_url":"https://codeload.github.com/memtomem/memtomem-stm/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/memtomem%2Fmemtomem-stm/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32297900,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T09:34:17.070Z","status":"ssl_error","status_checked_at":"2026-04-26T09:34:00.993Z","response_time":129,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","caching","claude","compression","llm","mcp","mcp-proxy","mcp-server","memory","proxy","python","short-term-memory"],"created_at":"2026-04-10T11:08:08.984Z","updated_at":"2026-04-26T13:02:18.192Z","avatar_url":"https://github.com/memtomem.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# memtomem-stm\n\n**Official website \u0026 docs: [https://memtomem.com](https://memtomem.com)**\n\n[![PyPI](https://img.shields.io/pypi/v/memtomem-stm)](https://pypi.org/project/memtomem-stm/)\n[![Python 3.12+](https://img.shields.io/badge/python-3.12+-green)](https://python.org)\n[![License: Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue)](LICENSE)\n[![CLA](https://img.shields.io/badge/CLA-required-green)](CLA.md)\n\n\u003e 🚧 **Alpha** — APIs and defaults may change between 0.1.x releases. Feedback and issue reports are especially welcome: [Issues](https://github.com/memtomem/memtomem-stm/issues) · [Discussions](https://github.com/memtomem/memtomem-stm/discussions).\n\nSpend fewer tokens. Remember more. Ship faster.\n\nmemtomem-stm is an MCP proxy that typically **cuts token usage by 20–80%** and gives your agent **memory across sessions** — with no changes to your upstream MCP servers.\n\nIt sits between your AI agent and its upstream MCP servers, compressing bloated tool responses, caching repeated calls, and automatically surfacing relevant context from prior sessions via a memtomem LTM server.\n\n**You need this if:**\n- Your agent **burns tokens** re-reading the same files and search results — STM compresses and caches them (Claude Code, Cursor, Claude Desktop, or any MCP client)\n- Your coding sessions **lose context** and the agent re-discovers decisions it already made — STM surfaces prior context automatically via memtomem LTM\n- You run custom MCP servers and want **compression, caching, and observability** without changing upstream code — STM is a drop-in proxy layer\n\n```mermaid\nflowchart TB\n    Agent[\"Agent\u003cbr/\u003e(Claude Code, Cursor, …)\"]\n    subgraph STM[\"memtomem-stm (STM)\"]\n        Pipe[\"CLEAN → COMPRESS → SURFACE → INDEX\"]\n    end\n    LTM[(\"memtomem LTM\u003cbr/\u003e(MCP server)\")]\n    FS[\"filesystem\u003cbr/\u003eMCP server\"]\n    GH[\"github\u003cbr/\u003eMCP server\"]\n    Other[\"…any MCP server\"]\n\n    Agent --\u003e|MCP| STM\n    STM \u003c--\u003e|MCP: stdio / SSE / HTTP| FS\n    STM \u003c--\u003e|MCP| GH\n    STM \u003c--\u003e|MCP| Other\n    STM \u003c-.-\u003e|surfacing\u003cbr/\u003evia MCP| LTM\n```\n\n## Installation\n\n```bash\npip install memtomem-stm\n```\n\nOr with [uv](https://docs.astral.sh/uv/):\n\n```bash\nuv tool install memtomem-stm     # install mms / memtomem-stm as global CLI tools\nuvx memtomem-stm --help          # or run without installing\nuv pip install memtomem-stm      # or install into the active environment\n```\n\nmemtomem-stm is **independent**: it has no Python-level dependency on memtomem core. To enable proactive memory surfacing, point STM at a running memtomem MCP server (or any compatible MCP server) — communication happens entirely through the MCP protocol.\n\n## Quick Start\n\n`mms` is the short alias for `memtomem-stm-proxy` — both commands are identical, use whichever you prefer.\n\n### 1. Add an upstream MCP server\n\nFor first-time setup, run the guided wizard — it prompts for name/prefix/command, optionally probes the server, and then offers to register STM with Claude Code (or generate `.mcp.json`) in the same flow:\n\n```bash\nmms init\n```\n\nOr add servers non-interactively:\n\n```bash\nmms add filesystem \\\n  --command npx \\\n  --args \"-y @modelcontextprotocol/server-filesystem /home/user/projects\" \\\n  --prefix fs\n```\n\n`--prefix` is required: it's the namespace under which the upstream server's tools will appear (e.g. `fs__read_file`). Repeat for each MCP server you want to proxy.\n\nIf you've already configured MCP servers in Claude Desktop, Claude Code, or a project `.mcp.json`, `mms add --import` (alias `--from-clients`) reuses the init wizard to bulk-select them — skipping anything already registered.\n\n```bash\nmms list      # show what you've added\nmms status    # show full config + connectivity\n```\n\n### 2. Connect your AI client to STM\n\n`mms init` ends with a 3-way prompt — pick option 1 and it shells out to `claude mcp add` for you. If you skipped that step or want to register with a different client later, run:\n\n```bash\nmms register\n```\n\nTo register manually, use `claude` directly:\n\n```bash\nclaude mcp add memtomem-stm -s user -- memtomem-stm\n```\n\nOr add it to a JSON MCP config for Cursor / Windsurf / Claude Desktop / Gemini:\n\n```json\n{\n  \"mcpServers\": {\n    \"memtomem-stm\": {\n      \"command\": \"memtomem-stm\"\n    }\n  }\n}\n```\n\n### 3. Use the proxied tools\n\nYour agent now sees proxied tools (`fs__read_file`, `gh__search_repositories`, etc.). Every call goes through the 4-stage pipeline automatically — responses are cleaned, compressed, cached, and (when an LTM server is configured) enriched with relevant memories.\n\nTo check what's happening, ask the agent to call `stm_proxy_stats`.\n\n## Tutorial notebooks\n\n\u003e **Try it without wiring into your AI client first.** A [quickstart Jupyter notebook](notebooks/01_quickstart_proxy_setup.ipynb) registers an upstream MCP server, calls a proxied tool, and reads `stm_proxy_stats` end-to-end. Clone the repo, `uv sync`, and `uv run jupyter lab notebooks/` — no external services needed.\n\n## Key Features\n\n- 🗜️ **Typically 20–80% fewer tokens per tool call** — 10 compression strategies with auto-selection by content type, query-aware budget, and zero-loss progressive delivery → [docs/compression.md](https://github.com/memtomem/memtomem-stm/blob/main/docs/compression.md)\n- 🧠 **Your agent remembers** — proactive memory surfacing from prior sessions, gated by relevance threshold, rate limit, dedup, and circuit breaker → [docs/surfacing.md](https://github.com/memtomem/memtomem-stm/blob/main/docs/surfacing.md)\n- 💾 **Repeated calls are free** — response cache with TTL and eviction; surfacing re-applied on cache hit so injected memories stay fresh → [docs/caching.md](https://github.com/memtomem/memtomem-stm/blob/main/docs/caching.md)\n- 🛡️ **Production-safe** — circuit breaker, retry with backoff, write-tool skip, query cooldown, dedup, sensitive content auto-detection, Langfuse tracing, horizontal scaling via `PendingStore`\n\n## Documentation\n\n| Guide | Topic |\n|-------|-------|\n| [Surfacing](https://github.com/memtomem/memtomem-stm/blob/main/docs/surfacing.md) | How agents recall prior context automatically |\n| [Compression](https://github.com/memtomem/memtomem-stm/blob/main/docs/compression.md) | All 10 strategies — pick the right one for your content |\n| [Caching](https://github.com/memtomem/memtomem-stm/blob/main/docs/caching.md) | Skip repeated work with response caching |\n| [Configuration](https://github.com/memtomem/memtomem-stm/blob/main/docs/configuration.md) | Tune settings without touching code |\n| [CLI](https://github.com/memtomem/memtomem-stm/blob/main/docs/cli.md) | CLI commands and the 11 MCP tools |\n\n## Development\n\n```bash\nuv sync                                                    # install dev deps\nuv run pytest -m \"not ollama and not bench_qa_meta and not bench_qa_llm_judge\"   # tests (CI filter)\nuv run ruff check src \u0026\u0026 uv run ruff format --check src    # lint (required)\nuv run mypy src                                            # typecheck (advisory)\n```\n\nCI runs the same commands on every PR via `.github/workflows/ci.yml`. Lint (`ruff check` + `ruff format --check`) and tests must pass; mypy is advisory.\n\n## License\n\n[Apache License 2.0](LICENSE). Contributions are accepted under the terms of the [Contributor License Agreement](CLA.md).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmemtomem%2Fmemtomem-stm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmemtomem%2Fmemtomem-stm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmemtomem%2Fmemtomem-stm/lists"}