https://github.com/ncmonx/icm-graph
Token-efficient context CLI for Claude Code, Cursor, Cline. Cuts AI coding costs 70-90% via context packs, output filters, local memory + receipts. 40 MCP tools, 122/122 tests, Apache-2.0.
ai-agents ai-coding anthropic claude-code cli cline context-engineering cpp17 cursor developer-tools knowledge-graph llm-tools local-first mcp mcp-server prompt-engineering semantic-search sqlite token-optimization
- Host: GitHub
- URL: https://github.com/ncmonx/icm-graph
- Owner: ncmonx
- License: apache-2.0
- Created: 2026-05-06T05:59:16.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-23T08:00:45.000Z (19 days ago)
- Last Synced: 2026-05-23T09:33:58.840Z (19 days ago)
- Topics: ai-agents, ai-coding, anthropic, claude-code, cli, cline, context-engineering, cpp17, cursor, developer-tools, knowledge-graph, llm-tools, local-first, mcp, mcp-server, prompt-engineering, semantic-search, sqlite, token-optimization
- Language: Python
- Homepage:
- Size: 3.67 MB
- Stars: 1
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Notice: NOTICE
- Agents: AGENTS.md
# Icemage (`icmg`)
> **Stop burning tokens. Stop losing context. Ship faster.**
A small helper app that makes AI coding assistants — Claude Code, Cursor, and friends — **70 – 98 % cheaper** to run, without making them less helpful.
**40 MCP tools · 1082/1082 tests · single-binary · 100 % local · pure-bash hooks** (zero Python/jq dependency).
If you've ever watched a huge token bill evaporate on a single file read, paid for "thinking" you didn't need, or re-explained your project to the AI for the fifth time today — Icemage is for you.
---
## 🟢 Why Icemage
AI assistants are powerful but **wasteful by default**. Every time the AI opens a file, runs a command, or starts a new chat, it re-reads context it has seen many times and dumps full output into the conversation. Icemage sits quietly in the background and trims the noise before it ever reaches the AI:
- **Long files** → only the relevant slice
- **Noisy command output** → just the parts that matter
- **Web pages** → cached + summarised
- **Past decisions** → remembered across sessions so the AI doesn't ask twice
- **Repeated work** → results reused instead of recomputed
The AI keeps its full intelligence. Your wallet keeps more of its money.
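To make the "noisy command output" idea concrete, here is a minimal sketch in Python. It is not Icemage's actual filter: the function name `trim_output`, the keyword list, and the one-line context window are all assumptions chosen for illustration.

```python
import re

# Hypothetical keyword list: lines matching these count as "signal".
SIGNAL = re.compile(r"(error|fail|exception|traceback|warning)", re.IGNORECASE)


def trim_output(text: str, context: int = 1) -> str:
    """Keep lines that look important, plus `context` lines around each."""
    lines = text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if SIGNAL.search(line):
            start = max(0, i - context)
            stop = min(len(lines), i + context + 1)
            keep.update(range(start, stop))

    out = []
    previous = -1
    for i in sorted(keep):
        if i != previous + 1:
            # Leave a marker so the reader knows lines were dropped here.
            out.append(f"... ({i - previous - 1} lines omitted)")
        out.append(lines[i])
        previous = i

    tail = len(lines) - previous - 1
    if tail > 0:
        out.append(f"... ({tail} lines omitted)")
    return "\n".join(out)
```

Fed a long test log with a single failure, a filter like this would pass on the failing line, its neighbours, and a count of what was skipped, rather than the whole log.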
---
## 📊 Headline numbers
| Metric | Typical | Best | Since |
|---|---|---|---|
| File-read savings | 70 – 85 % fewer tokens | up to 92 % | v0.5 |
| Test / build output | 60 – 80 % shorter | up to 90 % | v0.5 |
| **Multi-file UI propagation** (style-clone) | **30 – 50× cheaper** | up to 98 % | v1.22.0 |
| **Cross-project bundle** (port) | **8 – 12× cheaper** | up to 95 % | v1.24.0 |
| **Compressed-Write** (AI emit diff) | **70 – 95 % fewer tokens** | up to 98 % | v1.25.0 |
| Web-fetch reduction | 70 – 90 % smaller | up to 95 % | v0.4 |
| Repeat-context recall | near-zero, **< 5 ms cached** | — | v1.21.8 |
| Past-chat full-text search | **< 10 ms** across months | — | v1.21.7 |
| Graph symbol lookup | **256-slot in-RAM cache** | — | v1.21.8 |
| First-prompt warmup | < 1 s | — | v1.18 |
| **Cold build time** (icmg itself) | **~50 % faster** (20 min → 9-10 min) | — | v1.26.0 |
| **MCP response filter** (verbose plugins) | **50 – 80 % smaller** | up to 90 % | v1.30.0 |
| **Auto-thinking suppress** (trivial prompts) | **~1500 tok / call saved** | — | v1.30.0 |
| **Caveman-auto** (long-prose replies) | **60 – 75 % compress** | up to 85 % | v1.30.0 |
| **Service auto-start** (UserPromptSubmit) | **0-touch warm-up** | — | v1.30.0 |
| **Path ambiguity warning** (icmg context) | wrong-file lookups → loud | — | v1.29.0 |
| **rg-wrapper + brace glob** (icmg grep/files) | flag-mirror, **{a,b}** expand | — | v1.29.0 |
| **Local AI model** (built-in, opt-in) | **0 cloud calls** | privacy-first | v1.31.0 |
| **Smart router** (REGEX vs LLM_LOCAL vs CACHE) | **< 100 µs p99** | hot-path forced regex | v1.31.0 |
| **HTTP streaming download** (model fetch + SHA256) | **400 MB - 2 GB** safe-verify | tamper-detect | v1.31.0 |
| **icmg git** wrapper (single ergonomic entry) | **Tkil-filtered** + safety-gated | enforces icmg-FIRST | v1.31.0 |
| **Python-free core** (PRECOMPACT_PY dropped) | **-200-500 ms** boot saved | single-binary | v1.31.0 |
| **pack --rerank** (LLM-reorder memory hits) | **opt-in** warm-path | router-gated | v1.32.0 |
| **PreCompact LLM summary** (warm-pool Qwen 0.5B) | **<15 s** cold | regex fallback always | v1.32.0 |
| **icmg compact-bg** (proactive memory worker) | **<3 s** warm | manual + future hook | v1.32.0 |
| **Smarter local AI memory** | **multi-prompt safe** | no overflow | v1.32.0 |
| Cost per AI session | **down 70 – 90 %** vs. raw | up to 95 % | — |
Measured on real-world sessions. Your mileage will vary with project size and habits — anyone running a busy AI agent for a day already sees meaningful savings.
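The table mixes two units: "N× cheaper" and "percent saved". They describe the same thing, and the small sketch below converts between them. The function names are illustrative only and are not part of Icemage.

```python
def percent_saved(factor: float) -> float:
    """Convert an 'N times cheaper' factor into a percentage saving."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return (1 - 1 / factor) * 100


def factor_from_percent(percent: float) -> float:
    """Convert a percentage saving back into an 'N times cheaper' factor."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    return 100 / (100 - percent)


# The two multiplier ranges quoted in the table, expressed as percentages.
for low, high in [(30, 50), (8, 12)]:
    print(f"{low}-{high}x cheaper = "
          f"{percent_saved(low):.1f}-{percent_saved(high):.1f} % saved")
```

By this arithmetic, 30 to 50× cheaper corresponds to roughly 96.7 to 98 % saved, and 8 to 12× to roughly 87.5 to 91.7 %. Read the other way, a 95 % saving is a 20× reduction.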
---
## ✨ What's new
> **Recent releases.** Older entries archived in [`CHANGELOG.md`](CHANGELOG.md).
- **v1.70.0** - **Fixes from user reports: cleaner CLI output and command pass-through**. `icmg run` no longer swallows flags meant for the program you're wrapping - `icmg run ./tool --json` now passes `--json` to the tool, and `icmg run -- ...` forwards everything after `--` verbatim. `icmg llm list` now prints a single clean JSON document (it used to tack on a trailing text line that broke JSON parsers). And `icmg recall --json` always returns valid UTF-8, so tools that re-serialize its output no longer choke on notes that captured stray binary bytes. Full automated suite passes 1082 of 1082 checks.
- **v1.69.0** - **The icmg-first rule now enforces itself**. Getting an AI assistant to reliably go through icmg (instead of raw file reads or shell) used to mean hand-seeding a rule into memory for every new project and every new teammate. Now `icmg init` installs a small hook that reminds the assistant of the rule on every prompt automatically - the full rule on the first turn, a one-liner after - so a fresh project or a new user just runs `icmg init` and it's handled (opt out with `ICMG_NO_ICMG_FIRST=1`). This release also fixes a crash where building the assistant's context could abort if a saved note contained non-text bytes. Full automated suite passes 1071 of 1071 checks.
- **v1.68.0** - **Security: a locked-down daemon and a secret scanner**. The optional background daemon now requires a per-user token before it will accept any command (even shutdown), so no other local program can quietly drive it. And a new `icmg scan` command walks your project and flags hardcoded secrets - API keys, access tokens and the like - showing where each one is (redacted by default) and failing if any are found, so it drops straight into a pre-commit or CI check. Full automated suite passes 1068 of 1068 checks.
- **v1.67.0** - **Fix: clean new-project setup**. Initializing icmg in a brand-new project could hit a 'duplicate column' error if a background task touched the just-created database at the same moment. The database setup now locks and re-checks before applying each step, so concurrent access is safe and a new project initializes cleanly. Full automated suite passes 1060 of 1060 checks.
- **v1.66.0** - **Per-project terse mode**. The compact 'caveman' response mode is now decided per project instead of one global switch - handy when using icmg as a dedicated brain for a project that should behave differently from the rest. `icmg caveman on/off` targets the current project by default (add `--global` for all); a project can even stay off while the global setting is on. Full automated suite passes 1057 of 1057 checks.
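The v1.67.0 note describes a "lock, then re-check" pattern for database setup. As a rough illustration of that general pattern (not Icemage's own code), the sketch below uses Python's `sqlite3` module. The function name `add_column_if_missing` and the table and column names in the usage note are hypothetical.

```python
import sqlite3


def add_column_if_missing(db_path: str, table: str, column: str, decl: str) -> bool:
    """Add `column` to `table` unless it already exists. Returns True if added.

    `table`, `column`, and `decl` are interpolated into SQL, so they must
    come from trusted code, never from user input.
    """
    # isolation_level=None puts the connection in autocommit mode, so the
    # explicit BEGIN / COMMIT statements below control the transaction.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        # BEGIN IMMEDIATE takes the write lock up front: a second process
        # waits here instead of racing past the existence check.
        conn.execute("BEGIN IMMEDIATE")
        existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
        if column in existing:
            # Someone else already applied this step; nothing to do.
            conn.execute("ROLLBACK")
            return False
        conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{column}" {decl}')
        conn.execute("COMMIT")
        return True
    except Exception:
        if conn.in_transaction:
            conn.execute("ROLLBACK")
        raise
    finally:
        conn.close()
```

Calling `add_column_if_missing("data.db", "notes", "body", "TEXT")` twice adds the column once and reports `False` the second time, instead of failing with a duplicate-column error.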
## 🚀 Quick start
1. **Download** the latest installer from the [Releases page](https://github.com/ncmonx/icm-graph/releases): `icmg-<version>-win-x64.zip` for Windows, `icmg-<version>-linux-x64.tar.gz` for Linux.
2. **Extract** the archive into any folder of your choice.
3. **Add the folder to your `PATH`** so the `icmg` command is available everywhere.
4. **Open your project** in a terminal and run:
```text
icmg init
```
That's it. The next time you launch Claude Code (or Cursor / Cline / Windsurf — see below), Icemage will quietly start trimming tokens.
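If `icmg init` reports "command not found", the usual culprit is step 3. The short Python check below is a generic sketch that asks the same question your shell does: is `icmg` reachable through `PATH`? The helper names are made up for this example.

```python
import shutil
from typing import Optional


def find_on_path(command: str = "icmg", path: Optional[str] = None) -> Optional[str]:
    """Return the full path the shell would run for `command`, or None."""
    return shutil.which(command, path=path)


def path_status(command: str = "icmg") -> str:
    """Describe whether `command` is reachable, as a one-line message."""
    location = find_on_path(command)
    if location is None:
        return f"{command}: not found, check that the extracted folder is on PATH"
    return f"{command}: found at {location}"


print(path_status())
```

Open a new terminal before re-checking, since most shells only pick up `PATH` changes in fresh sessions.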
---
## 🧰 What you'll actually use day-to-day
After install, the only command most people type is `icmg init` once per project. Everything else happens automatically. A few useful commands when you want to peek under the hood:
| Want to | Type |
|---|---|
| See how much you saved this month | `icmg savings` |
| See a chart in the terminal | `icmg savings --ascii` |
| Recall a past decision in this project | `icmg recall "<query>"` |
| Recall something from another project | `icmg cross-recall "<query>"` |
| Wake-up briefing for a fresh session | `icmg wake-up` |
| Update Icemage in place | `icmg update --apply` |
| Health-check the install | `icmg doctor` |
For the full menu run `icmg --help`.
---
## 🤖 Works with
- **Claude Code** (primary target — best-tested)
- **Cursor** — drop-in via the same hooks
- **Cline**, **Windsurf**, **OpenCode** — same approach, may need a small config nudge
- **Anything that exposes hooks or MCP** — the MCP server bundled with Icemage is reusable
---
## 🛡️ Safety + privacy
- **100 % local.** Everything Icemage knows about your projects lives in a small SQLite database next to your code. Nothing is sent to a remote server — not the project name, not the file paths, not the recalled snippets.
- **No telemetry.** Icemage doesn't phone home.
- **Open source.** [Apache-2.0](LICENSE). Audit the binary, the release notes, and the file structure freely. Source code is held privately to keep the bug surface manageable for a solo maintainer — public reports + private fixes is the operating model.
- **Tamper-evident.** Every release ships with a `sha256` sidecar so you can verify the binary you downloaded.
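Checking a download against its `sha256` sidecar can be done with any SHA-256 tool. The sketch below does it in Python with only the standard library. It assumes the sidecar is a text file whose first whitespace-separated token is the hex digest (the common `sha256sum` layout); the exact sidecar format and file names for Icemage releases are not specified here, so treat both as assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(archive: Path, sidecar: Path) -> bool:
    """Return True if the archive's SHA-256 matches the digest in the sidecar."""
    # Assumption: the first token in the sidecar is the hex digest;
    # anything after it (such as a file name) is ignored.
    fields = sidecar.read_text(encoding="utf-8").split()
    if not fields:
        return False
    return sha256_of(archive) == fields[0].lower()
```

A `False` result means the file on disk is not the file the digest was computed from, so re-download before running it.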
---
## 🩹 Honest limits
- **Windows + Linux only** for prebuilt binaries today. macOS users currently need to wait for a self-hosted runner build (planned).
- **First-time install on Windows with strict antivirus** can be slow until you let Icemage run once. After that it's fast.
- **Not a replacement for the AI.** Icemage is a token-trimming layer — it doesn't write code for you and it doesn't make a bad AI smart.
---
## 💖 Support
If Icemage saved you a few hours or a few dollars and you want to send a small thank-you, both routes work:
- [GitHub Sponsors](https://github.com/sponsors/ncmonx)
- [Ko-fi tip jar](https://ko-fi.com/ncmonx)
All revenue goes straight into more releases — there is no team behind this, just one maintainer and a long backlog of "make AI agents less wasteful" ideas.
---
## ❓ FAQ
**Does Icemage send my code anywhere?**
No. Everything is local. The only network call is when you ask Icemage to update itself or fetch a URL through `icmg fetch`.
**Can my company use it?**
Yes — Apache-2.0 licensed, free for any use including commercial. If you want a private support arrangement or a custom build, [open a sponsorship](https://github.com/sponsors/ncmonx).
**Why is the source code repo private?**
One maintainer, no security team. Public bug reports + private fixes lets me ship hotfixes the same day without telegraphing exploitable details. The release binaries and reproducible build hash are still public.
**Does it slow my AI down?**
No. Trimming happens *before* the AI reads anything, so the AI sees a smaller, cleaner version of the same context. End-to-end interactions get faster, not slower.
**Where are the savings stored?**
In `.icmg/data.db` inside each project (small SQLite file). Run `icmg savings` to see the breakdown.
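Because the store is an ordinary SQLite file, you can inspect it with any SQLite client. The Python sketch below opens a database read-only and lists its table names. The schema is not documented in this README, so the example deliberately reads nothing beyond the table list, and `list_tables` is an illustrative helper rather than part of Icemage.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: Path) -> List[str]:
    """Return the table names in a SQLite file, opened read-only."""
    # mode=ro makes SQLite refuse writes, so inspection cannot alter the file.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        conn.close()
```

From a project root, `list_tables(Path(".icmg/data.db"))` would show what the file contains. For the actual savings figures, `icmg savings` remains the supported route.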
**How do I report a bug or ask for a feature?**
Open an issue at the [GitHub issues](https://github.com/ncmonx/icm-graph/issues) page. Real-world reproductions with `icmg savings --json` attached get triaged fastest.
---
## 📜 License
[Apache-2.0](LICENSE).
---
## 📚 Other docs
- [CHANGELOG.md](CHANGELOG.md) — full version history
- [SECURITY.md](SECURITY.md) — vulnerability reporting
- [NOTICE](NOTICE) — third-party attributions