An open API service indexing awesome lists of open source software.

https://github.com/chopratejas/headroom

The Context Optimization Layer for LLM Applications
https://github.com/chopratejas/headroom

agent ai anthropic compression context-engineering context-window fastapi langchain llm mcp openai proxy python rag token-optimization

Last synced: about 2 months ago
JSON representation

The Context Optimization Layer for LLM Applications

Awesome Lists containing this project

README

          

```
โ–ˆโ–ˆโ•— โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ•—
โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ•โ•โ•โ•โ•โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•”โ•โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ•‘
โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•— โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ–ˆโ–ˆโ–ˆโ–ˆโ•”โ–ˆโ–ˆโ•‘
โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ•โ•โ• โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•”โ•โ•โ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ•‘โ•šโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘
โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•—โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ–ˆโ–ˆโ•‘โ•šโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ•šโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ•”โ•โ–ˆโ–ˆโ•‘ โ•šโ•โ• โ–ˆโ–ˆโ•‘
โ•šโ•โ• โ•šโ•โ•โ•šโ•โ•โ•โ•โ•โ•โ•โ•šโ•โ• โ•šโ•โ•โ•šโ•โ•โ•โ•โ•โ• โ•šโ•โ• โ•šโ•โ• โ•šโ•โ•โ•โ•โ•โ• โ•šโ•โ•โ•โ•โ•โ• โ•šโ•โ• โ•šโ•โ•
The context compression layer for AI agents
```

60โ€“95% fewer tokens ยท library ยท proxy ยท MCP ยท 6 algorithms ยท local-first ยท reversible


CI
codecov
PyPI
npm
Model: Kompress-base
Tokens saved: 60B+
License: Apache 2.0
Docs


Docs ยท
Install ยท
Proof ยท
Agents ยท
Discord ยท
llms.txt


AI agents / LLMs: read /llms.txt here, or fetch the live index / full docs blob.

---

> Headroom compresses everything your AI agent reads โ€” tool outputs, logs, RAG chunks, files, and conversation history โ€” before it reaches the LLM. Same answers, fraction of the tokens.


Headroom in action

Live: 10,144 โ†’ 1,260 tokens โ€” same FATAL found.

## What it does

- **Library** โ€” `compress(messages)` in Python or TypeScript, inline in any app
- **Proxy** โ€” `headroom proxy --port 8787`, zero code changes, any language
- **Agent wrap** โ€” `headroom wrap claude|codex|cursor|aider|copilot` in one command
- **MCP server** โ€” `headroom_compress`, `headroom_retrieve`, `headroom_stats` for any MCP client
- **Cross-agent memory** โ€” shared store across Claude, Codex, Gemini, auto-dedup
- **`headroom learn`** โ€” mines failed sessions, writes corrections to `CLAUDE.md` / `AGENTS.md`
- **Reversible (CCR)** โ€” originals never deleted; LLM retrieves on demand

## How it works (30 seconds)

```
Your agent / app
(Claude Code, Cursor, Codex, LangChain, Agno, Strands, your own codeโ€ฆ)
โ”‚ prompts ยท tool outputs ยท logs ยท RAG results ยท files
โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ Headroom (runs locally โ€” your data stays here) โ”‚
โ”‚ โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€ โ”‚
โ”‚ CacheAligner โ†’ ContentRouter โ†’ CCR โ”‚
โ”‚ โ”œโ”€ SmartCrusher (JSON) โ”‚
โ”‚ โ”œโ”€ CodeCompressor (AST) โ”‚
โ”‚ โ””โ”€ Kompress-base (text, HF) โ”‚
โ”‚ โ”‚
โ”‚ Cross-agent memory ยท headroom learn ยท MCP โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
โ”‚ compressed prompt + retrieval tool
โ–ผ
LLM provider (Anthropic ยท OpenAI ยท Bedrock ยท โ€ฆ)
```

- **ContentRouter** โ€” detects content type, selects the right compressor
- **SmartCrusher / CodeCompressor / Kompress-base** โ€” compress JSON, AST, or prose
- **CacheAligner** โ€” stabilizes prefixes so provider KV caches actually hit
- **CCR** โ€” stores originals locally; LLM calls `headroom_retrieve` if it needs them

โ†’ [Architecture](https://headroom-docs.vercel.app/docs/architecture) ยท [CCR reversible compression](https://headroom-docs.vercel.app/docs/ccr) ยท [Kompress-base model card](https://huggingface.co/chopratejas/kompress-base)

## Get started (60 seconds)

```bash
# 1 โ€” Install
pip install "headroom-ai[all]" # Python
npm install headroom-ai # Node / TypeScript

# 2 โ€” Pick your mode
headroom wrap claude # wrap a coding agent
headroom proxy --port 8787 # drop-in proxy, zero code changes
# or: from headroom import compress # inline library

# 3 โ€” See the savings
headroom stats
```

Granular extras: `[proxy]`, `[mcp]`, `[ml]`, `[agno]`, `[langchain]`, `[evals]`. Requires **Python 3.10+**.

## Proof

**Savings on real agent workloads:**

| Workload | Before | After | Savings |
|-------------------------------|-------:|-------:|--------:|
| Code search (100 results) | 17,765 | 1,408 | **92%** |
| SRE incident debugging | 65,694 | 5,118 | **92%** |
| GitHub issue triage | 54,174 | 14,761 | **73%** |
| Codebase exploration | 78,502 | 41,254 | **47%** |

**Accuracy preserved on standard benchmarks:**

| Benchmark | Category | N | Baseline | Headroom | Delta |
|------------|----------|----:|---------:|---------:|------------|
| GSM8K | Math | 100 | 0.870 | 0.870 | **ยฑ0.000** |
| TruthfulQA | Factual | 100 | 0.530 | 0.560 | **+0.030** |
| SQuAD v2 | QA | 100 | โ€” | **97%** | 19% compression |
| BFCL | Tools | 100 | โ€” | **97%** | 32% compression |

Reproduce: `python -m headroom.evals suite --tier 1` ยท [Full benchmarks & methodology](https://headroom-docs.vercel.app/docs/benchmarks)



60B+ tokens saved โ€” community leaderboard


60B+ tokens saved by the community โ€” live leaderboard โ†’

## Agent compatibility matrix

| Agent | `headroom wrap` | Notes |
|-------------|:---------------:|----------------------------------|
| Claude Code | โ— | `--memory` ยท `--code-graph` |
| Codex | โ— | shares memory with Claude |
| Cursor | โ— | prints config โ€” paste once |
| Aider | โ— | starts proxy + launches |
| Copilot CLI | โ— | starts proxy + launches |
| OpenClaw | โ— | installs as ContextEngine plugin |

Any OpenAI-compatible client works via `headroom proxy`. MCP-native: `headroom mcp install`.

## When to use ยท When to skip

**Great fit if youโ€ฆ**
- run AI coding agents daily and want savings without changing your code
- work across multiple agents and want shared memory
- need reversible compression โ€” originals always retrievable via CCR

**Skip it if youโ€ฆ**
- only use a single provider's native compaction and don't need cross-agent memory
- work in a sandboxed environment where local processes can't run

Integrations โ€” drop Headroom into any stack

| Your setup | Hook in with |
|------------------------|------------------------------------------------------------------|
| Any Python app | `compress(messages, model=โ€ฆ)` |
| Any TypeScript app | `await compress(messages, { model })` |
| Anthropic / OpenAI SDK | `withHeadroom(new Anthropic())` ยท `withHeadroom(new OpenAI())` |
| Vercel AI SDK | `wrapLanguageModel({ model, middleware: headroomMiddleware() })` |
| LiteLLM | `litellm.callbacks = [HeadroomCallback()]` |
| LangChain | `HeadroomChatModel(your_llm)` |
| Agno | `HeadroomAgnoModel(your_model)` |
| Strands | [Strands guide](https://headroom-docs.vercel.app/docs/strands) |
| ASGI apps | `app.add_middleware(CompressionMiddleware)` |
| Multi-agent | `SharedContext().put / .get` |
| MCP clients | `headroom mcp install` |

What's inside

- **SmartCrusher** โ€” universal JSON: arrays of dicts, nested objects, mixed types.
- **CodeCompressor** โ€” AST-aware for Python, JS, Go, Rust, Java, C++.
- **Kompress-base** โ€” our HuggingFace model, trained on agentic traces.
- **Image compression** โ€” 40โ€“90% reduction via trained ML router.
- **CacheAligner** โ€” stabilizes prefixes so Anthropic/OpenAI KV caches actually hit.
- **IntelligentContext** โ€” score-based context fitting with learned importance.
- **CCR** โ€” reversible compression; LLM retrieves originals on demand.
- **Cross-agent memory** โ€” shared store, agent provenance, auto-dedup.
- **SharedContext** โ€” compressed context passing across multi-agent workflows.
- **`headroom learn`** โ€” plugin-based failure mining for Claude, Codex, Gemini.

Pipeline internals

Headroom exposes one stable request lifecycle across `compress()`, the SDK, and the proxy:

`Setup` โ†’ `Pre-Start` โ†’ `Post-Start` โ†’ `Input Received` โ†’ `Input Cached` โ†’ `Input Routed` โ†’ `Input Compressed` โ†’ `Input Remembered` โ†’ `Pre-Send` โ†’ `Post-Send` โ†’ `Response Received`

- **Transforms** do the work: CacheAligner, ContentRouter, SmartCrusher, CodeCompressor, Kompress-base, IntelligentContext / RollingWindow.
- **Pipeline extensions** observe or customize lifecycle stages via `on_pipeline_event(...)`.
- **Compression hooks** sit alongside the canonical lifecycle as an additional extension seam.
- **Proxy extensions** remain the server/app integration seam for ASGI middleware, routes, and startup policy.

Provider and tool-specific behavior lives under `headroom/providers/` so core orchestration stays focused on lifecycle, sequencing, and policy.

- **CLI/tool slices**: `headroom/providers/claude`, `copilot`, `codex`, `openclaw`
- **Provider runtime slices**: `headroom/providers/claude`, `gemini`, plus shared backend/runtime dispatch in `headroom/providers/registry.py`
- **Core files stay orchestration-first**: `wrap.py`, `client.py`, `cli/proxy.py`, and `proxy/server.py` delegate provider-specific env shaping, API target normalization, backend selection, and transport dispatch.

## Install

```bash
pip install "headroom-ai[all]" # Python, everything
npm install headroom-ai # TypeScript / Node
docker pull ghcr.io/chopratejas/headroom:latest
```

Granular extras: `[proxy]`, `[mcp]`, `[ml]` (Kompress-base), `[agno]`, `[langchain]`, `[evals]`. Requires **Python 3.10+**.

Using `pipx`? Choose a supported interpreter explicitly:

```bash
pipx install --python python3.13 "headroom-ai[all]"
```

โ†’ [Installation guide](https://headroom-docs.vercel.app/docs/installation) โ€” Docker tags, persistent service, PowerShell, devcontainers.

## headroom learn


headroom learn in action

`headroom learn` โ€” mines failed sessions, writes corrections to `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`.

## Documentation

| Start here | Go deeper |
|-------------------------------------------------------------------------------|------------------------------------------------------------------------------------|
| [Quickstart](https://headroom-docs.vercel.app/docs/quickstart) | [Architecture](https://headroom-docs.vercel.app/docs/architecture) |
| [Proxy](https://headroom-docs.vercel.app/docs/proxy) | [How compression works](https://headroom-docs.vercel.app/docs/how-compression-works) |
| [MCP tools](https://headroom-docs.vercel.app/docs/mcp) | [CCR โ€” reversible compression](https://headroom-docs.vercel.app/docs/ccr) |
| [Memory](https://headroom-docs.vercel.app/docs/memory) | [Cache optimization](https://headroom-docs.vercel.app/docs/cache-optimization) |
| [Failure learning](https://headroom-docs.vercel.app/docs/failure-learning) | [Benchmarks](https://headroom-docs.vercel.app/docs/benchmarks) |
| [Configuration](https://headroom-docs.vercel.app/docs/configuration) | [Limitations](https://headroom-docs.vercel.app/docs/limitations) |

## Compared to

Headroom runs **locally**, covers **every** content type, works with every major framework, and is **reversible**.

| | Scope | Deploy | Local | Reversible |
|------------------------------------------------------------------------------|------------------------------------------------|------------------------------------|:-----:|:----------:|
| **Headroom** | All context โ€” tools, RAG, logs, files, history | Proxy ยท library ยท middleware ยท MCP | Yes | Yes |
| [RTK](https://github.com/rtk-ai/rtk) | CLI command outputs | CLI wrapper | Yes | No |
| [lean-ctx](https://github.com/yvgude/lean-ctx) | CLI commands, MCP tools, editor rules | CLI wrapper ยท MCP | Yes | No |
| [Compresr](https://compresr.ai), [Token Co.](https://thetokencompany.ai) | Text sent to their API | Hosted API call | No | No |
| OpenAI Compaction | Conversation history | Provider-native | No | No |

> **Attribution.** Headroom ships with the excellent [RTK](https://github.com/rtk-ai/rtk) binary for shell-output rewriting โ€” `git show --short`, scoped `ls`, summarized installers. Huge thanks to the RTK team; their tool is a first-class part of our stack, and Headroom compresses everything downstream of it. Headroom can also use [lean-ctx](https://github.com/yvgude/lean-ctx) as the selected CLI context tool; set `HEADROOM_CONTEXT_TOOL=lean-ctx` before running `headroom wrap ...`.

## Contributing

```bash
git clone https://github.com/chopratejas/headroom.git && cd headroom
pip install -e ".[dev]" && pytest
```

Devcontainers in `.devcontainer/` (default + `memory-stack` with Qdrant & Neo4j). See [CONTRIBUTING.md](CONTRIBUTING.md).

## Community

- **[Live leaderboard](https://headroomlabs.ai/dashboard)** โ€” 60B+ tokens saved and counting.
- **[Discord](https://discord.gg/yRmaUNpsPJ)** โ€” questions, feedback, war stories.
- **[Kompress-base on HuggingFace](https://huggingface.co/chopratejas/kompress-base)** โ€” the model behind our text compression.

## License

Apache 2.0 โ€” see [LICENSE](LICENSE).