https://github.com/open-compress/claw-compactor
LLM Token Compression & Reduction Tool – cut AI agent token costs by up to 97%. 6-layer deterministic context compression for AI agent workspaces. No LLM required. Prompt compression, context-window optimization, and cost reduction for any LLM pipeline.
- Host: GitHub
- URL: https://github.com/open-compress/claw-compactor
- Owner: open-compress
- License: mit
- Created: 2026-02-10T00:27:08.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-03-18T06:58:37.000Z (18 days ago)
- Last Synced: 2026-03-18T07:48:40.744Z (18 days ago)
- Topics: ai-agent-tools, ai-cost-saving, ai-infrastructure, claw-compactor, context-compression, context-pruning, context-window-optimization, developer-tools, llm-compression, llm-context-compression, llm-cost-reduction, llm-token-compression, llm-tools, openclaw, prompt-compression, python-tools, token-compression, token-optimization, token-reduction, token-saving
- Language: Python
- Homepage: https://www.opencompress.ai/
- Size: 1.49 MB
- Stars: 1,616
- Watchers: 94
- Forks: 138
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
- Security: SECURITY.md
# Claw Compactor
### 14-Stage Fusion Pipeline for LLM Token Compression

**15–82% compression depending on content · Zero LLM inference cost · Reversible · 1,600+ tests**

[Documentation](https://open-compress.github.io/claw-compactor) · [Architecture](ARCHITECTURE.md) · [Benchmarks](#benchmarks) · [Quick Start](#quick-start) · [API](#api)
---
## What is Claw Compactor?
Claw Compactor is an open-source **LLM token compression engine** built around a 14-stage **Fusion Pipeline**. Each stage is a specialized compressor – from AST-aware code analysis to JSON statistical sampling to SimHash-based deduplication – chained through an immutable data-flow architecture in which each stage's output feeds the next.
### Demo
```
$ claw-compactor benchmark ./my-workspace
Claw Compactor v7.0 – Fusion Pipeline Benchmark
────────────────────────────────────────────────
Scanning workspace... 47 files, 234,891 tokens

Stage Results:
┌───────────────────┬──────────┬───────────┬──────────┐
│ Stage             │ Applied  │ Reduction │ Time     │
├───────────────────┼──────────┼───────────┼──────────┤
│ Cortex            │ 47/47    │ –         │ 12ms     │
│ Photon            │ 3/47     │ 2.1%      │ 4ms      │
│ RLE               │ 41/47    │ 8.3%      │ 6ms      │
│ SemanticDedup     │ 47/47    │ 12.7%     │ 18ms     │
│ Ionizer           │ 8/47     │ 71.2%     │ 9ms      │
│ Neurosyntax       │ 23/47    │ 18.4%     │ 31ms     │
│ TokenOpt          │ 47/47    │ 4.1%      │ 3ms      │
│ Abbrev            │ 12/47    │ 6.8%      │ 5ms      │
└───────────────────┴──────────┴───────────┴──────────┘

Summary:
  Before: 234,891 tokens ($2.35 at GPT-4 rates)
  After:  108,250 tokens ($1.08)
  Saved:  126,641 tokens (53.9%) – $1.27/run
  Time:   88ms total

Estimated monthly savings at 100 runs/day: $3,810
```
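The summary arithmetic can be reproduced directly; here is a quick sanity check of the demo's numbers, using the per-token rate implied by the demo output itself (roughly $10 per 1M tokens):

```python
# Rate implied by the demo output: $2.35 for 234,891 tokens (~$10 per 1M tokens)
rate_per_token = 2.35 / 234_891

before, after = 234_891, 108_250
saved = before - after                # tokens removed per run -> 126,641
per_run = saved * rate_per_token      # dollars saved per run  -> ~$1.27
monthly = per_run * 100 * 30          # at 100 runs/day        -> ~$3,800/month
```

The ~$3,800 figure matches the demo's "$3,810", which rounds the per-run saving to $1.27 before multiplying.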
---
## How It Compares
| Feature | Claw Compactor | LLMLingua-2 | SelectiveContext | gzip + base64 |
|:--------|:-:|:-:|:-:|:-:|
| Compression rate | 15–82% | 30–70% | 10–40% | 60–80% |
| ROUGE-L @ 0.3 | **0.653** | 0.346 | ~0.4 | N/A |
| ROUGE-L @ 0.5 | **0.723** | 0.570 | ~0.6 | N/A |
| LLM inference cost | **$0** | ~$0.02/call | **$0** | **$0** |
| Latency | **<50ms** | ~300ms | ~200ms | <10ms |
| Reversible | **Yes** | No | No | Yes (manual) |
| Content-aware routing | **14 stages** | 1 (perplexity) | 1 (self-info) | None |
| AST-aware code handling | **Yes** (tree-sitter) | No | No | No |
| JSON schema sampling | **Yes** | No | No | No |
| Log/diff/search stages | **Yes** | No | No | No |
| Required dependencies | **0** | torch, transformers | torch | zlib |
| LLM-readable output | **Yes** | Partial | Partial | **No** |
**Why Claw Compactor wins:** LLMLingua-2 drops tokens by perplexity score – effective for natural language, but it destroys code identifiers, JSON keys, and log patterns. Claw Compactor uses content-type-aware stages that understand the structure of what they're compressing.
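Content-type routing of this kind can be illustrated with a few heuristics. The sketch below is mine, not the project's actual Cortex detector (which handles 16 languages and far more signals); it only shows the routing idea:

```python
import json
import re

def detect_content_type(text: str) -> str:
    """Rough content-type sniffing in the spirit of a Cortex-style router."""
    stripped = text.strip()
    # JSON: starts like an object/array and actually parses
    if stripped[:1] in "{[":
        try:
            json.loads(stripped)
            return "json"
        except ValueError:
            pass
    # Git diff: unified-diff headers or hunk markers at line start
    if re.search(r"^(diff --git|@@ -\d)", stripped, re.MULTILINE):
        return "diff"
    # Code: common definition/import keywords at line start
    if re.search(r"^\s*(def |class |import |fn |func )", stripped, re.MULTILINE):
        return "code"
    # Logs: conventional severity levels
    if re.search(r"\b(ERROR|WARN|INFO|DEBUG)\b", stripped):
        return "log"
    return "text"
```

Downstream stages would then gate on the detected type, so e.g. Ionizer only ever sees `"json"` input.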
---
```
Input
  |
  v
┌────────────────────────────────────────────────────────────────────────┐
│                            FUSION PIPELINE                             │
│                                                                        │
│  QuantumLock ─> Cortex ─> Photon ─> RLE ─> SemanticDedup ─> Ionizer    │
│      |            |          |       |          |              |       │
│  KV-cache    auto-detect   base64   path     simhash        JSON       │
│  alignment   16 languages  strip    shorten  dedup          sampling   │
│                                                                        │
│  ─> LogCrunch ─> SearchCrunch ─> DiffCrunch ─> StructuralCollapse      │
│         |             |              |                 |               │
│   log folding    result dedup   context fold      import merge         │
│                                                                        │
│  ─> Neurosyntax ─> Nexus ─> TokenOpt ─> Abbrev ─────────> Output       │
│         |            |          |          |                           │
│   AST compress    ML token    format    NL shorten                     │
│   (tree-sitter)   classify    optimize  (text only)                    │
│                                                                        │
│  [ RewindStore ] <── hash-addressed LRU for reversible retrieval       │
└────────────────────────────────────────────────────────────────────────┘
```
Key design principles:
- **Immutable data flow** – `FusionContext` is a frozen dataclass. Every stage produces a new `FusionResult`; nothing is mutated in place.
- **Gate-before-compress** – each stage has a `should_apply()` that inspects content type, language, and role before doing any work. Stages that don't apply are skipped at zero cost.
- **Content-aware routing** – Cortex auto-detects content type (code, JSON, logs, diffs, search results) and language (Python, Go, Rust, TypeScript, etc.), then downstream stages make type-aware compression decisions.
- **Reversible compression** – Ionizer stores originals in a hash-addressed `RewindStore`. The LLM can call a tool to retrieve any compressed section by its marker ID.
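The first two principles fit in a few lines. This is a deliberately minimal sketch with made-up field names; the real `FusionContext`/`FusionResult` carry more state:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Ctx:
    """Stand-in for FusionContext: frozen, so stages cannot mutate it."""
    content: str
    content_type: str = "text"

class Stage:
    def should_apply(self, ctx: Ctx) -> bool:  # gate-before-compress
        return True
    def apply(self, ctx: Ctx) -> Ctx:
        return ctx

class StripBlankLines(Stage):
    def should_apply(self, ctx: Ctx) -> bool:
        return ctx.content_type == "text"      # only fires on plain text
    def apply(self, ctx: Ctx) -> Ctx:
        kept = [line for line in ctx.content.splitlines() if line.strip()]
        # dataclasses.replace returns a NEW frozen instance; nothing mutates
        return replace(ctx, content="\n".join(kept))

def run_pipeline(stages, ctx):
    for stage in stages:
        if stage.should_apply(ctx):  # skipped stages cost nothing
            ctx = stage.apply(ctx)
    return ctx
```

Because every `apply()` returns a fresh object, any stage's input is always the unmodified output of the stage before it, which is what makes the per-stage stats in the benchmark table attributable.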
---
## Benchmarks
### Real-World Compression (FusionEngine v7 vs Legacy Regex)
| Content Type | Legacy | FusionEngine | Improvement |
|:-------------|-------:|-------------:|:-----------:|
| Python source | 7.3% | **25.0%** | 3.4x |
| JSON (100 items) | 12.6% | **81.9%** | 6.5x |
| Build logs | 5.5% | **24.1%** | 4.4x |
| Agent conversation | 5.7% | **31.0%** | 5.4x |
| Git diff | 6.2% | **15.0%** | 2.4x |
| Search results | 5.3% | **40.7%** | 7.7x |
| **Weighted average** | **9.2%** | **36.3%** | **3.9x** |
### SWE-bench Real Tasks
Tested on real SWE-bench instances with actual repository code:
| Instance | Size | Compression |
|:---------|-----:|------------:|
| django__django-11620 | 4.5K | **14.5%** |
| sympy__sympy-14396 | 5.5K | **19.1%** |
| scikit-learn-25747 | 11.8K | **15.9%** |
| scikit-learn-13554 | 73K | **11.8%** |
| scikit-learn-25308 | 81K | **14.4%** |
### vs LLMLingua-2 (ROUGE-L Fidelity)
| Compression Rate | Claw Compactor | LLMLingua-2 | Delta |
|:-----------------|---------------:|------------:|------:|
| 0.3 (aggressive) | **0.653** | 0.346 | +88.2% |
| 0.5 (balanced) | **0.723** | 0.570 | +26.8% |
Claw Compactor preserves more semantic content at the same compression ratio, with zero LLM inference cost.
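ROUGE-L scores similarity by longest common subsequence (LCS) of tokens. For reference, here is a minimal token-level ROUGE-L F-measure; the project's exact evaluation setup (tokenizer, stemming, F-beta) may differ:

```python
def lcs_len(a, b):
    """Length of the longest common subsequence, O(len(b)) space DP."""
    dp = [0] * (len(b) + 1)
    for x in a:
        prev = 0  # dp[i-1][j-1]
        for j, y in enumerate(b, 1):
            cur = dp[j]  # dp[i-1][j] before overwrite
            dp[j] = prev + 1 if x == y else max(dp[j], dp[j - 1])
            prev = cur
    return dp[-1]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """Whitespace-tokenized ROUGE-L F1 between reference and candidate."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A higher score at the same compression rate means the compressed text still contains more of the reference's content in order, which is what the table above is measuring.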
---
## Quick Start
### Install from PyPI
```bash
pip install claw-compactor
```
### Or clone from source
```bash
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor
pip install -e .
```
### Run
```bash
# Benchmark your workspace (non-destructive)
claw-compactor benchmark /path/to/workspace
# Full compression pipeline
claw-compactor compress /path/to/workspace
```
**Requirements:** Python 3.9+. Optional: `pip install claw-compactor[accurate]` for exact token counts via tiktoken.
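Without tiktoken, token counts are heuristic estimates. A sketch of that kind of fallback (the package's actual heuristic may differ; ~4 characters per token is a common rule of thumb for English):

```python
def estimate_tokens(text: str) -> int:
    """Exact count via tiktoken if installed, else a chars/4 heuristic."""
    try:
        import tiktoken  # optional dependency: pip install claw-compactor[accurate]
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        return max(1, len(text) // 4)  # rough English-text approximation
```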
---
## API
### FusionEngine – Single Text
```python
from scripts.lib.fusion.engine import FusionEngine

engine = FusionEngine()
result = engine.compress(
    text="def hello():\n    # greeting function\n    print('hello')",
    content_type="code",  # or let Cortex auto-detect
    language="python",    # optional hint
)
print(result["compressed"]) # compressed output
print(result["stats"]) # per-stage timing + token counts
print(result["markers"]) # Rewind markers for reversibility
```
### FusionEngine – Chat Messages
```python
messages = [
    {"role": "system", "content": "You are a coding assistant..."},
    {"role": "user", "content": "Fix the auth bug in login.py"},
    {"role": "assistant", "content": "I found the issue. Here's the fix:\n```python\n..."},
    {"role": "tool", "content": '{"results": [{"file": "login.py", ...}, ...]}'},
]
result = engine.compress_messages(messages)
# Cross-message dedup runs first, then per-message pipeline
print(result["stats"]["reduction_pct"]) # aggregate compression %
print(result["per_message"]) # per-message breakdown
```
### Rewind – Reversible Retrieval
```python
engine = FusionEngine(enable_rewind=True)
result = engine.compress(large_json, content_type="json")
# LLM sees compressed output with markers like [rewind:abc123...]
# When the LLM needs the original, it calls the Rewind tool:
original = engine.rewind_store.retrieve("abc123def456...")
```
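The README describes `RewindStore` only as a hash-addressed LRU; here is a minimal store in that spirit (class name, key length, and capacity are illustrative, not the project's internals):

```python
import hashlib
from collections import OrderedDict
from typing import Optional

class MiniRewindStore:
    """Hash-addressed LRU sketch: store originals, retrieve them by content hash."""

    def __init__(self, max_entries: int = 1024):
        self._store: "OrderedDict[str, str]" = OrderedDict()
        self._max = max_entries

    def put(self, original: str) -> str:
        # A short hash prefix keeps markers compact; collisions are unlikely at LRU scale
        key = hashlib.sha256(original.encode()).hexdigest()[:12]
        self._store[key] = original
        self._store.move_to_end(key)
        if len(self._store) > self._max:
            self._store.popitem(last=False)  # evict least-recently-used entry
        return key  # would be embedded in output as a [rewind:<key>] marker

    def retrieve(self, key: str) -> Optional[str]:
        if key not in self._store:
            return None  # evicted or unknown marker
        self._store.move_to_end(key)  # refresh recency
        return self._store[key]
```

Hash addressing means identical content always maps to the same marker, so re-compressing the same section never duplicates entries.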
### Custom Stage
```python
from scripts.lib.fusion.base import FusionStage, FusionContext, FusionResult

class MyStage(FusionStage):
    name = "my_compressor"
    order = 22  # runs between StructuralCollapse (20) and Neurosyntax (25)

    def should_apply(self, ctx: FusionContext) -> bool:
        return ctx.content_type == "log"

    def apply(self, ctx: FusionContext) -> FusionResult:
        compressed = my_compression_logic(ctx.content)
        return FusionResult(
            content=compressed,
            original_tokens=estimate_tokens(ctx.content),
            compressed_tokens=estimate_tokens(compressed),
        )

# Add to pipeline
pipeline = engine.pipeline.add(MyStage())
```
---
## The 14 Stages
| # | Stage | Order | Purpose | Applies To |
|:-:|:------|:-----:|:--------|:-----------|
| 1 | **QuantumLock** | 3 | Isolates dynamic content in system prompts to maximize KV-cache hit rate | system messages |
| 2 | **Cortex** | 5 | Auto-detects content type and programming language (16 languages) | untyped content |
| 3 | **Photon** | 8 | Detects and compresses base64-encoded images | all |
| 4 | **RLE** | 10 | Path shorthand (`$WS`), IP prefix compression, enum compaction | all |
| 5 | **SemanticDedup** | 12 | SimHash fingerprint deduplication across content blocks | all |
| 6 | **Ionizer** | 15 | JSON array statistical sampling with schema discovery + error preservation | json |
| 7 | **LogCrunch** | 16 | Folds repeated log lines with occurrence counts | log |
| 8 | **SearchCrunch** | 17 | Deduplicates search/grep results | search |
| 9 | **DiffCrunch** | 18 | Folds unchanged context lines in git diffs | diff |
| 10 | **StructuralCollapse** | 20 | Merges import blocks, collapses repeated assertions/patterns | code |
| 11 | **Neurosyntax** | 25 | AST-aware code compression via tree-sitter (safe regex fallback). Never shortens identifiers. | code |
| 12 | **Nexus** | 35 | ML token-level compression (stopword removal fallback without model) | text |
| 13 | **TokenOpt** | 40 | Tokenizer format optimization – strips bold/italic markers, normalizes whitespace | all |
| 14 | **Abbrev** | 45 | Natural language abbreviation. Only fires on text – never touches code, JSON, or structured data. | text |
Each stage is independent and stateless. Stages communicate only through the immutable `FusionContext` that flows forward through the pipeline.
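SemanticDedup's SimHash approach can be illustrated briefly. The word-level features, MD5 hashing, and 3-bit threshold below are my simplifications, not the project's exact fingerprinting:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """Weighted bit-vote fingerprint over word-level features."""
    votes = [0] * bits
    for word in text.lower().split():
        h = int.from_bytes(hashlib.md5(word.encode()).digest()[:8], "big")
        for i in range(bits):
            votes[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if votes[i] > 0)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

def near_duplicates(blocks, threshold=3):
    """Index pairs of blocks whose fingerprints differ in <= threshold bits."""
    fps = [simhash(b) for b in blocks]
    return [(i, j)
            for i in range(len(fps))
            for j in range(i + 1, len(fps))
            if hamming(fps[i], fps[j]) <= threshold]
```

Unlike exact hashing, similar blocks produce nearby fingerprints, so near-duplicate content can be found with cheap integer comparisons instead of pairwise text diffs.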
---
## Workspace Commands
```bash
python3 scripts/mem_compress.py [options]
```
| Command | Description |
|:--------|:-----------|
| `full` | Run complete compression pipeline |
| `benchmark` | Dry-run compression report |
| `compress` | Rule-based compression only |
| `dict` | Dictionary encoding with auto-learned codebook |
| `observe` | Convert session-transcript JSONL into structured observations |
| `tiers` | Generate L0/L1/L2 tiered summaries |
| `dedup` | Cross-file duplicate detection |
| `estimate` | Token count report |
| `audit` | Workspace health check |
| `optimize` | Tokenizer-level format optimization |
| `auto` | Watch mode – compress on file changes |
Options: `--json`, `--dry-run`, `--since YYYY-MM-DD`, `--quiet`
---
## Architecture
See [ARCHITECTURE.md](ARCHITECTURE.md) for the full technical deep-dive:
- Immutable data flow design
- Stage execution model and gating
- Rewind reversible compression protocol
- Cross-message semantic deduplication
- How to extend the pipeline
```
12,000+ lines of Python · 1,600+ tests · 14 fusion stages · 0 external ML dependencies
```
---
## Installation
```bash
# Clone
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor
# Optional: exact token counting
pip install tiktoken
# Optional: AST-aware code compression (Neurosyntax)
pip install tree-sitter-language-pack
# Development
pip install -e ".[dev,accurate]"
```
**Zero required dependencies.** tiktoken and tree-sitter are optional enhancements – the pipeline runs with built-in heuristic fallbacks for both.
---
## Who Uses This
| Project | How |
|:--------|:----|
| [OpenClaw](https://openclaw.ai) | Built-in skill for all OpenClaw AI agents – compresses workspace context before every LLM call |
| [OpenCompress](https://opencompress.ai) | Production compression engine powering the OpenCompress API |
Using Claw Compactor? [Open a PR](https://github.com/open-compress/claw-compactor/pulls) to add yourself here.
---
## Project Stats
| Metric | Value |
|:-------|:------|
| Tests | 1,600+ passed |
| Python source | 12,000+ lines |
| Fusion stages | 14 |
| Languages detected | 16 |
| Required dependencies | 0 |
| Compression (code) | 15โ25% |
| Compression (JSON peak) | 81.9% |
| ROUGE-L @ 0.3 rate | 0.653 |
| License | MIT |
---
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on:
- Setting up the development environment
- Adding new Fusion stages
- Running the test suite
- Submitting PRs
---
## Related
- [OpenClaw](https://openclaw.ai) – AI agent platform
- [ClawhubAI](https://clawhub.com) – Agent skills marketplace
- [OpenClaw Discord](https://discord.com/invite/clawd) – Community
- [OpenClaw Docs](https://docs.openclaw.ai) – Documentation
- [Full Documentation](https://open-compress.github.io/claw-compactor) – GitHub Pages docs
---
`token-compression` `llm-tools` `fusion-pipeline` `reversible-compression` `ast-code-analysis` `context-compression` `ai-agent` `openclaw` `python` `developer-tools`
## License
[MIT](LICENSE)