
# Claw Compactor

### 14-Stage Fusion Pipeline for LLM Token Compression

![Claw Compactor Banner](assets/banner.png)

[![CI](https://github.com/open-compress/claw-compactor/actions/workflows/ci.yml/badge.svg)](https://github.com/open-compress/claw-compactor/actions)
[![codecov](https://codecov.io/gh/open-compress/claw-compactor/graph/badge.svg)](https://codecov.io/gh/open-compress/claw-compactor)
[![Python](https://img.shields.io/badge/python-3.9%2B-blue)](https://python.org)
[![License](https://img.shields.io/badge/license-MIT-purple)](LICENSE)
[![PyPI](https://img.shields.io/pypi/v/claw-compactor?color=blue&label=PyPI)](https://pypi.org/project/claw-compactor/)
[![Downloads](https://img.shields.io/pypi/dm/claw-compactor?color=green&label=downloads)](https://pypi.org/project/claw-compactor/)
[![Stars](https://img.shields.io/github/stars/open-compress/claw-compactor?style=social)](https://github.com/open-compress/claw-compactor)

**15–82% compression depending on content · Zero LLM inference cost · Reversible · 1,600+ tests**

[Documentation](https://open-compress.github.io/claw-compactor) · [Architecture](ARCHITECTURE.md) · [Benchmarks](#benchmarks) · [Quick Start](#quick-start) · [API](#api)

---

## What is Claw Compactor?

Claw Compactor is an open-source **LLM token compression engine** built around a 14-stage **Fusion Pipeline**. Each stage is a specialized compressor (from AST-aware code analysis to JSON statistical sampling to simhash-based deduplication), chained through an immutable data-flow architecture in which each stage's output feeds the next.

### Demo

```
$ claw-compactor benchmark ./my-workspace

Claw Compactor v7.0 - Fusion Pipeline Benchmark
───────────────────────────────────────────────

Scanning workspace... 47 files, 234,891 tokens

Stage Results:
┌───────────────┬─────────┬───────────┬──────┐
│ Stage         │ Applied │ Reduction │ Time │
├───────────────┼─────────┼───────────┼──────┤
│ Cortex        │ 47/47   │ -         │ 12ms │
│ Photon        │ 3/47    │ 2.1%      │ 4ms  │
│ RLE           │ 41/47   │ 8.3%      │ 6ms  │
│ SemanticDedup │ 47/47   │ 12.7%     │ 18ms │
│ Ionizer       │ 8/47    │ 71.2%     │ 9ms  │
│ Neurosyntax   │ 23/47   │ 18.4%     │ 31ms │
│ TokenOpt      │ 47/47   │ 4.1%      │ 3ms  │
│ Abbrev        │ 12/47   │ 6.8%      │ 5ms  │
└───────────────┴─────────┴───────────┴──────┘

Summary:
  Before: 234,891 tokens ($2.35 at GPT-4 rates)
  After:  108,250 tokens ($1.08)
  Saved:  126,641 tokens (53.9%) - $1.27/run
  Time:   88ms total

Estimated monthly savings at 100 runs/day: $3,810
```

---

## How It Compares

| Feature | Claw Compactor | LLMLingua-2 | SelectiveContext | gzip + base64 |
|:--------|:-:|:-:|:-:|:-:|
| Compression rate | 15โ€“82% | 30โ€“70% | 10โ€“40% | 60โ€“80% |
| ROUGE-L @ 0.3 | **0.653** | 0.346 | ~0.4 | N/A |
| ROUGE-L @ 0.5 | **0.723** | 0.570 | ~0.6 | N/A |
| LLM inference cost | **$0** | ~$0.02/call | **$0** | **$0** |
| Latency | **<50ms** | ~300ms | ~200ms | <10ms |
| Reversible | **Yes** | No | No | Yes (manual) |
| Content-aware routing | **14 stages** | 1 (perplexity) | 1 (self-info) | None |
| AST-aware code handling | **Yes** (tree-sitter) | No | No | No |
| JSON schema sampling | **Yes** | No | No | No |
| Log/diff/search stages | **Yes** | No | No | No |
| Required dependencies | **0** | torch, transformers | torch | zlib |
| LLM-readable output | **Yes** | Partial | Partial | **No** |

**Why Claw Compactor wins:** LLMLingua-2 drops tokens by perplexity score, which works well for natural language but destroys code identifiers, JSON keys, and log patterns. Claw Compactor instead uses content-type-aware stages that understand the structure of what they're compressing.
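To make "content-type-aware" concrete, here is a minimal sketch of the kind of detection that routing relies on. The function name and heuristics below are illustrative assumptions, not claw-compactor's actual Cortex implementation, which detects 16 languages and more content types:

```python
import json
import re

def detect_content_type(text: str) -> str:
    """Toy content-type detector: JSON, diff, log, code, or plain text."""
    stripped = text.strip()
    # Anything that parses as JSON is routed to JSON-specific stages.
    try:
        json.loads(stripped)
        return "json"
    except (ValueError, TypeError):
        pass
    # Unified-diff headers are unambiguous.
    if stripped.startswith(("diff --git", "--- ", "+++ ")):
        return "diff"
    # A leading timestamp suggests a log line.
    if re.match(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}", stripped):
        return "log"
    # def/class/import at the start of any line suggests source code.
    if re.search(r"^(def |class |import |from )", stripped, re.M):
        return "code"
    return "text"
```

A compressor that knows it is looking at JSON can sample arrays by schema; one that knows it is looking at code can protect identifiers, which is exactly what perplexity-based token dropping cannot do.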

---

```
Input
  |
  v
┌──────────────────────────────────────────────────────────────────────┐
│                           FUSION PIPELINE                            │
│                                                                      │
│  QuantumLock ─> Cortex ─> Photon ─> RLE ─> SemanticDedup ─> Ionizer  │
│       |           |         |        |           |             |     │
│    KV-cache  auto-detect  base64   path       simhash        JSON    │
│   alignment 16 languages  strip   shorten      dedup       sampling  │
│                                                                      │
│  ─> LogCrunch ─> SearchCrunch ─> DiffCrunch ─> StructuralCollapse    │
│         |             |              |                 |             │
│    log folding  result dedup   context fold      import merge        │
│                                                                      │
│  ─> Neurosyntax ─> Nexus ─> TokenOpt ─> Abbrev ─────────> Output     │
│          |           |         |          |                          │
│    AST compress  ML token   format    NL shorten                     │
│   (tree-sitter)  classify  optimize  (text only)                     │
│                                                                      │
│  [ RewindStore ] ── hash-addressed LRU for reversible retrieval      │
└──────────────────────────────────────────────────────────────────────┘
```

Key design principles:

- **Immutable data flow**: `FusionContext` is a frozen dataclass. Every stage produces a new `FusionResult`; nothing is mutated in place.
- **Gate-before-compress**: each stage has a `should_apply()` that inspects content type, language, and role before doing any work. Stages that don't apply are skipped at zero cost.
- **Content-aware routing**: Cortex auto-detects content type (code, JSON, logs, diffs, search results) and language (Python, Go, Rust, TypeScript, etc.); downstream stages then make type-aware compression decisions.
- **Reversible compression**: Ionizer stores originals in a hash-addressed `RewindStore`. The LLM can call a tool to retrieve any compressed section by its marker ID.
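The first two principles can be sketched in a few lines. This is a toy stage under assumed names, not the real API; the actual `FusionContext` and stage classes carry more fields:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FusionContext:
    # Minimal illustrative shape of an immutable pipeline context.
    content: str
    content_type: str = "text"

class WhitespaceStage:
    """Toy stage: collapse whitespace runs. Real stages also report tokens."""
    name = "whitespace"

    def should_apply(self, ctx: FusionContext) -> bool:
        # Gate-before-compress: non-text content is skipped at zero cost.
        return ctx.content_type == "text"

    def apply(self, ctx: FusionContext) -> FusionContext:
        # Never mutate: dataclasses.replace produces a new frozen context.
        return replace(ctx, content=" ".join(ctx.content.split()))

def run_pipeline(ctx: FusionContext, stages) -> FusionContext:
    for stage in stages:
        if stage.should_apply(ctx):
            ctx = stage.apply(ctx)  # each stage's output feeds the next
    return ctx
```

Because every stage returns a fresh frozen object, a failed or skipped stage can never corrupt upstream state.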

---

## Benchmarks

### Real-World Compression (FusionEngine v7 vs Legacy Regex)

| Content Type | Legacy | FusionEngine | Improvement |
|:-------------|-------:|-------------:|:-----------:|
| Python source | 7.3% | **25.0%** | 3.4x |
| JSON (100 items) | 12.6% | **81.9%** | 6.5x |
| Build logs | 5.5% | **24.1%** | 4.4x |
| Agent conversation | 5.7% | **31.0%** | 5.4x |
| Git diff | 6.2% | **15.0%** | 2.4x |
| Search results | 5.3% | **40.7%** | 7.7x |
| **Weighted average** | **9.2%** | **36.3%** | **3.9x** |

### SWE-bench Real Tasks

Tested on real SWE-bench instances with actual repository code:

| Instance | Size | Compression |
|:---------|-----:|------------:|
| django__django-11620 | 4.5K | **14.5%** |
| sympy__sympy-14396 | 5.5K | **19.1%** |
| scikit-learn-25747 | 11.8K | **15.9%** |
| scikit-learn-13554 | 73K | **11.8%** |
| scikit-learn-25308 | 81K | **14.4%** |

### vs LLMLingua-2 (ROUGE-L Fidelity)

| Compression Rate | Claw Compactor | LLMLingua-2 | Delta |
|:-----------------|---------------:|------------:|------:|
| 0.3 (aggressive) | **0.653** | 0.346 | +88.2% |
| 0.5 (balanced) | **0.723** | 0.570 | +26.8% |

Claw Compactor preserves more semantic content at the same compression ratio, with zero LLM inference cost.
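ROUGE-L measures how much of the reference survives compression via the longest common subsequence (LCS). A simplified scorer over whitespace tokens (published scores typically use more elaborate tokenization and per-sentence LCS, so this is only a sketch of the metric) looks like:

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over whitespace tokens (simplified)."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the compressed text preserves the reference token sequence exactly; lower scores mean tokens were dropped or reordered.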

---

## Quick Start

### Install from PyPI

```bash
pip install claw-compactor
```

### Or clone from source

```bash
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor
pip install -e .
```

### Run

```bash
# Benchmark your workspace (non-destructive)
claw-compactor benchmark /path/to/workspace

# Full compression pipeline
claw-compactor compress /path/to/workspace
```

**Requirements:** Python 3.9+. Optional: `pip install claw-compactor[accurate]` for exact token counts via tiktoken.

---

## API

### FusionEngine โ€” Single Text

```python
from scripts.lib.fusion.engine import FusionEngine

engine = FusionEngine()

result = engine.compress(
    text="def hello():\n    # greeting function\n    print('hello')",
    content_type="code",  # or let Cortex auto-detect
    language="python",    # optional hint
)

print(result["compressed"])  # compressed output
print(result["stats"])       # per-stage timing + token counts
print(result["markers"])     # Rewind markers for reversibility
```

### FusionEngine โ€” Chat Messages

```python
messages = [
    {"role": "system", "content": "You are a coding assistant..."},
    {"role": "user", "content": "Fix the auth bug in login.py"},
    {"role": "assistant", "content": "I found the issue. Here's the fix:\n```python\n..."},
    {"role": "tool", "content": '{"results": [{"file": "login.py", ...}, ...]}'},
]

result = engine.compress_messages(messages)

# Cross-message dedup runs first, then per-message pipeline
print(result["stats"]["reduction_pct"]) # aggregate compression %
print(result["per_message"]) # per-message breakdown
```

### Rewind โ€” Reversible Retrieval

```python
engine = FusionEngine(enable_rewind=True)
result = engine.compress(large_json, content_type="json")

# LLM sees compressed output with markers like [rewind:abc123...]
# When the LLM needs the original, it calls the Rewind tool:
original = engine.rewind_store.retrieve("abc123def456...")
```

### Custom Stage

```python
from scripts.lib.fusion.base import FusionStage, FusionContext, FusionResult

class MyStage(FusionStage):
    name = "my_compressor"
    order = 22  # runs between StructuralCollapse (20) and Neurosyntax (25)

    def should_apply(self, ctx: FusionContext) -> bool:
        return ctx.content_type == "log"

    def apply(self, ctx: FusionContext) -> FusionResult:
        compressed = my_compression_logic(ctx.content)
        return FusionResult(
            content=compressed,
            original_tokens=estimate_tokens(ctx.content),
            compressed_tokens=estimate_tokens(compressed),
        )

# Add to pipeline
pipeline = engine.pipeline.add(MyStage())
```

---

## The 14 Stages

| # | Stage | Order | Purpose | Applies To |
|:-:|:------|:-----:|:--------|:-----------|
| 1 | **QuantumLock** | 3 | Isolates dynamic content in system prompts to maximize KV-cache hit rate | system messages |
| 2 | **Cortex** | 5 | Auto-detects content type and programming language (16 languages) | untyped content |
| 3 | **Photon** | 8 | Detects and compresses base64-encoded images | all |
| 4 | **RLE** | 10 | Path shorthand (`$WS`), IP prefix compression, enum compaction | all |
| 5 | **SemanticDedup** | 12 | SimHash fingerprint deduplication across content blocks | all |
| 6 | **Ionizer** | 15 | JSON array statistical sampling with schema discovery + error preservation | json |
| 7 | **LogCrunch** | 16 | Folds repeated log lines with occurrence counts | log |
| 8 | **SearchCrunch** | 17 | Deduplicates search/grep results | search |
| 9 | **DiffCrunch** | 18 | Folds unchanged context lines in git diffs | diff |
| 10 | **StructuralCollapse** | 20 | Merges import blocks, collapses repeated assertions/patterns | code |
| 11 | **Neurosyntax** | 25 | AST-aware code compression via tree-sitter (safe regex fallback). Never shortens identifiers. | code |
| 12 | **Nexus** | 35 | ML token-level compression (stopword removal fallback without model) | text |
| 13 | **TokenOpt** | 40 | Tokenizer format optimization: strips bold/italic markers, normalizes whitespace | all |
| 14 | **Abbrev** | 45 | Natural language abbreviation. Only fires on text โ€” never touches code, JSON, or structured data. | text |

Each stage is independent and stateless. Stages communicate only through the immutable `FusionContext` that flows forward through the pipeline.
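To make the SimHash idea behind SemanticDedup concrete, here is a minimal sketch. The tokenization, hash choice, and distance threshold are illustrative assumptions, not the stage's actual parameters:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """64-bit SimHash fingerprint over whitespace tokens."""
    vector = [0] * bits
    for token in text.split():
        # Hash each token to 64 bits; vote +1/-1 per bit position.
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            vector[i] += 1 if (h >> i) & 1 else -1
    # Majority vote per bit yields the fingerprint.
    return sum(1 << i for i in range(bits) if vector[i] > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    # Near-identical blocks land within a few bits of each other,
    # so dedup is a cheap Hamming-distance check, not a full diff.
    return hamming(simhash(a), simhash(b)) <= max_distance
```

Unlike exact-hash dedup, this catches blocks that differ by a few tokens (timestamps, counters) while still being O(1) per comparison.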

---

## Workspace Commands

```bash
python3 scripts/mem_compress.py [options]
```

| Command | Description |
|:--------|:-----------|
| `full` | Run complete compression pipeline |
| `benchmark` | Dry-run compression report |
| `compress` | Rule-based compression only |
| `dict` | Dictionary encoding with auto-learned codebook |
| `observe` | Session transcript JSONL to structured observations |
| `tiers` | Generate L0/L1/L2 tiered summaries |
| `dedup` | Cross-file duplicate detection |
| `estimate` | Token count report |
| `audit` | Workspace health check |
| `optimize` | Tokenizer-level format optimization |
| `auto` | Watch mode โ€” compress on file changes |

Options: `--json`, `--dry-run`, `--since YYYY-MM-DD`, `--quiet`

---

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for the full technical deep-dive:
- Immutable data flow design
- Stage execution model and gating
- Rewind reversible compression protocol
- Cross-message semantic deduplication
- How to extend the pipeline

```
12,000+ lines Python · 1,600+ tests · 14 fusion stages · 0 external ML dependencies
```

---

## Installation

```bash
# Clone
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor

# Optional: exact token counting
pip install tiktoken

# Optional: AST-aware code compression (Neurosyntax)
pip install tree-sitter-language-pack

# Development
pip install -e ".[dev,accurate]"
```

**Zero required dependencies.** tiktoken and tree-sitter are optional enhancements; the pipeline runs with built-in heuristic fallbacks for both.
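The optional-dependency pattern can be sketched as follows. This is a hypothetical helper, and the ~4-characters-per-token divisor is a common rule of thumb, not necessarily the fallback claw-compactor itself uses:

```python
def estimate_tokens(text: str) -> int:
    """Exact count via tiktoken when installed, heuristic otherwise."""
    try:
        import tiktoken  # optional dependency: pip install claw-compactor[accurate]
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        # Rough heuristic: English text averages about 4 characters per token.
        return max(1, len(text) // 4)
```

Importing inside the function keeps the cold path free of the dependency, so the same code runs in minimal and fully equipped environments.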

---

## Who Uses This

| Project | How |
|:--------|:----|
| [OpenClaw](https://openclaw.ai) | Built-in skill for all OpenClaw AI agents; compresses workspace context before every LLM call |
| [OpenCompress](https://opencompress.ai) | Production compression engine powering the OpenCompress API |

Using Claw Compactor? [Open a PR](https://github.com/open-compress/claw-compactor/pulls) to add yourself here.

---

## Project Stats

| Metric | Value |
|:-------|:------|
| Tests | 1,600+ passed |
| Python source | 12,000+ lines |
| Fusion stages | 14 |
| Languages detected | 16 |
| Required dependencies | 0 |
| Compression (code) | 15โ€“25% |
| Compression (JSON peak) | 81.9% |
| ROUGE-L @ 0.3 rate | 0.653 |
| License | MIT |

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on:
- Setting up the development environment
- Adding new Fusion stages
- Running the test suite
- Submitting PRs

---

## Related

- [OpenClaw](https://openclaw.ai) – AI agent platform
- [ClawhubAI](https://clawhub.com) – Agent skills marketplace
- [OpenClaw Discord](https://discord.com/invite/clawd) – Community
- [OpenClaw Docs](https://docs.openclaw.ai) – Documentation
- [Full Documentation](https://open-compress.github.io/claw-compactor) – GitHub Pages docs

---

`token-compression` `llm-tools` `fusion-pipeline` `reversible-compression` `ast-code-analysis` `context-compression` `ai-agent` `openclaw` `python` `developer-tools`

## License

[MIT](LICENSE)