
# Claw Compactor

### 14-Stage Fusion Pipeline for LLM Token Compression

![Claw Compactor Banner](assets/banner.png)

[![CI](https://github.com/open-compress/claw-compactor/actions/workflows/ci.yml/badge.svg)](https://github.com/open-compress/claw-compactor/actions)
[![codecov](https://codecov.io/gh/open-compress/claw-compactor/graph/badge.svg)](https://codecov.io/gh/open-compress/claw-compactor)
[![Python](https://img.shields.io/badge/python-3.9%2B-blue)](https://python.org)
[![License](https://img.shields.io/badge/license-MIT-purple)](LICENSE)
[![PyPI](https://img.shields.io/pypi/v/claw-compactor?color=blue&label=PyPI)](https://pypi.org/project/claw-compactor/)
[![Downloads](https://img.shields.io/pypi/dm/claw-compactor?color=green&label=downloads)](https://pypi.org/project/claw-compactor/)
[![Stars](https://img.shields.io/github/stars/open-compress/claw-compactor?style=social)](https://github.com/open-compress/claw-compactor)

**15–82% compression depending on content · Zero LLM inference cost · Reversible · 1,600+ tests**

[Documentation](https://open-compress.github.io/claw-compactor) · [Architecture](ARCHITECTURE.md) · [Benchmarks](#benchmarks) · [Quick Start](#quick-start) · [API](#api)

---

## What is Claw Compactor?

Claw Compactor is an open-source **LLM token compression engine** built around a 14-stage **Fusion Pipeline**. Each stage is a specialized compressor (from AST-aware code analysis to JSON statistical sampling to simhash-based deduplication), chained through an immutable data-flow architecture in which each stage's output feeds the next.

### Demo

```
$ claw-compactor benchmark ./my-workspace

Claw Compactor v7.0 - Fusion Pipeline Benchmark
───────────────────────────────────────────────

Scanning workspace... 47 files, 234,891 tokens

Stage Results:
┌───────────────┬─────────┬───────────┬──────┐
│ Stage         │ Applied │ Reduction │ Time │
├───────────────┼─────────┼───────────┼──────┤
│ Cortex        │ 47/47   │ -         │ 12ms │
│ Photon        │ 3/47    │ 2.1%      │ 4ms  │
│ RLE           │ 41/47   │ 8.3%      │ 6ms  │
│ SemanticDedup │ 47/47   │ 12.7%     │ 18ms │
│ Ionizer       │ 8/47    │ 71.2%     │ 9ms  │
│ Neurosyntax   │ 23/47   │ 18.4%     │ 31ms │
│ TokenOpt      │ 47/47   │ 4.1%      │ 3ms  │
│ Abbrev        │ 12/47   │ 6.8%      │ 5ms  │
└───────────────┴─────────┴───────────┴──────┘

Summary:
  Before: 234,891 tokens ($2.35 at GPT-4 rates)
  After:  108,250 tokens ($1.08)
  Saved:  126,641 tokens (53.9%) - $1.27/run
  Time:   88ms total

Estimated monthly savings at 100 runs/day: $3,810
```

---

## How It Compares

| Feature | Claw Compactor | LLMLingua-2 | SelectiveContext | gzip + base64 |
|:--------|:-:|:-:|:-:|:-:|
| Compression rate | 15โ€“82% | 30โ€“70% | 10โ€“40% | 60โ€“80% |
| ROUGE-L @ 0.3 | **0.653** | 0.346 | ~0.4 | N/A |
| ROUGE-L @ 0.5 | **0.723** | 0.570 | ~0.6 | N/A |
| LLM inference cost | **$0** | ~$0.02/call | **$0** | **$0** |
| Latency | **<50ms** | ~300ms | ~200ms | <10ms |
| Reversible | **Yes** | No | No | Yes (manual) |
| Content-aware routing | **14 stages** | 1 (perplexity) | 1 (self-info) | None |
| AST-aware code handling | **Yes** (tree-sitter) | No | No | No |
| JSON schema sampling | **Yes** | No | No | No |
| Log/diff/search stages | **Yes** | No | No | No |
| Required dependencies | **0** | torch, transformers | torch | zlib |
| LLM-readable output | **Yes** | Partial | Partial | **No** |

**Why Claw Compactor wins:** LLMLingua-2 drops tokens by perplexity score, which works well for natural language but destroys code identifiers, JSON keys, and log patterns. Claw Compactor instead uses content-type-aware stages that understand the structure of what they're compressing.
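To make "content-type-aware" concrete, here is a minimal sketch of the kind of detection that routing relies on. The function name and heuristics below are illustrative assumptions, not claw-compactor's actual Cortex implementation, which detects 16 languages and more content types:

```python
import json
import re

def detect_content_type(text: str) -> str:
    """Toy content-type detector: JSON, diff, log, code, or plain text."""
    stripped = text.strip()
    # Anything that parses as JSON is routed to JSON-specific stages.
    try:
        json.loads(stripped)
        return "json"
    except (ValueError, TypeError):
        pass
    # Unified-diff headers are unambiguous.
    if stripped.startswith(("diff --git", "--- ", "+++ ")):
        return "diff"
    # A leading timestamp suggests a log line.
    if re.match(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}", stripped):
        return "log"
    # def/class/import at the start of any line suggests source code.
    if re.search(r"^(def |class |import |from )", stripped, re.M):
        return "code"
    return "text"
```

A compressor that knows it is looking at JSON can sample arrays by schema; one that knows it is looking at code can protect identifiers, which is exactly what perplexity-based token dropping cannot do.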

---

```
Input
  |
  v
┌──────────────────────────────────────────────────────────────────────┐
│                           FUSION PIPELINE                            │
│                                                                      │
│  QuantumLock ─> Cortex ─> Photon ─> RLE ─> SemanticDedup ─> Ionizer  │
│       |           |         |        |           |             |     │
│    KV-cache  auto-detect  base64   path       simhash        JSON    │
│   alignment 16 languages  strip   shorten      dedup       sampling  │
│                                                                      │
│  ─> LogCrunch ─> SearchCrunch ─> DiffCrunch ─> StructuralCollapse    │
│         |             |              |                 |             │
│    log folding  result dedup   context fold      import merge        │
│                                                                      │
│  ─> Neurosyntax ─> Nexus ─> TokenOpt ─> Abbrev ─────────> Output     │
│          |           |         |          |                          │
│    AST compress  ML token   format    NL shorten                     │
│   (tree-sitter)  classify  optimize  (text only)                     │
│                                                                      │
│  [ RewindStore ] ── hash-addressed LRU for reversible retrieval      │
└──────────────────────────────────────────────────────────────────────┘
```

Key design principles:

- **Immutable data flow**: `FusionContext` is a frozen dataclass. Every stage produces a new `FusionResult`; nothing is mutated in place.
- **Gate-before-compress**: each stage has a `should_apply()` that inspects content type, language, and role before doing any work. Stages that don't apply are skipped at zero cost.
- **Content-aware routing**: Cortex auto-detects content type (code, JSON, logs, diffs, search results) and language (Python, Go, Rust, TypeScript, etc.); downstream stages then make type-aware compression decisions.
- **Reversible compression**: Ionizer stores originals in a hash-addressed `RewindStore`. The LLM can call a tool to retrieve any compressed section by its marker ID.
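The first two principles can be sketched in a few lines. This is a toy stage under assumed names, not the real API; the actual `FusionContext` and stage classes carry more fields:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class FusionContext:
    # Minimal illustrative shape of an immutable pipeline context.
    content: str
    content_type: str = "text"

class WhitespaceStage:
    """Toy stage: collapse whitespace runs. Real stages also report tokens."""
    name = "whitespace"

    def should_apply(self, ctx: FusionContext) -> bool:
        # Gate-before-compress: non-text content is skipped at zero cost.
        return ctx.content_type == "text"

    def apply(self, ctx: FusionContext) -> FusionContext:
        # Never mutate: dataclasses.replace produces a new frozen context.
        return replace(ctx, content=" ".join(ctx.content.split()))

def run_pipeline(ctx: FusionContext, stages) -> FusionContext:
    for stage in stages:
        if stage.should_apply(ctx):
            ctx = stage.apply(ctx)  # each stage's output feeds the next
    return ctx
```

Because every stage returns a fresh frozen object, a failed or skipped stage can never corrupt upstream state.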

---

## Benchmarks

### Real-World Compression (FusionEngine v7 vs Legacy Regex)

| Content Type | Legacy | FusionEngine | Improvement |
|:-------------|-------:|-------------:|:-----------:|
| Python source | 7.3% | **25.0%** | 3.4x |
| JSON (100 items) | 12.6% | **81.9%** | 6.5x |
| Build logs | 5.5% | **24.1%** | 4.4x |
| Agent conversation | 5.7% | **31.0%** | 5.4x |
| Git diff | 6.2% | **15.0%** | 2.4x |
| Search results | 5.3% | **40.7%** | 7.7x |
| **Weighted average** | **9.2%** | **36.3%** | **3.9x** |

### SWE-bench Real Tasks

Tested on real SWE-bench instances with actual repository code:

| Instance | Size | Compression |
|:---------|-----:|------------:|
| django__django-11620 | 4.5K | **14.5%** |
| sympy__sympy-14396 | 5.5K | **19.1%** |
| scikit-learn-25747 | 11.8K | **15.9%** |
| scikit-learn-13554 | 73K | **11.8%** |
| scikit-learn-25308 | 81K | **14.4%** |

### vs LLMLingua-2 (ROUGE-L Fidelity)

| Compression Rate | Claw Compactor | LLMLingua-2 | Delta |
|:-----------------|---------------:|------------:|------:|
| 0.3 (aggressive) | **0.653** | 0.346 | +88.2% |
| 0.5 (balanced) | **0.723** | 0.570 | +26.8% |

Claw Compactor preserves more semantic content at the same compression ratio, with zero LLM inference cost.
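ROUGE-L measures how much of the reference survives compression via the longest common subsequence (LCS). A simplified scorer over whitespace tokens (published scores typically use more elaborate tokenization and per-sentence LCS, so this is only a sketch of the metric) looks like:

```python
def lcs_len(a, b):
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(reference: str, candidate: str) -> float:
    """ROUGE-L F1 over whitespace tokens (simplified)."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_len(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A score of 1.0 means the compressed text preserves the reference token sequence exactly; lower scores mean tokens were dropped or reordered.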

---

## Quick Start

### Install from PyPI

```bash
pip install claw-compactor
```

### Or clone from source

```bash
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor
pip install -e .
```

### Run

```bash
# Benchmark your workspace (non-destructive)
claw-compactor benchmark /path/to/workspace

# Full compression pipeline
claw-compactor compress /path/to/workspace
```

**Requirements:** Python 3.9+. Optional: `pip install claw-compactor[accurate]` for exact token counts via tiktoken.

---

## API

### FusionEngine โ€” Single Text

```python
from scripts.lib.fusion.engine import FusionEngine

engine = FusionEngine()

result = engine.compress(
    text="def hello():\n    # greeting function\n    print('hello')",
    content_type="code",  # or let Cortex auto-detect
    language="python",    # optional hint
)

print(result["compressed"])  # compressed output
print(result["stats"])       # per-stage timing + token counts
print(result["markers"])     # Rewind markers for reversibility
```

### FusionEngine โ€” Chat Messages

```python
messages = [
    {"role": "system", "content": "You are a coding assistant..."},
    {"role": "user", "content": "Fix the auth bug in login.py"},
    {"role": "assistant", "content": "I found the issue. Here's the fix:\n```python\n..."},
    {"role": "tool", "content": '{"results": [{"file": "login.py", ...}, ...]}'},
]

result = engine.compress_messages(messages)

# Cross-message dedup runs first, then per-message pipeline
print(result["stats"]["reduction_pct"]) # aggregate compression %
print(result["per_message"]) # per-message breakdown
```

### Rewind โ€” Reversible Retrieval

```python
engine = FusionEngine(enable_rewind=True)
result = engine.compress(large_json, content_type="json")

# LLM sees compressed output with markers like [rewind:abc123...]
# When the LLM needs the original, it calls the Rewind tool:
original = engine.rewind_store.retrieve("abc123def456...")
```

### Custom Stage

```python
from scripts.lib.fusion.base import FusionStage, FusionContext, FusionResult

class MyStage(FusionStage):
    name = "my_compressor"
    order = 22  # runs between StructuralCollapse (20) and Neurosyntax (25)

    def should_apply(self, ctx: FusionContext) -> bool:
        return ctx.content_type == "log"

    def apply(self, ctx: FusionContext) -> FusionResult:
        compressed = my_compression_logic(ctx.content)
        return FusionResult(
            content=compressed,
            original_tokens=estimate_tokens(ctx.content),
            compressed_tokens=estimate_tokens(compressed),
        )

# Add to pipeline
pipeline = engine.pipeline.add(MyStage())
```

---

## The 14 Stages

| # | Stage | Order | Purpose | Applies To |
|:-:|:------|:-----:|:--------|:-----------|
| 1 | **QuantumLock** | 3 | Isolates dynamic content in system prompts to maximize KV-cache hit rate | system messages |
| 2 | **Cortex** | 5 | Auto-detects content type and programming language (16 languages) | untyped content |
| 3 | **Photon** | 8 | Detects and compresses base64-encoded images | all |
| 4 | **RLE** | 10 | Path shorthand (`$WS`), IP prefix compression, enum compaction | all |
| 5 | **SemanticDedup** | 12 | SimHash fingerprint deduplication across content blocks | all |
| 6 | **Ionizer** | 15 | JSON array statistical sampling with schema discovery + error preservation | json |
| 7 | **LogCrunch** | 16 | Folds repeated log lines with occurrence counts | log |
| 8 | **SearchCrunch** | 17 | Deduplicates search/grep results | search |
| 9 | **DiffCrunch** | 18 | Folds unchanged context lines in git diffs | diff |
| 10 | **StructuralCollapse** | 20 | Merges import blocks, collapses repeated assertions/patterns | code |
| 11 | **Neurosyntax** | 25 | AST-aware code compression via tree-sitter (safe regex fallback). Never shortens identifiers. | code |
| 12 | **Nexus** | 35 | ML token-level compression (stopword removal fallback without model) | text |
| 13 | **TokenOpt** | 40 | Tokenizer format optimization: strips bold/italic markers, normalizes whitespace | all |
| 14 | **Abbrev** | 45 | Natural language abbreviation. Only fires on text โ€” never touches code, JSON, or structured data. | text |

Each stage is independent and stateless. Stages communicate only through the immutable `FusionContext` that flows forward through the pipeline.
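To make the SimHash idea behind SemanticDedup concrete, here is a minimal sketch. The tokenization, hash choice, and distance threshold are illustrative assumptions, not the stage's actual parameters:

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """64-bit SimHash fingerprint over whitespace tokens."""
    vector = [0] * bits
    for token in text.split():
        # Hash each token to 64 bits; vote +1/-1 per bit position.
        h = int.from_bytes(hashlib.md5(token.encode()).digest()[:8], "big")
        for i in range(bits):
            vector[i] += 1 if (h >> i) & 1 else -1
    # Majority vote per bit yields the fingerprint.
    return sum(1 << i for i in range(bits) if vector[i] > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

def near_duplicate(a: str, b: str, max_distance: int = 3) -> bool:
    # Near-identical blocks land within a few bits of each other,
    # so dedup is a cheap Hamming-distance check, not a full diff.
    return hamming(simhash(a), simhash(b)) <= max_distance
```

Unlike exact-hash dedup, this catches blocks that differ by a few tokens (timestamps, counters) while still being O(1) per comparison.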

---

## Workspace Commands

```bash
python3 scripts/mem_compress.py [options]
```

| Command | Description |
|:--------|:-----------|
| `full` | Run complete compression pipeline |
| `benchmark` | Dry-run compression report |
| `compress` | Rule-based compression only |
| `dict` | Dictionary encoding with auto-learned codebook |
| `observe` | Session transcript JSONL to structured observations |
| `tiers` | Generate L0/L1/L2 tiered summaries |
| `dedup` | Cross-file duplicate detection |
| `estimate` | Token count report |
| `audit` | Workspace health check |
| `optimize` | Tokenizer-level format optimization |
| `auto` | Watch mode โ€” compress on file changes |

Options: `--json`, `--dry-run`, `--since YYYY-MM-DD`, `--quiet`

---

## Architecture

See [ARCHITECTURE.md](ARCHITECTURE.md) for the full technical deep-dive:
- Immutable data flow design
- Stage execution model and gating
- Rewind reversible compression protocol
- Cross-message semantic deduplication
- How to extend the pipeline

```
12,000+ lines Python · 1,600+ tests · 14 fusion stages · 0 external ML dependencies
```

---

## Installation

```bash
# Clone
git clone https://github.com/open-compress/claw-compactor.git
cd claw-compactor

# Optional: exact token counting
pip install tiktoken

# Optional: AST-aware code compression (Neurosyntax)
pip install tree-sitter-language-pack

# Development
pip install -e ".[dev,accurate]"
```

**Zero required dependencies.** tiktoken and tree-sitter are optional enhancements; the pipeline runs with built-in heuristic fallbacks for both.
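The optional-dependency pattern can be sketched as follows. This is a hypothetical helper, and the ~4-characters-per-token divisor is a common rule of thumb, not necessarily the fallback claw-compactor itself uses:

```python
def estimate_tokens(text: str) -> int:
    """Exact count via tiktoken when installed, heuristic otherwise."""
    try:
        import tiktoken  # optional dependency: pip install claw-compactor[accurate]
        return len(tiktoken.get_encoding("cl100k_base").encode(text))
    except ImportError:
        # Rough heuristic: English text averages about 4 characters per token.
        return max(1, len(text) // 4)
```

Importing inside the function keeps the cold path free of the dependency, so the same code runs in minimal and fully equipped environments.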

---

## Who Uses This

| Project | How |
|:--------|:----|
| [OpenClaw](https://openclaw.ai) | Built-in skill for all OpenClaw AI agents; compresses workspace context before every LLM call |
| [OpenCompress](https://opencompress.ai) | Production compression engine powering the OpenCompress API |

Using Claw Compactor? [Open a PR](https://github.com/open-compress/claw-compactor/pulls) to add yourself here.

---

## Project Stats

| Metric | Value |
|:-------|:------|
| Tests | 1,600+ passed |
| Python source | 12,000+ lines |
| Fusion stages | 14 |
| Languages detected | 16 |
| Required dependencies | 0 |
| Compression (code) | 15โ€“25% |
| Compression (JSON peak) | 81.9% |
| ROUGE-L @ 0.3 rate | 0.653 |
| License | MIT |

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on:
- Setting up the development environment
- Adding new Fusion stages
- Running the test suite
- Submitting PRs

---

## Related

- [OpenClaw](https://openclaw.ai) – AI agent platform
- [ClawhubAI](https://clawhub.com) – Agent skills marketplace
- [OpenClaw Discord](https://discord.com/invite/clawd) – Community
- [OpenClaw Docs](https://docs.openclaw.ai) – Documentation
- [Full Documentation](https://open-compress.github.io/claw-compactor) – GitHub Pages docs

---

`token-compression` `llm-tools` `fusion-pipeline` `reversible-compression` `ast-code-analysis` `context-compression` `ai-agent` `openclaw` `python` `developer-tools`

## License

[MIT](LICENSE)