# 🚀 OpenCode UltraPress

**Token Compression Plugin for OpenCode AI**

[![CI](https://github.com/rahadiana/opencode-ultrapress/actions/workflows/ci.yml/badge.svg)](https://github.com/rahadiana/opencode-ultrapress/actions/workflows/ci.yml)
[![npm](https://img.shields.io/npm/v/@rahadiana/opencode-ultrapress?color=red)](https://www.npmjs.com/package/@rahadiana/opencode-ultrapress)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

> **UltraPress** saves context window tokens through 4 compression layers that run automatically in the background: CLI output filtering, semantic compression, dynamic context pruning, and auto-cleanup. Your LLM stays smart, tokens stay lean.

---

## 📑 Table of Contents

- [⚡ Installation & Setup](#-installation--setup)
  - [System Requirements](#system-requirements)
  - [Install from GitHub](#1-install-the-plugin)
  - [Register to OpenCode](#2-register-to-opencode-npm-install-only)
  - [Personal Configuration](#3-optional-create-personal-configuration)
  - [Verify Installation](#verify-installation)
  - [Uninstall](#uninstall)
- [🛠 4-Layer Architecture](#-4-layer-architecture)
  - [Pipeline Flow](#pipeline-flow)
  - [Layer 1 – Smart Output Filter](#layer-1--smart-output-filter)
  - [Layer 2 – GSC Semantic Compression](#layer-2--gsc-semantic-compression)
  - [Layer 3 – Dynamic Context Pruning (DCP)](#layer-3--dynamic-context-pruning-dcp)
  - [Layer 4 – Session Auto-Cleanup](#layer-4--session-auto-cleanup)
- [⚙️ Configuration](#️-configuration)
  - [Full documentation →](./docs/konfigurasi-lengkap.md)
- [⌨️ `/up` Slash Command](#️-up-slash-command)
  - [Sub-command List](#sub-command-list)
  - [Example Output](#example-output)
- [❷ MLM & NLP Support](#-mlm--nlp-support)
  - [NLP Mode (Default)](#nlp-mode-default)
  - [MLM Mode (Experimental)](#mlm-mode-experimental)
  - [Mode Comparison](#mode-comparison)
- [🏗 Code Architecture](#-code-architecture)
  - [Directory Structure](#directory-structure)
  - [Hook Registration Map](#hook-registration-map)
  - [Data Flow Detail](#data-flow-detail)
- [🧪 Testing](#-testing)
- [📊 Benchmark](#-benchmark)
- [🚀 Local Development](#-local-development)
- [❓ FAQ & Troubleshooting](#-faq--troubleshooting)
- [🗺 Roadmap](#-roadmap)
- [🤝 Contributing](#-contributing)
- [📝 Changelog](#-changelog)
- [📄 License](#-license)

---

## ⚡ Installation & Setup

### System Requirements

| Dependency | Minimum Version | Notes |
| :--- | :--- | :--- |
| **Node.js** | `>= 18` | Node 22 LTS recommended |
| **OpenCode AI** | Latest | Uses `@opencode-ai/plugin ^1.14` |
| **Git** | Any | Required for GitHub install |
| **Bun** | Latest | For development/testing only |
| **@huggingface/transformers** | Auto-install | Only used when `mlm` mode is active |

### Compatibility Matrix (Runtime)

| Environment | NLP (default) | MLM | LLM | Notes |
| :--- | :---: | :---: | :---: | :--- |
| macOS (Intel/Apple Silicon) | ✅ | ✅ | ✅ | Best overall support |
| Linux | ✅ | ✅ | ✅ | Good for server/workstation |
| WSL2 | ✅ | ✅ | ✅ | Prefer enough RAM for MLM/LLM |
| Windows | ✅ | ✅ | ✅ | Works via Node runtime |
| Termux / low-resource mobile shell | ✅ | ⚠️ | ⚠️ | Prefer NLP mode to avoid model load overhead |

Legend: ✅ recommended · ⚠️ possible but resource-sensitive

### Cross-Platform Notes

UltraPress is pure TypeScript and works across all platforms OpenCode supports:

- macOS
- Linux
- WSL / Termux
- Windows

What to expect:

- `mlm` / `llm` modes may download model assets on first use (larger disk/RAM footprint).
- If your environment is resource-limited (e.g., small VPS/Termux), use default **Balanced (NLP)** or force `semantic.mode: "nlp"` for zero-model operation.

### 1. Install the Plugin

```bash
# ✅ RECOMMENDED: auto-registers with OpenCode
opencode plugin add @rahadiana/opencode-ultrapress@latest --global
```

> ⚠️ **Important:** OpenCode caches the plugin at `~/.cache/opencode/packages/`. `@latest` is resolved **once** on first install, so later versions are not picked up automatically. To upgrade:
>
> ```bash
> rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest
> opencode plugin add @rahadiana/opencode-ultrapress@latest --global
> ```
>
> The plugin will warn you at startup if a newer version is available.

**Alternative install methods** (not recommended for end users):

```bash
# Via npm: requires manual registration (step 2); the same cache caveat applies
npm install -g @rahadiana/opencode-ultrapress

# GitHub latest: for testing pre-release changes
npm install -g github:rahadiana/opencode-ultrapress
```

**For plugin development:**

```bash
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build

# Link globally so OpenCode can find it
npm link

# Then in OpenCode's config (~/.config/opencode/config.json), add:
# { "plugins": ["@rahadiana/opencode-ultrapress"] }
#
# After making code changes, re-run:
# npm run build
#
# No need to re-link: OpenCode loads from the linked directory.
# If using opencode plugin add, remove the cached version first:
# rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest
```

### 2. Register to OpenCode (npm install only)

If you installed via `opencode plugin add`, skip this step; registration is automatic.

Add the plugin to OpenCode's config at `~/.config/opencode/config.json`:

```json
{
  "plugins": ["@rahadiana/opencode-ultrapress"]
}
```

### 3. (Optional) Create Personal Configuration

UltraPress works out-of-the-box with **Balanced defaults**. For customization:

```bash
# Installed from GitHub: copy from the cloned repo
cp ultrapress.plugin.jsonc.example ~/.config/opencode/ultrapress.plugin.json

# Or from a global npm install
cp "$(npm root -g)/@rahadiana/opencode-ultrapress/ultrapress.plugin.jsonc.example" ~/.config/opencode/ultrapress.plugin.json
```

Then edit `~/.config/opencode/ultrapress.plugin.json` as needed. If the file is not found, UltraPress will automatically create it with default values on first run.

### Quick-Start Profiles by Device

Use this as a practical starting point:

| Device / Environment | Recommended Mode | Why |
| :--- | :--- | :--- |
| Low-RAM machine / Termux / tiny VPS | `semantic.mode: "nlp"` | Zero model download, lowest RAM/CPU overhead |
| Typical laptop/dev machine | **Balanced defaults** | Best trade-off: token savings + context safety |
| High-RAM workstation | `mlm` or `llm` (optional) | Higher semantic quality, more resource usage |

Minimal override examples:

```jsonc
// Low-resource profile
{
  "semantic": { "mode": "nlp" }
}
```

```jsonc
// Higher-quality semantic profile (requires more RAM)
{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/all-MiniLM-L6-v2"
  }
}
```

### Verify Installation

Restart OpenCode, then type in chat:

```
/up stats
```

If a statistics dashboard appears, UltraPress is active. If not:
1. Installed via `npm install`? Make sure the plugin is listed in `config.json`: `"plugins": ["@rahadiana/opencode-ultrapress"]`
2. Installed via `opencode plugin`? Run `opencode plugin list` to confirm it's registered.
3. Check the OpenCode logs for errors.
4. Ensure Node.js >= 18 is installed (`node --version`).

### Uninstall

```bash
# 1. Remove from OpenCode plugin list
# Edit ~/.config/opencode/config.json: remove "@rahadiana/opencode-ultrapress" from the plugins array

# 2. Purge the cached version
rm -rf ~/.cache/opencode/packages/@rahadiana/opencode-ultrapress@latest

# 3. If installed via npm (global)
npm uninstall -g @rahadiana/opencode-ultrapress

# 4. If using npm link
npm unlink -g @rahadiana/opencode-ultrapress

# 5. Clean up config
rm ~/.config/opencode/ultrapress.plugin.json
```

---

## 🛠 4-Layer Architecture

UltraPress intercepts the OpenCode message flow at 4 different points, each with its own compression strategy.

### Pipeline Flow

```mermaid
flowchart LR
A([Tool Output]) -->|"tool.execute.after"| L1[Layer 1\nOutput Filter]
B([Chat Message]) -->|"chat.message"| PRUNE{Prune\nPending?}
PRUNE -->|yes| REMOVE[Remove msgs\n& inject summary]
PRUNE -->|no| L2[Layer 2\nGSC Semantic]
REMOVE --> L2
L2 --> L3[Layer 3\nNudge Monitor]
L3 -->|"nudge injected"| LLM{LLM}
LLM -->|"calls tool"| DCP_TOOL[ultrapress_compress]
DCP_TOOL -.->|"block stored"| PRUNE
D([Session Compact]) -->|"session.compacting"| L4[Layer 4\nCleanup]

L1 & L2 & L4 --> CTX[(Context\nWindow)]
```

---

### Layer 1 – Smart Output Filter

> **Hook**: `tool.execute.after` · **File**: `layer1-output-filter.ts` · **Filters**: `src/filters/`

Intercepts CLI tool output **before** it enters the context window. This is the most aggressive layer: it cuts unnecessary log output directly.

**Core Strategies:**

| Strategy | Description |
| :--- | :--- |
| **Domain Routing** | Each CLI tool is routed to a specific filter: `git`, `npm/node`, `pytest/jest`, and filesystem. Unknown tools go to the generic filter. |
| **Middle-out Truncation** | Truncates logs from the middle, preserving the beginning (context) and end (error/result). Smarter than head/tail truncation. |
| **Deduplication** | Removes identical repeated log lines in real-time. Very effective for build & test logs. |
| **Tee Save** | If output is truncated, the original log is saved to a temporary `.log` file so it can still be accessed if needed. |
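
The deduplication and middle-out strategies above can be illustrated with a minimal TypeScript sketch. The function names `dedupeLines` and `middleOutTruncate`, the `(xN)` repeat suffix, and the omission marker text are assumptions made for this example and are not taken from the UltraPress source.

```typescript
// Hypothetical helpers; names, parameters, and marker text are illustrative only.

/** Collapse runs of identical consecutive lines into one line with a repeat count. */
function dedupeLines(lines: string[]): string[] {
  const runs: { text: string; count: number }[] = [];
  for (const line of lines) {
    const last = runs[runs.length - 1];
    if (last !== undefined && last.text === line) {
      last.count++;
    } else {
      runs.push({ text: line, count: 1 });
    }
  }
  return runs.map((r) => (r.count > 1 ? `${r.text} (x${r.count})` : r.text));
}

/** Keep the first `headLines` and last `tailLines` lines and drop the middle. */
function middleOutTruncate(
  lines: string[],
  headLines: number,
  tailLines: number,
): string[] {
  if (lines.length <= headLines + tailLines) return lines;
  const omitted = lines.length - headLines - tailLines;
  return [
    ...lines.slice(0, headLines), // beginning: command context
    `... [${omitted} lines omitted] ...`,
    ...lines.slice(lines.length - tailLines), // end: errors and final result
  ];
}
```

Running deduplication first means the head and tail budgets are spent on distinct lines rather than on repeated progress output.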

**Built-in Filters:**

| Filter | File | Trigger Tools and Behavior |
| :--- | :--- | :--- |
| **Git** | `filters/git.ts` | `git diff`, `git log`, `git show`: remove redundant diff hunks, keep summary |
| **Test** | `filters/test.ts` | `pytest`, `jest`, `vitest`, `mocha`: summarize failure output, remove passing tests |
| **Bash** | `filters/bash.ts` | Generic shell output: dedup lines, middle-out truncation |
| **Filesystem** | `filters/fs.ts` | `ls`, `cat`, `find`: limit file count, truncate long content |
| **Generic** | `filters/generic.ts` | Fallback for all other tools: middle-out truncation + dedup |
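
To make the routing in the table concrete, here is a small sketch of how a command line might be mapped to a filter. Everything in it is an assumption for illustration: the `pickFilter` function, the `FilterName` type, the idea that the shell tool is named `bash`, and the handling of runner prefixes such as `npx`. It also routes every `git` subcommand to the Git filter, which is simpler than the table.

```typescript
// Hypothetical routing sketch; not the plugin's actual implementation.
type FilterName = "git" | "test" | "fs" | "bash" | "generic";

const ROUTES: ReadonlyArray<{ filter: FilterName; commands: readonly string[] }> = [
  { filter: "git", commands: ["git"] },
  { filter: "test", commands: ["pytest", "jest", "vitest", "mocha"] },
  { filter: "fs", commands: ["ls", "cat", "find"] },
];

// Package-runner prefixes that are skipped when looking for the real binary.
const RUNNERS = new Set(["npx", "bunx"]);

function pickFilter(toolName: string, commandLine: string): FilterName {
  // Assumption: only the shell tool carries a command line worth inspecting.
  if (toolName !== "bash") return "generic";
  const words = commandLine.trim().split(/\s+/);
  const binary = words.find((w) => !RUNNERS.has(w)) ?? "";
  for (const route of ROUTES) {
    if (route.commands.includes(binary)) return route.filter;
  }
  return "bash"; // shell output with no domain-specific filter
}
```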

---

### Layer 2 – GSC Semantic Compression

> **Hook**: `chat.message` · **File**: `layer2-caveman.ts` · **Engine**: `src/caveman/`

Compresses message text **semantically**: it removes unimportant words without changing meaning. Layer 2 and Layer 3 **do not compress each other**, so there is no double compression.

**Compression Rules:**

- Function words such as conjunctions and auxiliaries (`that`, `and`, `will`, `which`) → removed
- Excessive pronouns → condensed
- Redundancy ("I think I will" → "I will") → removed
- Double spaces, unnecessary whitespace → normalized
- **Code blocks** (inside ` ``` `) → **NEVER** touched
- **Error messages & stack traces** → fully protected
- Messages < 200 characters → skipped
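
A rough sketch of how rules like these can be applied while leaving fenced code untouched is shown below. The `compressProse` function and its `minLengthChars` parameter name are assumptions for this example, the word list is just the four words quoted above, and error-message or stack-trace protection is not covered.

```typescript
// Hypothetical rule-based compressor; a sketch, not the plugin's NLP engine.
const FUNCTION_WORDS = /\b(?:that|and|will|which)\b[ \t]*/gi;

function compressProse(text: string, minLengthChars = 200): string {
  // Short messages are skipped entirely.
  if (text.length < minLengthChars) return text;

  // Split on fenced code blocks; the capture group keeps each fence in the result.
  const parts = text.split(/(```[\s\S]*?```)/g);

  return parts
    .map((part) =>
      part.startsWith("```")
        ? part // code blocks are passed through unchanged
        : part.replace(FUNCTION_WORDS, "").replace(/[ \t]{2,}/g, " "),
    )
    .join("");
}
```

Splitting before rewriting is the key design choice: the prose rules never see the contents of a code fence, so they cannot damage it.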

**Operating Modes:**

See [❷ MLM & NLP Support](#-mlm--nlp-support) for detailed NLP vs MLM comparison.

---

### Layer 3 – Dynamic Context Pruning (DCP)

> **Hook**: `chat.message` (pruning) + `tool.execute.after` (compress tool) · **File**: `layer3-dcp.ts` · **Engine**: `src/dcp/`

The most advanced layer: it **gives the LLM autonomy to manage its own memory**. Unlike Layer 2, which only compresses text, Layer 3 **actually removes old messages from the context window** and replaces them with summaries.

**Mechanism:**

```
1. Context Monitor → detect tokens approaching maxContextLimit
2. Autonomous Nudge → inject prompt into user message: "context window nearly full, call ultrapress_compress"
3. LLM calls → ultrapress_compress(mode="range", from=, to=)
4. Compression Block → stored in memory (compress-state.ts)
5. chat.message hook → check pending blocks → remove messages in range → inject summary as synthetic message
```
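
Steps 4 and 5 can be sketched as a two-phase flow: the tool call only records a block, and the next `chat.message` applies it. The `Message` and `CompressionBlock` shapes, numeric message IDs, the `CompressState` class, and the `[summary]` prefix are all assumptions for illustration; only the `applyPruning` name and the `preserveLastN` option appear elsewhere in this README.

```typescript
// Hypothetical data shapes; the real plugin's types may differ.
interface Message {
  id: number;
  role: "user" | "assistant" | "tool";
  text: string;
}

interface CompressionBlock {
  fromId: number;
  toId: number;
  summary: string;
}

class CompressState {
  private pending: CompressionBlock[] = [];

  /** Phase 1 (tool call): remember the block, leave the messages alone. */
  add(block: CompressionBlock): void {
    this.pending.push(block);
  }

  /** Phase 2 (next chat.message): hand over all pending blocks and clear them. */
  drain(): CompressionBlock[] {
    const blocks = this.pending;
    this.pending = [];
    return blocks;
  }
}

function applyPruning(
  messages: Message[],
  blocks: CompressionBlock[],
  preserveLastN = 4,
): Message[] {
  // IDs of the most recent messages, which are never pruned.
  const start = Math.max(0, messages.length - preserveLastN);
  const recent = new Set(messages.slice(start).map((m) => m.id));

  let current = messages;
  for (const block of blocks) {
    const next: Message[] = [];
    let injected = false;
    for (const msg of current) {
      const inRange = msg.id >= block.fromId && msg.id <= block.toId;
      if (!inRange || recent.has(msg.id)) {
        next.push(msg);
      } else if (!injected) {
        // The first removed message is replaced by one synthetic summary message.
        next.push({ id: msg.id, role: "assistant", text: `[summary] ${block.summary}` });
        injected = true;
      }
    }
    current = next;
  }
  return current;
}
```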

**Key Features:**

| Feature | Description |
| :--- | :--- |
| **Block-based Pruning** | When `ultrapress_compress` is called, the LLM chooses the message range to summarize. The block is stored, then executed on the **next** chat message (not the current one, which avoids a race condition). |
| **Prune via `chat.message` Hook** | On each new message, the plugin checks pending blocks, removes the messages in range from the context array, and injects the summary. |
| **Protected Content** | Critical tool output (`task`, `skill`, `todowrite`, `todoread`, `write`, `edit`, `ultrapress_compress`) is protected from pruning. |
| **Marker Protection** | Messages containing `TODO`, `FIXME`, `HACK`, `ACTION ITEM`, `ROOT CAUSE`, `RCA`, `DECISION`, `BLOCKER` are preserved from pruning. |
| **Nesting Support** | Compression can be done on top of previous compression. Nested summaries are auto-merged. |
| **`preserveLastN`** | Protects the **last** N messages from pruning, keeping recent conversation context intact. Default: `4`. Set to `0` to disable. |
| **Multi-Signal Scoring** | In addition to `preserveLastN`, each message is scored from 5 signals (recency, role, tool type, keyword, content size). High-scoring messages are preserved even in old blocks. Default threshold: `0.45` (balanced). |
| **Reversible Compression** | With the `ultrapress_expand` tool, the LLM can "expand" previously summarized blocks to see the original content. Original content is stored in plugin memory (not in LLM context). |
| **Nudge @70%** | The nudge is sent when the context reaches 70% of the limit (not 100%), giving the LLM time to compress before the context is truly full. |
| **summaryBuffer** | After pruning, provides breathing room (no immediate re-nudge). |
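
As an illustration of multi-signal scoring, the sketch below combines the five named signals into a single value and compares it against the `0.45` default. The weights, the per-signal formulas, the `ScoredInput` shape, and the choice of tool and keyword lists are all assumptions for this example; the README does not specify how the plugin computes its score.

```typescript
// Hypothetical scoring sketch; weights and formulas are invented for illustration.
interface ScoredInput {
  indexFromEnd: number; // 0 = newest message
  total: number; // number of messages in the conversation
  role: "user" | "assistant" | "tool";
  toolName?: string;
  text: string;
}

const HIGH_VALUE_TOOLS = new Set(["task", "skill", "todowrite", "todoread", "write", "edit"]);
const KEYWORDS = ["TODO", "FIXME", "HACK", "ACTION ITEM", "ROOT CAUSE", "RCA", "DECISION", "BLOCKER"];

// Assumed weights; they sum to 1 so the score stays in [0, 1].
const WEIGHTS = { recency: 0.3, role: 0.15, tool: 0.2, keyword: 0.25, size: 0.1 };

function importanceScore(m: ScoredInput): number {
  const recency = m.total > 1 ? 1 - m.indexFromEnd / (m.total - 1) : 1;
  const role = m.role === "user" ? 1 : m.role === "assistant" ? 0.5 : 0.2;
  const tool = m.toolName !== undefined && HIGH_VALUE_TOOLS.has(m.toolName) ? 1 : 0;
  const keyword = KEYWORDS.some((k) => m.text.includes(k)) ? 1 : 0;
  // Short messages are cheap to keep; long ones score lower.
  const size = m.text.length <= 500 ? 1 : 500 / m.text.length;
  return (
    WEIGHTS.recency * recency +
    WEIGHTS.role * role +
    WEIGHTS.tool * tool +
    WEIGHTS.keyword * keyword +
    WEIGHTS.size * size
  );
}

function shouldPreserve(m: ScoredInput, threshold = 0.45): boolean {
  return importanceScore(m) >= threshold;
}
```

With these assumed weights, an old user message that records a `DECISION` still clears the threshold, while an old, long tool output with no markers does not.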

**Two Pruning Modes:**

| Mode | Description |
| :--- | :--- |
| `range` | Range-based compression: choose from_id and to_id. All messages in between are summarized into one. |
| `message` | Surgical compression: choose one or more specific message IDs to summarize. |

---

### Layer 4 – Session Auto-Cleanup

> **Hook**: `tool.execute.after` + `session.compacting` · **File**: `layer4-cleanup.ts`

Automatically cleans "garbage" from the context window.

**Features:**

| Feature | Description |
| :--- | :--- |
| **Error Purging** | Removes error/failed tool messages after N chat turns (default: 4 turns). Stale errors only waste tokens. |
| **Tool-Call Dedup** | Prevents LLM from repeating identical tool calls (same tool + args) in the same session. |
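
Both features can be sketched in a few lines. The `ToolRecord` shape, the `CleanupTracker` class, and the `purgeStaleErrors` function are hypothetical names for this example; keying duplicates on `JSON.stringify(args)` is also an assumption and is sensitive to property order.

```typescript
// Hypothetical cleanup sketch; not the plugin's actual data model.
interface ToolRecord {
  turn: number; // chat turn in which the tool ran
  tool: string;
  isError: boolean;
  output: string;
}

class CleanupTracker {
  private seen = new Set<string>();

  /** Returns true when the same tool + args were already called in this session. */
  isDuplicate(tool: string, args: unknown): boolean {
    const key = `${tool}:${JSON.stringify(args)}`;
    if (this.seen.has(key)) return true;
    this.seen.add(key);
    return false;
  }
}

/** Drop error records that are at least `maxAgeTurns` chat turns old. */
function purgeStaleErrors(
  records: ToolRecord[],
  currentTurn: number,
  maxAgeTurns = 4,
): ToolRecord[] {
  return records.filter((r) => !r.isError || currentTurn - r.turn < maxAgeTurns);
}
```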

---

## ⚙️ Configuration

> 📖 **Full configuration documentation** (all keys, types, defaults, examples, presets, custom filters, troubleshooting) is at:
> [`docs/konfigurasi-lengkap.md`](./docs/konfigurasi-lengkap.md)

### Basic Structure

File: `~/.config/opencode/ultrapress.plugin.json`

```jsonc
{
  "enabled": true,            // Master switch
  "notification": "minimal",  // "off" | "minimal" | "detailed"

  "outputFilter": {},
  "semantic": {},
  "summarization": {},
  "cleanup": {}
}
```

| Layer | Key | Function |
| :--- | :--- | :--- |
| **L1** | `outputFilter` | Limit CLI output length, filter repetitive lines |
| **L2** | `semantic` | NLP/MLM text compression without destroying meaning |
| **L3** | `summarization` | Remove old messages, replace with summary, `preserveLastN` protection |
| **L4** | `cleanup` | Dedup tool calls, auto-purge stale errors |

> 🔒 **Safety guard**: `task` is always enforced in `outputFilter.skipTools` and `semantic.skipTools`, even if it is removed in the user config. This prevents accidental sub-agent context loss.

> 👉 **[Open full documentation →](./docs/konfigurasi-lengkap.md)** covers all keys, types, defaults, custom filters, presets (Balanced / Aggressive / Conservative / NLP-only), and troubleshooting.
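
The safety guard can be pictured as a merge step that re-adds `task` after user overrides are applied. In this sketch the `mergeConfig` and `mergeLayer` functions, the trimmed-down config types, and the default values are assumptions; only the `enabled` and `skipTools` keys and the `outputFilter` / `semantic` sections come from this README.

```typescript
// Hypothetical, trimmed-down config types for illustration.
interface LayerConfig {
  enabled: boolean;
  skipTools: string[];
}

interface ConfigSketch {
  enabled: boolean;
  outputFilter: LayerConfig;
  semantic: LayerConfig;
}

interface UserConfig {
  enabled?: boolean;
  outputFilter?: Partial<LayerConfig>;
  semantic?: Partial<LayerConfig>;
}

// Assumed defaults; the plugin's real defaults may differ.
const DEFAULTS: ConfigSketch = {
  enabled: true,
  outputFilter: { enabled: true, skipTools: ["task"] },
  semantic: { enabled: true, skipTools: ["task"] },
};

/** Re-add "task" if the user's list dropped it. */
function withTask(skipTools: string[]): string[] {
  return skipTools.includes("task") ? skipTools : [...skipTools, "task"];
}

function mergeLayer(defaults: LayerConfig, user?: Partial<LayerConfig>): LayerConfig {
  return {
    enabled: user?.enabled ?? defaults.enabled,
    skipTools: withTask(user?.skipTools ?? defaults.skipTools),
  };
}

function mergeConfig(user: UserConfig): ConfigSketch {
  return {
    enabled: user.enabled ?? DEFAULTS.enabled,
    outputFilter: mergeLayer(DEFAULTS.outputFilter, user.outputFilter),
    semantic: mergeLayer(DEFAULTS.semantic, user.semantic),
  };
}
```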

---

## ⌨️ `/up` Slash Command

All interaction with UltraPress goes through a single command: `/up`.

### Sub-command List

| Command | Alias | Description |
| :--- | :--- | :--- |
| `/up stats` | `s`, `stat` | Current session token savings dashboard |
| `/up context` | `c`, `ctx` | Context window status: capacity, limit, remaining |
| `/up compress` | `comp` | Show layer status + compression guide |
| `/up help` | `h`, `?` | Command help |

**Note**: Sub-commands are case-insensitive and support partial fuzzy matching.
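
One way to implement this kind of lookup is an alias table plus a unique-prefix fallback, as in the sketch below. The `resolveSubCommand` function is hypothetical, and treating "partial fuzzy matching" as unambiguous prefix matching is an assumption; the plugin may match more loosely.

```typescript
// Hypothetical resolver; aliases are taken from the table above.
type SubCommand = "stats" | "context" | "compress" | "help";

const COMMANDS: readonly SubCommand[] = ["stats", "context", "compress", "help"];

const ALIASES = new Map<string, SubCommand>([
  ["s", "stats"],
  ["stat", "stats"],
  ["c", "context"],
  ["ctx", "context"],
  ["comp", "compress"],
  ["h", "help"],
  ["?", "help"],
]);

function resolveSubCommand(input: string): SubCommand | undefined {
  const word = input.trim().toLowerCase(); // case-insensitive
  const alias = ALIASES.get(word);
  if (alias !== undefined) return alias;

  // Fall back to prefix matching, but only when exactly one command fits.
  const matches = COMMANDS.filter((name) => name.startsWith(word));
  return matches.length === 1 ? matches[0] : undefined;
}
```

Rejecting ambiguous prefixes (for example `co`, which fits both `context` and `compress`) avoids silently running the wrong sub-command.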

### Example Output

**`/up stats`**:

```
📊 ULTRAPRESS STATS
──────────────────────────────────────────
Raw tokens        : 127,450
Compressed tokens : 89,215
Tokens saved      : 38,235 (30.0%)

By Layer:
  L1 Output Filter : 18,400
  L2 Semantic      : 12,100
  L3 Summarization : 5,835
  L4 Cleanup       : 1,900

Activity:
  Compressions   : 3
  Deduplications : 12
  Errors purged  : 2

Session: 2h 15m
──────────────────────────────────────────
```

**`/up context`**:

```
🧠 CONTEXT STATUS
──────────────────────────────────────────
Current tokens : ~52,000
Max context limit : 70,000
Available : ~18,000 (25.7%)
Nudge threshold : 40,000
Status : 🟡 Nearing limit (nudge will fire soon)
Next nudge in : 3 turns
──────────────────────────────────────────
```

---

## ❷ MLM & NLP Support

### NLP Mode (Default)

Rule-based grammar stripping using linguistic rules. **Zero latency**, no external model required.

**How it works:**
1. Detect sentence structure (subject, predicate, object)
2. Remove conjunctions, excessive pronouns, filler words
3. Condense redundancy without changing meaning
4. Protect code blocks & error messages

### MLM Mode (Experimental)

Uses **Masked Language Model** via `@huggingface/transformers` (Transformers.js) for more accurate tokenization.

**Activation:**

```json
{
  "semantic": {
    "mode": "mlm",
    "model": "Xenova/distilbert-base-uncased"
  }
}
```

**Important Notes:**
- ⚠️ Model auto-downloaded on first run (~70MB for distilbert-base)
- ⚠️ First-run latency 5-15 seconds for model loading
- ⚠️ RAM usage increases ~200MB when model is active
- ⚠️ Compatibility: CPU-only (no GPU required)
- 🌐 For Indonesian: use `Xenova/bert-base-multilingual-uncased`

### Mode Comparison

| Aspect | NLP | MLM | LLM |
| :--- | :--- | :--- | :--- |
| **Latency** | < 1ms | 50-200ms | 1-5s |
| **RAM** | 0 MB | ~70 MB (q8) | ~300 MB (q8) |
| **Accuracy** | ~85% | ~95% | ~99% |
| **Language** | Indonesian + English | 100+ languages | All |
| **Internet Connection** | ❌ Not needed | ❌ Initial download only | ❌ Initial download only |
| **Stable** | ✅ | ⚠️ Experimental | ⚠️ Experimental |
| **Model** | None | `all-MiniLM-L6-v2` | `t5-small` (summarization) |

---

## 🏗 Code Architecture

### Directory Structure

```
opencode-ultrapress/
├── src/
│   ├── index.ts  # Entry point, hook registration, plugin server
│   ├── config/
│   │   ├── schema.ts  # TypeScript type definitions (UltraPressConfig, etc.)
│   │   └── defaults.ts  # Default config values + merge logic
│   ├── layers/
│   │   ├── layer1-output-filter.ts  # RTK engine: routes tool output to domain filters
│   │   ├── layer2-caveman.ts  # Semantic compression orchestrator
│   │   ├── layer3-dcp.ts  # DCP orchestrator: nudge injection, pruning trigger
│   │   └── layer4-cleanup.ts  # Auto-cleanup: dedup + error purging
│   ├── filters/
│   │   ├── git.ts  # Git-specific output filter
│   │   ├── test.ts  # Test runner output filter
│   │   ├── bash.ts  # Shell output filter
│   │   ├── fs.ts  # Filesystem tool output filter
│   │   └── generic.ts  # Fallback filter
│   ├── dcp/
│   │   ├── compress-state.ts  # In-memory state for pending compression blocks
│   │   ├── compress-tool.ts  # ultrapress_compress tool definition & handler
│   │   ├── context-monitor.ts  # Token usage monitoring + nudge logic
│   │   ├── prune.ts  # Message removal + summary injection engine
│   │   ├── protected-content.ts  # Defines which tool outputs are protected
│   │   └── summary-store.ts  # Stores summaries for nesting support
│   ├── caveman/
│   │   ├── nlp.ts  # Rule-based NLP compressor
│   │   └── mlm.ts  # MLM-based compressor (Transformers.js)
│   ├── commands/
│   │   └── slash.ts  # /up slash command handler
│   └── utils/
│       ├── token-count.ts  # Token estimation (char-based approximation)
│       └── logger.ts  # Logging with configurable verbosity
├── tests/
│   ├── layer1.test.ts  # Output filter unit tests
│   ├── layer2.test.ts  # Semantic compression unit tests
│   └── layer3-dcp.test.ts  # DCP pruning + nudge unit tests
├── benchmarks/
│   ├── run.ts  # Benchmark runner
│   └── fixtures/  # Benchmark test data
├── docs/
│   └── image/
│       └── banner.svg  # README banner
├── ultrapress.plugin.jsonc.example  # Configuration template (JSONC)
├── tsconfig.json  # TypeScript config
├── tsup.config.ts  # Build config (tsup)
├── package.json
├── CHANGELOG.md
├── LICENSE
└── README.md
```

### Hook Registration Map

| OpenCode Hook | Trigger | UltraPress Handler | Layer |
| :--- | :--- | :--- | :--- |
| `tool.execute.after` | After a tool call completes | Output filtering + token tracking + dedup | L1, L4 |
| `chat.message` | Before the user message is sent to the LLM | Pruning pending blocks + semantic compression + nudge injection | L2, L3 |
| `command.execute.before` | User types `/up` | Slash command handler | n/a |
| `experimental.session.compacting` | OpenCode compacts the session | Protected context injection | L4 |
| `config` | Plugin initialization | Register `/up` command | n/a |
| `tool` (definition) | Plugin initialization | Register `ultrapress_compress` tool | L3 |

### Data Flow Detail

```
1. Plugin Init
   config hook → register /up command
   tool definition → register ultrapress_compress
   load/migrate config from ~/.config/opencode/ultrapress.plugin.json

2. Tool Execution (every tool call)
   tool.execute.after → L1 processToolOutput()
       → Domain routing (git→git.ts, test→test.ts, etc.)
       → Middle-out truncation
       → Deduplication
     → L4 applyCleanup()
       → Dedup check
       → Error registration for purge

3. Chat Message (every user message)
   chat.message → L3 check pending compression blocks
       → applyPruning(): remove old messages, inject summaries
     → L2 processMessageContext(): semantic compression
     → L3 context monitor: check token count
       → if near limit → inject nudge prompt
     → L3 turnTick(): update turn counter

4. LLM calls ultrapress_compress
   tool.execute.after → compress tool handler
       → Create CompressionBlock in compress-state.ts
       → Store summary for nesting
       → Block will be executed on NEXT chat.message

5. Session Compacting
   session.compacting → L4 protected context injection
```

---

## 🧪 Testing

Run the entire test suite:

```bash
bun test
```

| Test File | Coverage | Layer |
| :--- | :--- | :--- |
| `tests/layer1.test.ts` | Output filtering: domain routing, truncation, tee save, dedup | L1 |
| `tests/layer2.test.ts` | Semantic compression: NLP grammar stripping, code block protection, min length skip | L2 |
| `tests/layer3-dcp.test.ts` | DCP: pruning with preserveLastN, nudge frequency, nesting summaries, protected content | L3 |

```bash
# Run specific layer
bun test tests/layer1.test.ts
bun test tests/layer2.test.ts
bun test tests/layer3-dcp.test.ts

# Run with TypeScript type checking
bun run lint
```

---

## 📊 Benchmark

Run the full benchmark to measure the effectiveness of **all 4 layers**:

```bash
npm run benchmark
```

### Latest Benchmark Results

```
┌────────────────────────┬──────────────────────────────────────────┬──────────┬────────────┬─────────┐
│ Fixture                │ Layer                                    │ Original │ Compressed │ Savings │
├────────────────────────┼──────────────────────────────────────────┼──────────┼────────────┼─────────┤
│ git-diff-large.txt     │ L1 - Git Filter                          │    1,657 │        969 │     42% │
│ npm-install-log.txt    │ L1 - Generic Filter                      │      431 │        430 │      0% │
│ pytest-log.txt         │ L1 - Generic Filter                      │    1,200 │      1,199 │      0% │
│ chat-history.json      │ L2 - NLP Semantic                        │      625 │        490 │     22% │
│ dcp-conversation.json  │ L3 - DCP Pruning (14→summary)            │    2,347 │        645 │     73% │
│                        │   ↳ 10 msg removed, 1 summary injected   │          │            │         │
│ 3x identical npm test  │ L4 - Tool Call Dedup                     │    2,244 │        854 │     62% │
│                        │   ↳ 2 duplicates collapsed               │          │            │         │
│ 5 errors × 6 turns     │ L4 - Error Auto-Purge                    │      845 │          0 │    100% │
│                        │   ↳ 5 errors purged after threshold      │          │            │         │
└────────────────────────┴──────────────────────────────────────────┴──────────┴────────────┴─────────┘

✅ Total: 9,349 → 4,587 tokens (51% overall savings)
```

### Layer Summary

| Layer | Fixtures | Savings (token-weighted) | Characteristics |
| :--- | :--- | :--- | :--- |
| **L1** Output Filter | 3 fixtures | **21%** | Most effective for verbose CLI logs (`git diff`: 42%). Short output is less affected. |
| **L2** Semantic NLP | 1 fixture | **22%** | Consistently compresses natural language without destroying meaning. Code blocks fully protected. |
| **L3** DCP Pruning | 1 fixture | **73%** | Biggest saver: removes 10 old messages and replaces them with 1 summary. Compounding effect in long sessions. |
| **L4** Auto Cleanup | 2 fixtures | **72%** | Dedup saves 62% from repeated tool calls. Error purge reaches 100% after the threshold. |
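
The per-layer figures match token-weighted totals (sum of compressed tokens over sum of original tokens per layer) rather than a mean of the per-fixture percentages; for L1 the plain mean would be about 14%. The sketch below recomputes them from the benchmark rows. The `BenchRow` type and function names are made up for this example and are not part of `benchmarks/run.ts`.

```typescript
// Recompute the summary figures from the benchmark table (illustrative only).
type Layer = "L1" | "L2" | "L3" | "L4";

interface BenchRow {
  layer: Layer;
  original: number;
  compressed: number;
}

const ROWS: BenchRow[] = [
  { layer: "L1", original: 1657, compressed: 969 },
  { layer: "L1", original: 431, compressed: 430 },
  { layer: "L1", original: 1200, compressed: 1199 },
  { layer: "L2", original: 625, compressed: 490 },
  { layer: "L3", original: 2347, compressed: 645 },
  { layer: "L4", original: 2244, compressed: 854 },
  { layer: "L4", original: 845, compressed: 0 },
];

/** Savings as a whole-number percentage of the original token count. */
function savingsPercent(original: number, compressed: number): number {
  return original === 0 ? 0 : Math.round((1 - compressed / original) * 100);
}

/** Token-weighted savings across a set of rows. */
function aggregateSavings(rows: BenchRow[]): number {
  const original = rows.reduce((sum, r) => sum + r.original, 0);
  const compressed = rows.reduce((sum, r) => sum + r.compressed, 0);
  return savingsPercent(original, compressed);
}

function layerSavings(layer: Layer): number {
  return aggregateSavings(ROWS.filter((r) => r.layer === layer));
}

console.log(layerSavings("L1"), layerSavings("L4"), aggregateSavings(ROWS)); // 21 72 51
```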

> 💡 **Insight**: L3 (DCP) is the layer with the highest savings because it removes old messages in bulk. In long sessions (100+ messages), the cumulative effect of L3 + L4 can reach **70-90% token savings**. Dataset and scripts are in [`benchmarks/`](./benchmarks/); contribute fixtures from your stack for more representative results.

---

## 🚀 Local Development

```bash
# 1. Clone repository
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress

# 2. Install dependencies
npm install

# 3. Build TypeScript
npm run build # tsup, compiles to dist/

# 4. Development mode (watch)
npm run dev # tsup --watch

# 5. Run tests
npm test # bun test

# 6. Type checking
npm run lint # tsc --noEmit

# 7. Benchmark
npm run benchmark # tsx benchmarks/run.ts
```

**Development Workflow:**

1. Edit files in `src/`
2. `npm run dev` for auto-rebuild
3. `npm test` to verify
4. Restart OpenCode to reload plugin
5. Test via `/up stats` in chat

---

## ❓ FAQ & Troubleshooting

**Plugin doesn't appear after install**

1. Ensure the plugin is registered in `~/.config/opencode/config.json`:
```json
{ "plugins": ["@rahadiana/opencode-ultrapress"] }
```
2. Restart OpenCode completely (not just reload window)
3. Check if the package is installed: `npm list -g @rahadiana/opencode-ultrapress`

**Error "Cannot find module @huggingface/transformers"**

MLM mode requires an additional dependency. Install manually:
```bash
npm install -g @huggingface/transformers
```
Or switch to `"nlp"` mode which does not require external dependencies.

**OpenCode feels slow after install**

- Check the semantic mode: with `"mode": "mlm"`, model loading at startup can be slow. Switch to `"nlp"` for zero latency.
- Check `notification` level: `"detailed"` prints many logs. Set to `"minimal"`.
- Ensure `minLengthChars` is not too low (default 250 is optimal).

**My important messages were deleted by pruning**

- Increase `preserveLastN` (default 4; try 6 or 7)
- Important tool output is automatically protected (`task`, `skill`, `todowrite`, `todoread`, `write`, `edit`)
- Decision markers are also protected (`TODO`, `FIXME`, `ACTION ITEM`, `ROOT CAUSE`, `DECISION`, `BLOCKER`)
- If still deleted, report as a bug with detailed logs

**How do I disable a specific layer?**

Set `"enabled": false` on the layer you want to turn off:
```json
{
  "semantic": { "enabled": false },
  "summarization": { "enabled": false }
}
```

**TypeScript error during development**

Ensure dependencies are installed:
```bash
npm install
npm run lint # tsc --noEmit to check for type errors
```

---

## 🗺 Roadmap

| Feature | Status | Target |
| :--- | :--- | :--- |
| Layer 1: Domain-aware output filtering | ✅ Done | v0.1.0 |
| Layer 2: NLP semantic compression | ✅ Done | v0.1.0 |
| Layer 2: MLM mode | ⚠️ Experimental | v0.2.0 |
| Layer 2: LLM mode (local summarization) | ✅ Done | v0.2.0 |
| Layer 2: All-pairs MLM dedup | ✅ Done | v0.2.0 |
| Layer 3: Block-based DCP pruning | ✅ Done | v0.1.0 |
| Layer 3: `preserveLastN` protection | ✅ Done | v0.1.0 |
| Layer 3: Multi-signal importance scoring | ✅ Done | v0.2.0 |
| Layer 3: Reversible compression (`ultrapress_expand`) | ✅ Done | v0.2.0 |
| Layer 3: Pre-emptive nudge @70% | ✅ Done | v0.2.0 |
| Layer 3: Surgical message pruning | ✅ Done | v0.1.0 |
| Layer 4: Error purging & dedup | ✅ Done | v0.1.0 |
| `/up` slash commands | ✅ Done | v0.1.0 |
| Real token tracking (OpenCode API) | ✅ Done | v0.2.0 |
| Custom filter API | ✅ Done | v0.1.0 |
| TF-IDF scoring (MLM improvement) | 🚧 Planned | v0.2.0 |
| Sentence similarity (MLM improvement) | 🚧 Planned | v0.2.0 |
| Sub-agent (`task`) token tracking & compression | 💡 Idea | TBD |
| UI stats dashboard in OpenCode | 💡 Idea | TBD |
| Support more languages (NLP) | 💡 Idea | TBD |

---

## 🤝 Contributing

Contributions are welcome! Areas most in need of help:

1. **New Filters**: Add Layer 1 filters for new frameworks/stacks (Kubernetes, Docker, Terraform, Svelte, Flutter, etc.).
2. **MLM Roadmap**: Help implement actual TF-IDF scoring or sentence similarity.
3. **Benchmark Dataset**: Contribute fixture data from your tech stack.
4. **Multi-language NLP**: Expand grammar stripping rules for more languages.
5. **Bug Reports**: Report edge cases such as poorly filtered tool output or important messages being deleted.

### Development Setup

```bash
git clone https://github.com/rahadiana/opencode-ultrapress.git
cd opencode-ultrapress
npm install
npm run build
npm test
npm run benchmark
```

### Pull Request Process

1. Fork repository
2. Create feature branch (`git checkout -b feature/amazing-filter`)
3. Commit changes (`git commit -m 'Add amazing filter'`)
4. Push to branch (`git push origin feature/amazing-filter`)
5. Open a Pull Request and ensure `bun test` and `npm run lint` pass

---

## 📝 Changelog

See [CHANGELOG.md](./CHANGELOG.md) for the full version history.

---

## 📄 License

MIT © [rahadiana](https://github.com/rahadiana)

---

**UltraPress**: *Because tokens are expensive, but context is priceless.* ❤️