https://github.com/phlx0/tokenmiser
Cut Claude Code token usage 50–80%. Auto-generates a codebase index so Claude finds files instantly instead of reading everything.
- Host: GitHub
- URL: https://github.com/phlx0/tokenmiser
- Owner: phlx0
- License: mit
- Created: 2026-04-08T19:41:48.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-04-08T21:52:28.000Z (3 months ago)
- Last Synced: 2026-04-08T23:26:11.713Z (3 months ago)
- Topics: ai-tools, anthropic, claude, claude-code, cli, codebase-index, context-window, developer-tools, productivity, tokens
- Language: TypeScript
- Homepage:
- Size: 91.8 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
# tokenmiser
**Cut Claude Code token usage by 50–80% per session.** Generates a compact codebase index, smart ignore rules, and an optimized `CLAUDE.md` — so Claude navigates straight to the right file instead of reading 10 files to find one function.
```
$ npx tokenmiser init
tokenmiser — slash Claude Code token usage
✔ Scanned 134 files · ~180k tokens (typescript)
→ .claudeignore created
✔ CODEBASE_INDEX.md 112 files indexed · ~1.8k tokens
→ CLAUDE.md updated — index reference added
→ Claude Code hooks configured (.claude/settings.json)
  Files in repo             134
  Tokens without index      ~180k
  CODEBASE_INDEX.md         ~1.8k tokens
  CLAUDE.md overhead        ~420 tokens/session
  Estimated token savings   ~99%
```
---
## Why tokens get wasted
Claude Code burns tokens in three predictable ways:
1. **Blind file reading** — Claude doesn't know what's in a file without opening it, so it reads 5–10 files to find the right one. At ~1,500 tokens per file, that's 7.5k–15k tokens per lookup.
2. **Irrelevant content** — `node_modules`, build artifacts, lock files, generated code. None of this helps; all of it costs.
3. **Bloated `CLAUDE.md`** — Loaded at session start. A 3,000-token `CLAUDE.md` burns 3,000 tokens before Claude types a single character.
## How tokenmiser fixes it
### 1. `CODEBASE_INDEX.md` — the core fix
A compact map of your entire codebase that Claude reads once instead of reading files blindly:
```markdown
# Codebase Index
> 2026-04-08 · 134 files · ~180k tokens total
> Always read this file first. Navigate to exact files rather than exploring.
## Source
**src/auth/**
- `service.ts` — User authentication · AuthService, login, logout, validateToken, refreshToken
- `middleware.ts` — authenticate, requireRole, optionalAuth
**src/api/**
- `routes.ts` — UserRouter, ProductRouter, OrderRouter
- `controllers/userController.ts` — getUser, createUser, updateUser, deleteUser
```
Instead of reading 10 files to find a function, Claude reads this index (~1,800 tokens) and goes directly to the right file. **One read instead of ten.**
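The entry format in the sample index above can be produced with a small renderer. This is a minimal sketch, assuming a hypothetical `FileSummary` shape; `renderEntry` and `renderGroup` are illustrative names and not tokenmiser's actual API.

```typescript
// Hypothetical shape for one indexed file; tokenmiser's internal types may differ.
interface FileSummary {
  path: string;       // file path relative to its directory group
  purpose?: string;   // first doc comment, if one was found
  exports: string[];  // exported names
}

const DASH = "\u2014"; // separator character used in the sample index

// Render one line in the style of the sample index:
// - `service.ts` <dash> Purpose · exportA, exportB
function renderEntry(file: FileSummary): string {
  const parts: string[] = [];
  if (file.purpose) parts.push(file.purpose);
  if (file.exports.length > 0) parts.push(file.exports.join(", "));
  return "- `" + file.path + "` " + DASH + " " + parts.join(" · ");
}

// Render a directory group with its bold heading followed by its entries.
function renderGroup(dir: string, files: FileSummary[]): string {
  return ["**" + dir + "/**"].concat(files.map(renderEntry)).join("\n");
}

console.log(
  renderGroup("src/auth", [
    { path: "service.ts", purpose: "User authentication", exports: ["AuthService", "login"] },
    { path: "middleware.ts", exports: ["authenticate", "requireRole"] },
  ])
);
```

Keeping each file to a single line is what keeps the index small relative to the files it describes.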
### 2. `.claudeignore` — stop reading junk
Automatically excludes build artifacts, lock files, coverage reports, generated code, binaries, and media files. Language-aware: detects TypeScript, Python, Go, Rust, Java, Ruby, PHP, and adds the right patterns.
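Language-aware pattern selection could look roughly like the sketch below. The pattern lists are illustrative assumptions written in gitignore style; the rules tokenmiser actually writes to `.claudeignore` are not listed in this README and may differ.

```typescript
type Language = "typescript" | "python" | "go" | "rust";

// Illustrative patterns only (assumed, not taken from tokenmiser's source).
const COMMON_PATTERNS: string[] = ["*.lock", "coverage/", "dist/", "build/"];

const LANGUAGE_PATTERNS: Record<Language, string[]> = {
  typescript: ["node_modules/", "package-lock.json", "*.tsbuildinfo"],
  python: ["__pycache__/", "*.pyc", ".venv/"],
  go: ["vendor/"],
  rust: ["target/"],
};

// Combine the common patterns with those of every detected language,
// dropping duplicates while keeping first-seen order.
function ignorePatterns(detected: Language[]): string[] {
  let all = COMMON_PATTERNS.slice();
  for (const lang of detected) {
    all = all.concat(LANGUAGE_PATTERNS[lang]);
  }
  return all.filter((pattern, i) => all.indexOf(pattern) === i);
}

console.log(ignorePatterns(["typescript", "python"]).join("\n"));
```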
### 3. `CLAUDE.md` optimisation
Injects an instruction that tells Claude to always consult `CODEBASE_INDEX.md` before reading any source file. If no `CLAUDE.md` exists, creates a minimal, well-structured template.
### 4. Live index updates via hooks
Installs Claude Code hooks that automatically regenerate `CODEBASE_INDEX.md` after every `Write` or `Edit` call — the index stays current without you doing anything.
A `PreToolUse` hook on `Read` **blocks** full reads of files over ~1,500 lines and tells Claude to use `offset`/`limit` for targeted reads instead.
---
## Real numbers
Index size vs. full repo size, measured on actual projects:
| Project | Language | Files | Full repo | Index only | Index reduction |
|---|---|---|---|---|---|
| Go CLI (ghscope) | Go | 60 | ~40.9k tokens | ~758 tokens | **98%** |
| Next.js website | TypeScript | 37 | ~28.8k tokens | ~541 tokens | **98%** |
| TUI app (snip) | Python | 51 | ~40.5k tokens | ~566 tokens | **99%** |
| tokenmiser itself | TypeScript | 36 | ~22.9k tokens | ~452 tokens | **98%** |
The index reduction is how much smaller `CODEBASE_INDEX.md` is vs. reading every file. **Real per-session savings depend on how many files Claude would have read without the index** — typically 5–10 files per task. In practice that's 50–80% fewer tokens per session, compounding across every task in a conversation.
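Both figures can be reproduced with simple arithmetic. The sketch below assumes the README's rough numbers (~1,500 tokens per file, a ~1,800-token index) and that Claude reads the index plus one target file instead of opening several files blindly; the function names are illustrative.

```typescript
const TOKENS_PER_FILE = 1_500; // README's rough per-file figure
const INDEX_TOKENS = 1_800;    // index size from the sample run

// How much smaller the index is than the full repo, as a whole percentage.
function indexReduction(fullRepoTokens: number, indexTokens: number): number {
  return Math.round((1 - indexTokens / fullRepoTokens) * 100);
}

// Percentage saved on one lookup when Claude reads the index plus one file
// instead of opening `filesWithoutIndex` files.
function lookupSavings(filesWithoutIndex: number): number {
  const without = filesWithoutIndex * TOKENS_PER_FILE;
  const withIndex = INDEX_TOKENS + TOKENS_PER_FILE;
  return Math.round((1 - withIndex / without) * 100);
}

console.log(indexReduction(40_900, 758)); // 98 (Go CLI row)
console.log(indexReduction(40_500, 566)); // 99 (TUI app row)
console.log(lookupSavings(5));            // 56
console.log(lookupSavings(10));           // 78
```

Under these assumptions, a single lookup lands at 56–78%, which sits inside the quoted 50–80% range. The model charges the index read to every lookup, so it is likely conservative for sessions where the index is read only once.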
---
## Install
```bash
# Run directly (no install needed)
npx tokenmiser init
# Or install globally
npm install -g tokenmiser
tokenmiser init
```
**Requirements:** Node.js 18+
---
## Commands
### `tokenmiser init`
One-time setup for a project. Run this first.
```bash
tokenmiser init [options]
Options:
  -d, --dir <dir>   Repository root (default: current directory)
  --hooks           Install Claude Code hooks for live updates (default)
  --no-hooks        Skip hook installation
```
What it does:
- Creates `.claudeignore` with language-aware patterns
- Creates `CODEBASE_INDEX.md` with exports and purpose for every source file
- Injects an index reference into `CLAUDE.md` (or creates the file if it is missing)
- Installs hooks in `.claude/settings.json` unless `--no-hooks` is passed
### `tokenmiser scan`
Read-only analysis. See where your tokens are going.
```bash
tokenmiser scan [-d <dir>]
```
Shows a token heatmap, directory breakdown, index vs. full-read cost, and CLAUDE.md overhead per session.
### `tokenmiser update`
Regenerates `CODEBASE_INDEX.md` after code changes. Fast — uses a file mtime cache so only changed files are re-processed.
```bash
tokenmiser update [-d <dir>] [--quiet]
```
Wire it to a git hook or npm script:
```json
{ "scripts": { "postbuild": "tokenmiser update" } }
```
---
## What gets installed
```
your-project/
├── CODEBASE_INDEX.md     ← commit this
├── .claudeignore         ← commit this
├── CLAUDE.md             ← updated with index instruction
├── .claude/
│   └── settings.json     ← hook config
└── .tokenmiser/
    └── cache.json        ← add to .gitignore
```
Add `.tokenmiser/` to your `.gitignore`. Commit everything else.
---
## Claude Code hooks
**PostToolUse on Write/Edit** — runs `tokenmiser update --quiet` after every file Claude writes or edits. The index stays current throughout the session automatically.
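For orientation, the hook entry for this behaviour might be built as in the sketch below. The object shape follows Claude Code's hooks settings format as I understand it (`matcher` plus a list of `command` hooks). The exact entry tokenmiser writes is not shown in this README, so treat the structure and the `postEditHook` helper as assumptions.

```typescript
// Assumed shape of a Claude Code hook entry; check the Claude Code docs
// and your generated .claude/settings.json for the authoritative form.
interface HookCommand {
  type: "command";
  command: string;
}

interface HookMatcher {
  matcher: string; // tool names, pipe-separated
  hooks: HookCommand[];
}

interface HookSettings {
  hooks: { PostToolUse: HookMatcher[] };
}

// Build a settings fragment that runs `command` after every Write or Edit.
function postEditHook(command: string): HookSettings {
  return {
    hooks: {
      PostToolUse: [
        { matcher: "Write|Edit", hooks: [{ type: "command", command: command }] },
      ],
    },
  };
}

console.log(JSON.stringify(postEditHook("tokenmiser update --quiet"), null, 2));
```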
**PreToolUse on Read** — for files over ~1,500 lines (~21k tokens), blocks the full read and outputs:
```
[tokenmiser] Blocked full read of src/generated/schema.ts (~2,847 lines, ~12k tokens).
Check CODEBASE_INDEX.md for the file map, then read only the section you need using offset/limit.
If you need the full file, re-issue the Read tool call with an explicit offset of 0.
```
Claude can still read large files in sections using `offset`/`limit`, or fully by re-issuing with `offset: 0`.
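The decision described above can be sketched as a pure function. The `ReadInput` field names, the `checkRead` helper, and the message wording are assumptions for illustration. A real hook would also need to read the tool call from its input and report the block to Claude Code, which is omitted here.

```typescript
// Subset of a Read tool call that matters for this decision (assumed field names).
interface ReadInput {
  file_path: string;
  offset?: number;
  limit?: number;
}

const MAX_READ_LINES = 1_500; // threshold quoted above

// Returns a block message, or null when the read should go ahead.
// Any explicit offset or limit (including offset 0) counts as a deliberate read.
function checkRead(input: ReadInput, lineCount: number): string | null {
  const targeted = input.offset !== undefined || input.limit !== undefined;
  if (targeted || lineCount <= MAX_READ_LINES) {
    return null;
  }
  return (
    "[tokenmiser] Blocked full read of " + input.file_path +
    " (~" + lineCount + " lines). Read a section with offset/limit, " +
    "or re-issue with offset 0 for the full file."
  );
}

console.log(checkRead({ file_path: "src/generated/schema.ts" }, 2847));
console.log(checkRead({ file_path: "src/generated/schema.ts", offset: 0 }, 2847)); // null
```

Treating an explicit `offset: 0` as intentional gives Claude a way to read the whole file after the first attempt is blocked.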
---
## How the index is built
For each source file, tokenmiser reads the first 32KB and extracts:
- **Purpose** — first JSDoc comment, docstring, or leading comment
- **Exports** — functions, classes, interfaces, types, constants (capped at 10 per file)
Supported languages: TypeScript, JavaScript, Python, Go, Rust, Java, Kotlin, Ruby, PHP, C/C++.
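A rough idea of what such an extractor involves, for TypeScript/JavaScript only: the sketch below uses simple regular expressions and is not tokenmiser's implementation. It ignores forms such as `export { a, b }`, and the 32KB limit is approximated by character count.

```typescript
const MAX_EXPORTS = 10;       // cap stated above
const HEAD_CHARS = 32 * 1024; // approximates the 32KB read using characters

// Names of top-level `export` declarations, in source order, capped at MAX_EXPORTS.
function extractExports(source: string): string[] {
  const head = source.slice(0, HEAD_CHARS);
  const pattern =
    /^export\s+(?:default\s+)?(?:async\s+)?(?:function|class|interface|type|const|let|enum)\s+([A-Za-z_$][\w$]*)/gm;
  const names: string[] = [];
  let match: RegExpExecArray | null;
  while (names.length < MAX_EXPORTS && (match = pattern.exec(head)) !== null) {
    names.push(match[1]);
  }
  return names;
}

// First non-empty line of a leading JSDoc block, if the file starts with one.
function extractPurpose(source: string): string | undefined {
  const block = /^\s*\/\*\*([\s\S]*?)\*\//.exec(source.slice(0, HEAD_CHARS));
  if (!block) return undefined;
  const lines = block[1].split("\n");
  for (const raw of lines) {
    const line = raw.replace(/^\s*\*?\s?/, "").trim(); // strip the " * " decoration
    if (line) return line;
  }
  return undefined;
}

const sample = [
  "/**",
  " * User authentication",
  " */",
  "export class AuthService {}",
  "export async function login() {}",
  "function internal() {}",
].join("\n");

console.log(extractPurpose(sample)); // User authentication
console.log(extractExports(sample)); // [ 'AuthService', 'login' ]
```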
On `tokenmiser update`, a file mtime cache means only changed files are re-processed — fast even on 1,000+ file codebases.
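The mtime comparison itself is small. The sketch below assumes the cache is a plain path-to-mtime map; the real layout of `.tokenmiser/cache.json` is not documented here, and `changedFiles` and `removedFiles` are hypothetical helper names.

```typescript
// path -> last-seen modification time in milliseconds (assumed cache layout).
type MtimeCache = Record<string, number>;

// Files that are new or whose mtime differs from the cached value.
function changedFiles(cache: MtimeCache, current: MtimeCache): string[] {
  return Object.keys(current).filter((path) => cache[path] !== current[path]);
}

// Files present in the cache that no longer exist on disk.
function removedFiles(cache: MtimeCache, current: MtimeCache): string[] {
  return Object.keys(cache).filter((path) => !(path in current));
}

const cached: MtimeCache = { "src/a.ts": 100, "src/b.ts": 200, "src/c.ts": 300 };
const onDisk: MtimeCache = { "src/a.ts": 100, "src/b.ts": 250, "src/d.ts": 400 };

console.log(changedFiles(cached, onDisk)); // [ 'src/b.ts', 'src/d.ts' ]
console.log(removedFiles(cached, onDisk)); // [ 'src/c.ts' ]
```

Only the files returned by `changedFiles` would need re-parsing. Entries for unchanged files can be reused from the previous run.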
---
## Tips
**Commit `CODEBASE_INDEX.md`** — it doubles as useful documentation and is immediately available to Claude without any scanning.
**Add generated dirs to `.claudeignore`** — if you have a `src/generated/` directory, add it. Generated code is typically the single biggest source of token waste (often 60–80% of all tokens in a project).
**Keep `CLAUDE.md` short** — under 80 lines, under 1,500 tokens. Move detailed docs to separate files that Claude reads on demand.
**Large monorepos** — run `tokenmiser init` in each service directory. Each service gets its own focused index rather than one massive one.
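The `CLAUDE.md` budget from the tips above can be checked mechanically. This sketch assumes a rule of thumb of about four characters per token, which is a common approximation and not tokenmiser's own estimator; `checkClaudeMd` is an illustrative name.

```typescript
const CLAUDE_MD_MAX_LINES = 80;     // "under 80 lines"
const CLAUDE_MD_MAX_TOKENS = 1_500; // "under 1,500 tokens"

// Rough token estimate: ~4 characters per token (assumed heuristic).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Returns a list of budget violations; an empty list means the file fits.
function checkClaudeMd(text: string): string[] {
  const problems: string[] = [];
  const lines = text.split("\n").length;
  const tokens = estimateTokens(text);
  if (lines >= CLAUDE_MD_MAX_LINES) {
    problems.push(lines + " lines (budget: under " + CLAUDE_MD_MAX_LINES + ")");
  }
  if (tokens >= CLAUDE_MD_MAX_TOKENS) {
    problems.push("~" + tokens + " tokens (budget: under " + CLAUDE_MD_MAX_TOKENS + ")");
  }
  return problems;
}

console.log(checkClaudeMd("# Project\nRead CODEBASE_INDEX.md first.")); // []
```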
---
## Contributing
See [CONTRIBUTING.md](CONTRIBUTING.md). PRs welcome — especially new language extractors.
## License
MIT © [phlx0](https://github.com/phlx0)