{"id":50450628,"url":"https://github.com/juninmd/tokenix","last_synced_at":"2026-06-01T00:01:20.222Z","repository":{"id":355739422,"uuid":"1229124312","full_name":"juninmd/tokenix","owner":"juninmd","description":"Local semantic index CLI that reduces LLM token usage 60-90% -- built in Rust, runs 100% offline","archived":false,"fork":false,"pushed_at":"2026-05-31T18:32:16.000Z","size":1761,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-31T19:06:59.111Z","etag":null,"topics":["ai","ai-tools","claude-code","cli","developer-tools","embeddings","ollama","rust","semantic-search","token-optimization"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/juninmd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-04T18:14:48.000Z","updated_at":"2026-05-31T18:29:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/juninmd/tokenix","commit_stats":null,"previous_names":["juninmd/tokenix"],"tags_count":22,"template":false,"template_full_name":null,"purl":"pkg:github/juninmd/tokenix","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juninmd%2Ftokenix","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juninmd%2Ftokenix/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
juninmd%2Ftokenix/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juninmd%2Ftokenix/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/juninmd","download_url":"https://codeload.github.com/juninmd/tokenix/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juninmd%2Ftokenix/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33753925,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","ai-tools","claude-code","cli","developer-tools","embeddings","ollama","rust","semantic-search","token-optimization"],"created_at":"2026-06-01T00:01:06.114Z","updated_at":"2026-06-01T00:01:20.198Z","avatar_url":"https://github.com/juninmd.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"tokenix-logo.png\" alt=\"tokenix logo\" width=\"450\" /\u003e\n\n  \u003ch1\u003etokenix\u003c/h1\u003e\n\n  \u003cp\u003e\u003cstrong\u003eLocal semantic context for AI coding agents, with fewer wasted tokens.\u003c/strong\u003e\u003c/p\u003e\n\n  \u003cp\u003e\n    \u003ca href=\"https://github.com/juninmd/tokenix/releases\"\u003e\u003cimg 
src=\"https://img.shields.io/github/v/release/juninmd/tokenix?style=flat-square\u0026color=orange\u0026label=release\" alt=\"Latest Release\" /\u003e\u003c/a\u003e\n    \u003ca href=\"https://crates.io/crates/tokenix\"\u003e\u003cimg src=\"https://img.shields.io/crates/v/tokenix?style=flat-square\u0026color=orange\" alt=\"crates.io\" /\u003e\u003c/a\u003e\n    \u003ca href=\"https://github.com/juninmd/tokenix/blob/main/LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/license-MIT-blue?style=flat-square\" alt=\"License\" /\u003e\u003c/a\u003e\n    \u003ca href=\"https://www.rust-lang.org/\"\u003e\u003cimg src=\"https://img.shields.io/badge/built%20with-Rust-orange?style=flat-square\u0026logo=rust\" alt=\"Built with Rust\" /\u003e\u003c/a\u003e\n    \u003cimg src=\"https://img.shields.io/badge/platform-Linux%20%7C%20macOS%20%7C%20Windows-lightgrey?style=flat-square\" alt=\"Platforms\" /\u003e\n    \u003cimg src=\"https://img.shields.io/badge/savings-up%20to%2090%25%20tokens-brightgreen?style=flat-square\" alt=\"Token Savings\" /\u003e\n    \u003cimg src=\"https://img.shields.io/badge/no%20Ollama-required-blue?style=flat-square\" alt=\"No Ollama required\" /\u003e\n  \u003c/p\u003e\n\n  \u003cp\u003e\n    \u003ca href=\"#-quick-install\"\u003eInstall\u003c/a\u003e ·\n    \u003ca href=\"#-how-it-works\"\u003eHow it Works\u003c/a\u003e ·\n    \u003ca href=\"#-benchmark\"\u003eBenchmark\u003c/a\u003e ·\n    \u003ca href=\"#-usage\"\u003eUsage\u003c/a\u003e ·\n    \u003ca href=\"#-setup-by-tool\"\u003eSetup\u003c/a\u003e ·\n    \u003ca href=\"CONTRIBUTING.md\"\u003eContributing\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\n---\n\n\u003e **tokenix** is a local-first Rust CLI that helps AI coding agents understand a repository without dumping huge files into the prompt. It indexes your code, finds relevant chunks by meaning, returns compact file outlines, and can hook into AI tools to replace noisy reads and command output with smaller, more useful context. 
Works with Claude Code, GitHub Copilot, and OpenAI Codex CLI. **No Ollama or external server required.**\n\n```\nWithout tokenix:  Read(src/auth/middleware.rs) → 800 lines → ~2,400 tokens  ❌\nWith tokenix:     tokenix read src/auth/middleware.rs → symbol outline → ~180 tokens  ✅\n```\n\nActual savings depend on codebase size, AI behavior, and file sizes. Run `tokenix gain --history` to see your real numbers.\n\n---\n\n## What Is tokenix?\n\nAI coding agents often waste context on the wrong shape of information: entire files, long grep output, repeated build logs, and directory listings that are much larger than the useful signal inside them. tokenix is a context layer between the agent and your repository.\n\nIt does four jobs:\n\n| Job | What tokenix does | Why it matters |\n|---|---|---|\n| **Index the repository** | Walks source files, splits them into symbol-aware chunks, and stores local embeddings in SQLite | The agent can search by intent instead of opening files blindly |\n| **Read files compactly** | Returns outlines, symbols, or line ranges instead of full files when possible | Large files stop consuming thousands of unnecessary tokens |\n| **Intercept assistant tools** | Hooks into supported tools before large reads and after noisy command output | Optimization happens automatically during normal AI sessions |\n| **Measure savings** | Logs hook decisions and estimates token/cost reduction with `tokenix gain` and `tokenix benchmark` | You can prove whether it is actually helping on your codebase |\n\ntokenix is not a cloud service, not a vector database server, and not a replacement for your AI assistant. 
It is a local repository index plus a set of CLI and hook integrations that make the assistant's context smaller and more targeted.\n\n---\n\n## ⚡ Quick Install\n\n### Pre-built binary (recommended)\n\nDownload the latest binary for your platform from [GitHub Releases](https://github.com/juninmd/tokenix/releases):\n\n| Platform | File |\n|---|---|\n| Linux x86_64 | `tokenix-linux-x86_64` |\n| Linux arm64 | `tokenix-linux-aarch64` |\n| macOS x86_64 | `tokenix-macos-x86_64` |\n| macOS arm64 (M1/M2/M3) | `tokenix-macos-aarch64` |\n| Windows x86_64 | `tokenix-windows-x86_64.exe` |\n| Windows x86_64 (GPU / DirectML) | `tokenix-windows-x86_64-directml.exe` |\n\n### From crates.io\n\n```bash\ncargo install tokenix --locked\n```\n\n### From source\n\n```bash\ngit clone https://github.com/juninmd/tokenix\ncd tokenix\ncargo install --path . --locked\n```\n\n\u003e **Use `--locked`.** It builds against the committed `Cargo.lock`; without it `cargo install` re-resolves dependencies and can pull an incompatible `ureq` into the `ort-sys` build script.\n\n\u003e **Requirements:** [Rust](https://www.rust-lang.org/tools/install) `\u003e= 1.75` — that's all. 
No Ollama, no Python, no external services.\n\nThe embedding model (`nomic-embed-text-v1.5-Q`, ~130 MB) is downloaded automatically on first use and cached locally.\n\n---\n\n## ✨ Features\n\n| Feature | Description |\n|---|---|\n| **Semantic search** | Find relevant code by meaning, not just keywords |\n| **One-call MCP context** | `tokenix_context` combines semantic search, entry points, and compact outlines so agents do not burn calls chaining search/read loops |\n| **Graph-aware explore** | `tokenix explore` / `tokenix_explore` returns related symbols, relationship maps, and grouped source in one capped call |\n| **Symbol graph** | `tokenix symbols`, `callers`, `callees`, and `impact` trace relationships between indexed symbols |\n| **Interactive HTML graph** | `tokenix impact --format html` exports a dark-mode vis.js graph with node colours, directional arrows, and physics springs |\n| **Preference memory** | `tokenix memory add/list` stores global and project preferences in editable Markdown; context/explore include saved preferences and capture guidance |\n| **Dynamic language detection** | Map custom file extensions to any built-in parser via a project `.tokenix.toml` — no recompile needed |\n| **Symbol-aware chunking** | AST Tree-sitter parsers for Rust, Python, TypeScript, JavaScript, Go, C++ |\n| **Smart file reader** | Outlines large files; supports `--symbol` and `--lines` reads |\n| **Hook-based interception** | `PreToolUse` intercepts large reads; `PostToolUse` compresses Bash/ListDirectory output |\n| **RTK-grade Compression** | Absorbed RTK features: Fuzzy Grouping (groups `Removing...`, `Compiling...`, etc.), NDJSON/JSON compaction, and ANSI/Emoji stripping |\n| **Local project filters** | Drop `.toml` files in `.tokenix/filters/` for project-scoped compression rules — highest priority over user and bundled filters |\n| **Output filters** | 70+ RTK-compatible TOML filters embedded in the binary — auto-applied to Bash output for `uv`, `cargo`, 
`terraform`, `ansible`, and more |\n| **Incremental branch indexing** | Branch/HEAD switches with identical code auto-update the git fingerprint without re-indexing |\n| **GPU acceleration (opt-in)** | Build with `--features directml` (Windows) or `--features cuda` to run embeddings on GPU (~10× faster indexing); GPU is used by default with automatic CPU fallback, or force CPU with `--only-cpu` |\n| **Environment diagnostics** | `tokenix doctor` reports the compiled backend, detected GPU, CUDA/cuDNN status, model cache, and daemon — with tailored recommendations |\n| **In-memory daemon** | `tokenix serve` keeps model + index in RAM — warm Grep calls drop from ~430ms to ~80ms |\n| **Graceful fallback** | Always exits `0` on errors — your AI session is never broken |\n| **Token budget** | Results fit within a configurable token budget (default `1200`) |\n| **Savings analytics** | `tokenix gain` — token summary, focused cost table for 7 reference models, by-tool breakdown |\n| **Local-first, no dependencies** | fastembed ONNX in-process — no Ollama, no server, no internet after first run |\n\n---\n\n## 🔌 Supported AI Tools\n\n| Tool | Integration |\n|---|---|\n| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | `PreToolUse` + `PostToolUse` hooks in `~/.claude/settings.json` |\n| [GitHub Copilot](https://docs.github.com/en/copilot) | `.github/copilot-instructions.md` + `.github/hooks/hooks.json` |\n| [OpenAI Codex CLI](https://help.openai.com/en/articles/11096431-openai-codex-cli-getting-started) | `~/.codex/hooks.json` + Windows wrapper + optional shell helpers |\n\n---\n\n## 🚀 How It Works\n\ntokenix has two modes:\n\n1. **Manual mode**: run `tokenix query` and `tokenix read` directly when you want compact context.\n2. 
**Hook mode**: install hooks so supported AI tools call tokenix automatically before large reads and after noisy tool output.\n\n### Real-world Compression (RTK Mode)\n\ntokenix now includes advanced output filtering logic inspired by RTK (Rust Token Killer). It doesn't just truncate output; it understands the structure of common CLI tools.\n\n- **Fuzzy Grouping:** Collapses 100s of \"Compiling...\" or \"Removing...\" lines into a single summary line.\n- **Structural Compaction:** Compacts pretty-printed JSON and NDJSON into single-line formats automatically.\n- **Signal Preservation:** Automatically keeps error messages and summaries even when the middle of a log is truncated.\n\n---\n\n## 📊 Benchmark\n\n\u003e Every number below comes from a live benchmark run on the tokenix source, using the actual index, chunking, and query code paths.\n\n### Benchmark Results\n\nWe measure **tokenix** against pure **Vanilla** reads and **RTK** command filtering. `N/A` means the tool does not provide that category of function, not that the measurement failed.\n\n| Metric | **tokenix** | **RTK** | **Vanilla** |\n| :--- | :---: | :---: | :---: |\n| **Large-file read reduction** | **84.8% saved** | N/A | 0% |\n| **Targeted workflow reduction** | **67.2% saved** | N/A | 0% |\n| **Context tokens, avg** | **435** | N/A | 5,050 |\n| **Context homologation** | **4/4** | N/A | **4/4** |\n| **Context latency, avg** | **11ms** | N/A | N/A |\n| **Semantic quality** | **Hit@1 3/4, Hit@3 4/4** | N/A | N/A |\n| **Command compression** | **63.0% saved** | 9.8% saved | 0% |\n| **Command compression vs RTK** | **4/4 equal or lower tokens** | baseline | N/A |\n\n### Capability Matrix\n\nThis table compares what each tool is designed to do. 
It is intentionally separate from the benchmark table so RTK is not judged as a semantic code search tool, and CodeGraph is not judged as a shell-output compressor.\n\n| Capability | **tokenix** | **RTK** | **CodeGraph** | **Vanilla** |\n| :--- | :---: | :---: | :---: | :---: |\n| **Large read interception** | Yes | No | No | No |\n| **Compact file outlines** | Yes | No | No | No |\n| **Symbol-targeted reads** | Yes | No | Yes | No |\n| **Semantic code search** | Yes | No | Yes | No |\n| **Symbol graph / relationships** | Yes | No | Yes | No |\n| **Shell output filtering** | Yes | Yes | No | No |\n| **RTK-compatible filters** | Yes | Native | No | No |\n| **Claude/Codex/Copilot hooks** | Yes | Yes | Partial | No |\n| **Stale-index fail-open guard** | Yes | N/A | N/A | N/A |\n| **Local embeddings / SQLite** | Yes | N/A | N/A | N/A |\n| **Savings analytics** | Yes | Yes | No | No |\n| **MCP support** | Yes | No | Yes | No |\n\n*Results from `cargo run --release -- benchmark --refresh-index` on May 25, 2026.*\n\n### Methodology\n\n- **Large-file read reduction:** full file tokens vs. large-file outline tokens.\n- **Command output compression:** measures the same synthetic command outputs through tokenix and `rtk pipe`; tokenix must be equal or lower tokens per command to avoid a hidden regression.\n- **Semantic search quality:** Hit@1/Hit@3 accuracy on labeled repository queries.\n- **Context homologation:** validates whether each context arm includes the expected file, not just whether it is small.\n- **CodeGraph comparison:** real CodeGraph context tokens and latency are measured from the local CLI, not estimated from README claims.\n\n### Reproduce it\n\n```bash\ncargo run --release -- benchmark --refresh-index\n```\n\nTo include a local CodeGraph comparison:\n```bash\ncargo run --release -- benchmark --refresh-index --compare-codegraph /path/to/codegraph\n```\n\n\n---\n\n## 🛠 Usage\n\n### 1. 
Index your repository\n\n```bash\ncd my-project\ntokenix index .\n```\n\n```\ntokenix indexing /home/user/my-project\n  discovered 42 file(s) — chunking\n  embedding 318 chunks via fastembed (ONNX)...\nDone in 42.3s  ·  42 files indexed  ·  318 chunks  ·  87,412 tokens stored\n```\n\n\u003e **First run:** the model (~130 MB) is downloaded automatically. Subsequent runs use the local cache.\n\n### 2. Semantic search\n\n```bash\ntokenix query \"how does JWT validation work\"\ntokenix query \"database connection pooling\" --budget 2000\n```\n\n### 3. One-call task context\n\n```bash\ntokenix context \"fix login refresh token bug\"\ntokenix context \"how does the indexer batch embeddings\" --budget 2000 --max-files 3\ntokenix explore \"run_hook hook_post compression\" --budget 4000 --max-symbols 8\n```\n\n### 4. Smart file reader\n\n```bash\ntokenix read src/auth/middleware.rs                     # symbol outline\ntokenix read src/auth/middleware.rs --symbol validate_token   # targeted\ntokenix read src/auth/middleware.rs --lines 45-80       # line range\n```\n\n### 5. Symbol graph\n\n```bash\ntokenix symbols validate_token\ntokenix callers validate_token\ntokenix callees run_hook\ntokenix impact update_user --depth 2\ntokenix impact update_user --format html                          # dark-mode vis.js graph\ntokenix impact update_user --format html --output update_user.html --depth 3\ntokenix rebuild-graph   # recompute relationships without re-embedding\n```\n\n### 6. 
Token savings analytics\n\n```bash\ntokenix gain\ntokenix gain --history   # includes last 20 hook events\n```\n\n```\n╭────────────────────────────────────────────────────────────────╮\n│ tokenix gain  ·  my-project                                   │\n╰────────────────────────────────────────────────────────────────╯\n\n  TOKEN SUMMARY                              HOOK CALLS\n  Original (would-be)               332,068    Total                       349\n  After optimization                214,646    Intercepted            148  (42%)\n  Saved                             240,091    Passed through              201\n  Reduction                  72.3%  [█████████████░░░░░]\n\n  COST ESTIMATE  (input tokens · USD)\n    Prices per 1M input tokens from public provider pricing pages. Collected: 2026-05-07.\n\n      Model                          $/1M in       Without          With         Saved\n      ───────────────────────────  ─────────  ────────────  ────────────  ────────────\n      claude-haiku-4-5                 $1.00       $0.3321       $0.2146       $0.1174\n      claude-sonnet-4.6 ★              $3.00       $0.9962       $0.6439       $0.3523\n      claude-opus-4.7                  $5.00       $1.6603       $1.0732       $0.5871\n      gpt-5.4-mini                     $0.75       $0.2491       $0.1610       $0.0881\n      gpt-5.4                          $2.50       $0.8302       $0.5366       $0.2936\n      gemini-3.1-flash-preview         $0.25       $0.0830       $0.0537       $0.0294\n      gemini-3.1-pro-preview           $2.00       $0.6641       $0.4293       $0.2348\n      ★ reference model · prices collected 2026-05-07\n\n  BY TOOL\n  Read    59 calls   228,974 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓░\n  Grep    87 calls    11,094 ▓░░░░░░░░░░░░░░░░░░░\n  Bash     2 calls        23 ░░░░░░░░░░░░░░░░░░░░\n```\n\nThe cost table intentionally stays small: 7 reference models across Anthropic, OpenAI, and Google. 
Prices are shown with the collection date so benchmark reports stay auditable.\n\n---\n\n## 🔧 Setup by Tool\n\n### Claude Code\n\n```bash\ntokenix install-hook --tool claude-code\n```\n\nWrites `PreToolUse` and `PostToolUse` hooks to `~/.claude/settings.json` (or `.claude/settings.json` with `--local`). Large reads and semantic greps are intercepted automatically — no changes to your prompts needed.\n\n### GitHub Copilot\n\n```bash\ncd my-project\ntokenix install-hook --tool copilot\ngit add .github/\ngit commit -m \"chore: add tokenix context instructions\"\n```\n\nCreates `.github/copilot-instructions.md` and `.github/hooks/hooks.json`.\n\n### OpenAI Codex CLI\n\n```bash\ntokenix install-hook --tool codex\n# bash / zsh\necho 'source ~/.codex/tokenix-init.sh' \u003e\u003e ~/.bashrc\n# PowerShell\necho '. ~/.codex/tokenix-init.ps1' \u003e\u003e $PROFILE\n```\n\nThen use `tx-read` and `tx-query` as shell helpers.\n\nOn Windows, this also installs `~/.codex/hooks.json` and\n`~/.codex/tokenix-codex-hook.ps1`. 
The wrapper keeps `PreToolUse` intercepts\nactive, but makes `PostToolUse` fail open so Codex does not report compressed\nBash output as a failed hook.\n\n### All tools at once\n\n```bash\ntokenix install-hook --tool all\n```\n\n---\n\n## 📖 Commands Reference\n\n| Command | Description |\n|---|---|\n| `tokenix index [PATH]` | Index the repo at PATH (default `.`) |\n| `tokenix query TEXT` | Semantic search over indexed chunks |\n| `tokenix context TEXT` | One-call task context: entry points, relevant source, compact outlines |\n| `tokenix explore TEXT` | Graph-aware exploration: entry points, relationships, grouped source |\n| `tokenix memory add TEXT` | Save a project preference for future context |\n| `tokenix memory add --global TEXT` | Save a global preference for future context |\n| `tokenix memory list` | List global and project preferences |\n| `tokenix memory remove QUERY` | Remove matching project preferences |\n| `tokenix memory edit QUERY REPLACEMENT` | Replace matching project preferences |\n| `tokenix read FILE` | Smart reader — outline for large files, full for small |\n| `tokenix symbols QUERY` | Find indexed symbols by name or path |\n| `tokenix callers SYMBOL` | Show symbols that call/reference a symbol |\n| `tokenix callees SYMBOL` | Show symbols called/referenced by a symbol |\n| `tokenix impact SYMBOL` | Show bidirectional impact graph around a symbol |\n| `tokenix impact SYMBOL --format html` | Export interactive vis.js HTML graph (dark mode, physics, colour-coded by kind) |\n| `tokenix impact SYMBOL --format html --output FILE.html` | Save HTML graph to a specific path |\n| `tokenix rebuild-graph` | Rebuild graph tables from existing indexed chunks without re-embedding |\n| `tokenix gain` | Token savings analytics with per-model cost table |\n| `tokenix gain --history` | Same, plus last 20 hook events |\n| `tokenix benchmark` | Reproducible savings and semantic-quality benchmark |\n| `tokenix benchmark --compare-codegraph PATH` | Add a 
lightweight local CodeGraph comparison section |\n| `tokenix stats` | Index statistics (files, chunks, tokens, age) |\n| `tokenix serve [--port N]` | Start background embedding daemon (keeps model + index in RAM) |\n| `tokenix stop` | Stop the background daemon |\n| `tokenix doctor` | Diagnose embedding backend, GPU availability, model cache, and daemon |\n| `tokenix filter list` | Show top Bash commands by tokens wasted (no filter yet) |\n| `tokenix filter active` | Show active user and bundled output filters |\n| `tokenix filter generate [CMD]` | AI-generate a TOML output filter for a command |\n| `tokenix install-hook` | Install assistant hook/instructions (default `--tool all`) |\n| `tokenix remove-hook` | Remove assistant hook/instructions (default `--tool all`) |\n| `tokenix hook` | `PreToolUse` handler — intercepts large reads (called by AI tools) |\n| `tokenix hook-post` | `PostToolUse` handler — compresses Bash/ListDirectory output (called by AI tools) |\n| `tokenix mcp` | MCP server exposing context, read/search, graph, and gain tools |\n\n\u003cdetails\u003e\n\u003csummary\u003eFlag reference\u003c/summary\u003e\n\n**Global**\n\n| Flag | Default | Description |\n|---|---|---|\n| `--only-cpu` | false | Force CPU embedding even on a GPU-enabled build (no-op on CPU-only builds) |\n\n**`tokenix index`**\n\n| Flag | Default | Description |\n|---|---|---|\n| `--force`, `-f` | false | Reindex all files, ignoring cache |\n| `--cpu-profile` | `default` | Resource profile: `low` (1 worker, tiny batches, pause between batches), `default`, `max` (all cores, large batches) |\n| `--jobs N` | env/default | Set max rayon worker threads for indexing |\n| `--embed-batch N` | 16 (CPU) / 64 (GPU) | Embedding batch size; drives peak memory — lower it if RAM/VRAM is tight |\n| `--if-stale` | false | Skip if index is fresh for the current Git worktree/branch/HEAD |\n\n**`tokenix query`**\n\n| Flag | Default | Description |\n|---|---|---|\n| `--budget`, `-b` | 1200 | Max 
approximate tokens to return |\n| `--k` | 20 | Candidate chunks before budget filtering |\n| `--file`, `-f` | — | Filter results to a specific file |\n| `--path`, `-p` | `.` | Repository/index path |\n\n**`tokenix benchmark`**\n\n| Flag | Default | Description |\n|---|---|---|\n| `--refresh-index` | false | Refresh index metadata before measuring |\n| `--budget` | 1200 | Semantic query token budget |\n| `--compare-codegraph` | — | Path to a local CodeGraph checkout; prints measured CodeGraph context tokens/latency |\n| `--path`, `-p` | `.` | Repository/index path |\n\n**`tokenix install-hook` / `tokenix remove-hook`**\n\n| Flag | Values | Description |\n|---|---|---|\n| `--tool` | `claude-code`, `copilot`, `codex`, `all` | Target tool (default `all`) |\n| `--local` | — | Claude Code: use `.claude/settings.json` instead of global |\n\n\u003c/details\u003e\n\n---\n\n## 🧠 Supported Languages\n\n| Language | Extensions | Symbol types |\n|---|---|---|\n| Rust | `.rs` | `fn`, `struct`, `enum`, `impl`, `trait`, `mod` |\n| Python | `.py` | `def`, `async def`, `class` |\n| TypeScript | `.ts`, `.tsx` | `function`, `class`, `interface`, `type`, arrow functions |\n| JavaScript | `.js`, `.jsx`, `.mjs`, `.cjs` | `function`, `class`, arrow functions |\n| Go | `.go` | `func`, `type` |\n| C / C++ | `.c`, `.cpp`, `.h`, `.hpp`, `.cc`, `.cxx` | `function`, `class`, `struct`, `namespace` |\n| Config / Docs | `.toml`, `.md`, `.txt`, `.sh`, `.bash` | 400-token line blocks |\n| Data files (opt-in) | `.json`, `.yaml`, `.yml` | Indexed only when `data_files = true` in `.tokenix.toml` |\n| **Custom** | any extension | Mapped to an existing parser via `.tokenix.toml` |\n\nLanguages without a symbol-aware chunker (Java, C#, Ruby, Swift, Kotlin, Scala, …) are not indexed — blind line-block chunking produces low-quality search results and is intentionally excluded.\n\n### Custom language mapping\n\nCreate a `.tokenix.toml` (or `tokenix.toml`) in the project root:\n\n```toml\n[languages]\n# map 
custom extensions to existing parsers\npyi   = \"python\"    # Python stub files\nmts   = \"typescript\"  # TypeScript module files\nlua   = \"generic\"   # use sliding-window chunks\n```\n\nValid parser values: `rust`, `python`, `typescript`, `javascript`, `go`, `cpp`, `c`, `generic`.\n\n---\n\n## 🔧 Output Filters\n\ntokenix compresses `Bash` and `ListDirectory` output via a `PostToolUse` hook before Claude sees it. Claude uses `hook-post` directly, where `exit 2` means \"replace the tool output with compressed context\". Codex uses a small wrapper that treats post-tool compression as success because Codex reports non-zero post hooks as failures. Filtering happens in three layers (highest priority first):\n\n1. **Local project filters** — drop `.toml` files in `.tokenix/filters/` inside the repository. Scoped to the project, committed to version control, shared with the team.\n2. **User filters** — drop `.toml` files in `~/.tokenix/filters/`. Take priority over bundled filters, apply to all projects.\n3. **Bundled filters** — 70 RTK-compatible TOML filters shipped inside the binary, covering `uv sync`, `cargo build`, `gradle`, `terraform plan`, `make`, `npm`, `poetry`, `docker`, and more. 
Applied automatically — no setup needed.\n\n### Filter format\n\n```toml\n[filters.uv-sync]\ndescription = \"Compact uv sync output\"\nmatch_command = \"^uv\\\\s+(sync|pip\\\\s+install)\\\\b\"\nstrip_ansi = true\nstrip_lines_matching = [\"^\\\\s*$\", \"^\\\\s+Downloading \", \"^\\\\s+Using cached \"]\nmatch_output = [\n  { pattern = \"Audited \\\\d+ package\", message = \"ok (up to date)\" },\n]\nmax_lines = 20\non_empty = \"uv: ok\"\n```\n\n| Field | Description |\n|---|---|\n| `match_command` | Rust regex matched against the full Bash command line |\n| `strip_ansi` | Remove ANSI colour codes before filtering |\n| `strip_lines_matching` | Drop lines matching any of these regex patterns |\n| `keep_lines_matching` | Keep only lines matching these patterns (signal/noise) |\n| `match_output` | Short-circuit: if output matches `pattern`, return `message` immediately |\n| `max_lines` / `head_lines` / `tail_lines` | Truncate output |\n| `truncate_lines_at` | Truncate individual lines at N characters |\n| `on_empty` | Message to return when filtering produces empty output |\n\n### AI-assisted filter generation\n\n```bash\n# See which commands waste the most tokens (no filter yet)\ntokenix filter list\n\n# Show all active user and bundled RTK-compatible filters\ntokenix filter active\n\n# Generate a TOML filter using a local AI CLI (claude, gh copilot, etc.)\ntokenix filter generate \"cargo test\"\n\n# Save to user filters directory\n# → ~/.tokenix/filters/cargo-test.toml\n```\n\n---\n\n## 🏗 Architecture\n\n```\nsrc/\n├── main.rs        CLI entry (clap), command dispatch, install-hook helpers\n├── chunker.rs     Symbol-aware AST chunking (Tree-sitter) + dynamic language config (.tokenix.toml)\n├── embed.rs       fastembed ONNX: embed_documents(), embed_query() — optional GPU via ort features\n├── store.rs       SQLite schema, CRUD, FTS5, hybrid search, incremental branch fingerprint check\n├── indexer.rs     File walker + incremental index pipeline (parallel chunking + 
batch embedding)\n├── query.rs       Hybrid semantic + sparse FTS5 ranking, token-budget selection, result formatting\n├── graph.rs       Symbol relationship graph + export_relations_to_html() for vis.js HTML output\n├── hook.rs        PreToolUse handler — Claude-style and Copilot-style JSON input\n├── daemon.rs      Background TCP server — holds model + in-memory embedding cache\n├── compress.rs    PostToolUse compression pipeline (Bash/ListDirectory output)\n├── filters.rs     FilterDef, load_local/user/bundled_filters(), priority merge, apply_filter()\n├── cmd_filter.rs  `tokenix filter` subcommands (list, active, generate)\n└── gain.rs        Analytics from .tokenix/hook.log — per-model cost table\n\nassets/\n└── filters/       70 RTK-compatible TOML filters, embedded in the binary via rust-embed\n```\n\n### GPU Acceleration (opt-in)\n\nA default build runs embeddings on CPU. Compile with a GPU feature to use the GPU — it then becomes the **default at runtime, with automatic CPU fallback** if the provider is unavailable:\n\n```bash\n# Windows — DirectML (works with any D3D12-capable GPU, no CUDA toolkit required)\ncargo install --path . --features directml --locked\n\n# Linux / Windows — CUDA (needs CUDA 12.x + cuDNN 9.x installed and on PATH;\n# ort rc.9 does not support CUDA 13 yet)\ncargo install --path . --features cuda --locked\n```\n\n\u003e **Use `--locked`.** `cargo install` otherwise re-resolves dependencies and can pull an incompatible `ureq` into the `ort-sys` build script. `--locked` builds against the committed `Cargo.lock`.\n\nOn a GPU build, force CPU per-invocation with the global `--only-cpu` flag:\n\n```bash\ntokenix index .              # uses the GPU\ntokenix --only-cpu index .   
# forces CPU on a GPU build\n```\n\nRun `tokenix doctor` to see the compiled backend, detected GPU, CUDA/cuDNN status, and tailored recommendations.\n\n\u003e **GPU throughput (measured, RTX 4060 Ti / DirectML):** ~10× faster indexing than CPU (a 10k-chunk repo dropped from ~54 min to ~6 min). Peak RAM is bounded by the embedding batch size — `--embed-batch` defaults to 16 on CPU (~2.8 GB peak) and 64 on GPU.\n\nStorage lives at `~/.tokenix/\u003cproject-id\u003e.db` (global, one DB per project). Embeddings are stored as raw `float32` blobs. Cosine similarity is computed in Rust — no external vector database needed.\n\n### Daemon\n\nThe background daemon (`tokenix serve`) keeps the ~130 MB ONNX model (about 293 MB once loaded) and all project embeddings in RAM. Hook calls route over TCP loopback instead of reloading the model on each subprocess invocation:\n\n```\nWithout daemon:  hook process → load model (293 MB) → embed → search SQLite → exit  ~430ms\nWith daemon:     hook process → TCP → daemon (model already loaded) → search RAM →  ~80ms\n```\n\nThe daemon **auto-starts** on the first Grep hook call — you don't need to run it manually. Multiple parallel hook calls share a single model instance, capping model memory at 293 MB regardless of concurrency.\n\n### Embedding model\n\n| Property | Value |\n|---|---|\n| Model | `nomic-embed-text-v1.5` (quantized int8) |\n| Dimensions | 768 |\n| File size | ~130 MB |\n| Cache location | `%LOCALAPPDATA%\\tokenix\\models` (Windows) / `~/.cache/tokenix/models` (Linux/macOS) |\n| Download | Automatic on first run |\n| Runtime | fastembed (ONNX Runtime, in-process) |\n\n---\n\n## 🤝 Contributing\n\nContributions are welcome! 
See [CONTRIBUTING.md](CONTRIBUTING.md) for how to get started.\n\n---\n\n## 📄 License\n\n[MIT](LICENSE)\n\n\u003c!-- GitHub Topics: rust cli llm token-optimization semantic-search embeddings fastembed onnx claude-code copilot ai-tools code-assistant developer-tools no-ollama --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuninmd%2Ftokenix","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuninmd%2Ftokenix","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuninmd%2Ftokenix/lists"}