https://github.com/effortlessmetrics/tokmd
Code intelligence for humans, machines, and LLMs: receipts, metrics, and insights from your codebase.
ai-tools ci cli code-metrics code-statistics csv devex github-actions jsonl llm llmops loc markdown monorepo repository-analysis rust sloc tokei tsv
- Host: GitHub
- URL: https://github.com/effortlessmetrics/tokmd
- Owner: EffortlessMetrics
- License: apache-2.0
- Created: 2026-01-25T12:20:09.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-02-14T14:12:28.000Z (27 days ago)
- Last Synced: 2026-02-14T19:39:15.332Z (27 days ago)
- Topics: ai-tools, ci, cli, code-metrics, code-statistics, csv, devex, github-actions, jsonl, llm, llmops, loc, markdown, monorepo, repository-analysis, rust, sloc, tokei, tsv
- Language: Rust
- Homepage: https://www.tokmd.org/
- Size: 2.51 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 80
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE-APACHE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Roadmap: ROADMAP.md
- Agents: AGENTS.md
README
# tokmd
> **Code intelligence for humans, machines, and LLMs: receipts, metrics, and insights from your codebase.**
[crates.io](https://crates.io/crates/tokmd)
[docs.rs](https://docs.rs/tokmd)
[CI](https://github.com/EffortlessMetrics/tokmd/actions/workflows/ci.yml)
[GitHub Marketplace](https://github.com/marketplace/actions/tokmd-receipt)
## The Pain
Repo stats are easy to compute (cloc, tokei) but annoying to **use**.
- Shell scripts to pipe `tokei` output are fragile.
- Different OSes (Windows/Linux) produce slightly different outputs.
- Pasting raw line counts into LLMs (ChatGPT/Claude) is messy and unstructured.
- Checking "did this PR bloat the codebase?" requires deterministic diffs.
- Raw counts don't tell you where the risk is or what needs attention.
## The Product
`tokmd` is a lightweight code intelligence tool built on the excellent [`tokei`](https://github.com/XAMPPRocky/tokei) library. It produces **receipts** (deterministic artifacts) and **analysis** (derived insights).
- **Receipts**: Schema'd, normalized artifacts that represent the "shape" of your code.
- **Analysis**: Derived metrics like doc density, test density, hotspots, and effort estimation.
- **Outputs**: Markdown summaries, JSON/JSONL/CSV datasets, SVG badges, Mermaid diagrams.
## Quickstart
```bash
# Install (recommended)
nix profile install github:EffortlessMetrics/tokmd
# Or install with Cargo
cargo install tokmd
# 1. Markdown summary (great for PR descriptions)
tokmd > summary.md
# 2. Module breakdown (great for monorepos)
tokmd module --module-roots crates,packages
# 3. Pack for LLM context (smart selection)
tokmd context --budget 128k --output bundle > context.txt
# 4. Analysis report (derived metrics)
tokmd analyze --preset receipt --format md
# 5. Git risk analysis (hotspots, freshness)
tokmd analyze --preset risk --format md
# 6. Generate a badge
tokmd badge --metric lines --output badge.svg
# 7. Diff two states
tokmd diff main HEAD
```
## Use Cases
### LLM Context Packing
Pack files into an LLM context window with budget-aware selection:
```bash
tokmd context --budget 128k --output bundle > context.txt # Ready to paste
tokmd context --budget 200k --strategy spread # Coverage across modules
tokmd context --budget 100k --output bundle --compress > context.txt # Strip blank lines for density
```
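The idea of budget-aware selection can be sketched in a few lines of Rust. This is a minimal illustration assuming a greedy, largest-first strategy; `FileEntry`, `parse_budget`, and `pack_greedy` are hypothetical names, and tokmd's actual selection logic may differ.

```rust
/// Hypothetical file record: a path plus an estimated token count.
#[derive(Debug, Clone, PartialEq)]
struct FileEntry {
    path: String,
    tokens: usize,
}

/// Parse a budget such as "128k" or "2000" into a token count (assumes k = 1000).
fn parse_budget(s: &str) -> Option<usize> {
    let s = s.trim().to_ascii_lowercase();
    match s.strip_suffix('k') {
        Some(n) => n.parse::<usize>().ok().map(|v| v * 1000),
        None => s.parse::<usize>().ok(),
    }
}

/// Greedily pick the largest files that still fit in the remaining budget.
fn pack_greedy(mut files: Vec<FileEntry>, budget: usize) -> Vec<FileEntry> {
    // Sort by tokens descending, then by path, so ties resolve deterministically.
    files.sort_by(|a, b| b.tokens.cmp(&a.tokens).then_with(|| a.path.cmp(&b.path)));
    let mut used = 0;
    let mut picked = Vec::new();
    for f in files {
        if used + f.tokens <= budget {
            used += f.tokens;
            picked.push(f);
        }
    }
    picked
}

fn main() {
    let files = vec![
        FileEntry { path: "src/lib.rs".into(), tokens: 600 },
        FileEntry { path: "src/main.rs".into(), tokens: 400 },
        FileEntry { path: "README.md".into(), tokens: 300 },
    ];
    let budget = parse_budget("1k").expect("valid budget");
    for f in pack_greedy(files, budget) {
        println!("{} ({} tokens)", f.path, f.tokens);
    }
}
```

A "spread" strategy would presumably trade some of this volume for coverage across modules; the sketch only shows the simplest volume-first case.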
### LLM Context Planning
Preview which files fit your context window before packing them:
```bash
# Pack top files by code volume into 128k tokens
tokmd context --budget 128k --output bundle > context.txt
# Or generate a manifest to see what fits
tokmd context --budget 128k --output list
```
### PR Summaries
Add a `tokmd` summary to show reviewers the repo shape:
```bash
tokmd --format md --top 5
```
### CI Artifacts & Diffs
Track changes over time:
```bash
tokmd run --output-dir .runs/$(git rev-parse --short HEAD)
tokmd diff .runs/baseline .runs/current
```
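Conceptually, a diff between two runs reduces to comparing per-language totals. The sketch below assumes each receipt has already been loaded into a map of language name to line count; `diff_lines` is a hypothetical helper, not part of tokmd's API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Compute per-language line deltas between two hypothetical receipts
/// (maps of language name -> lines of code). Unchanged languages are omitted.
fn diff_lines(
    base: &BTreeMap<String, i64>,
    head: &BTreeMap<String, i64>,
) -> BTreeMap<String, i64> {
    // Union of languages seen in either receipt, in sorted order.
    let langs: BTreeSet<&String> = base.keys().chain(head.keys()).collect();
    let mut out = BTreeMap::new();
    for lang in langs {
        let before = base.get(lang).copied().unwrap_or(0);
        let after = head.get(lang).copied().unwrap_or(0);
        if after != before {
            out.insert(lang.clone(), after - before);
        }
    }
    out
}

fn main() {
    let base = BTreeMap::from([("Rust".to_string(), 1000), ("TOML".to_string(), 50)]);
    let head = BTreeMap::from([
        ("Markdown".to_string(), 80),
        ("Rust".to_string(), 1200),
        ("TOML".to_string(), 50),
    ]);
    // BTreeMap iterates in key order, so the report is stable across runs.
    for (lang, delta) in diff_lines(&base, &head) {
        println!("{lang}: {delta:+}");
    }
}
```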
### Health Checks
Quick code quality signals:
```bash
tokmd analyze --preset health # Doc density, TODO count, test ratio
tokmd analyze --preset risk # Hotspots, coupling, freshness
```
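To make two of these signals concrete, here is a rough sketch of how doc density and a TODO count could be computed. The definitions are assumptions for illustration (comment lines divided by comment plus code lines, and a plain substring match); the source does not spell out tokmd's exact formulas.

```rust
/// Share of comment lines among comment + code lines (0.0 when both are zero).
fn doc_density(code_lines: usize, comment_lines: usize) -> f64 {
    let total = code_lines + comment_lines;
    if total == 0 {
        0.0
    } else {
        comment_lines as f64 / total as f64
    }
}

/// Count lines that mention TODO or FIXME.
fn todo_count(source: &str) -> usize {
    source
        .lines()
        .filter(|line| line.contains("TODO") || line.contains("FIXME"))
        .count()
}

fn main() {
    let source = "// TODO: handle errors\nlet a = 1;\n// FIXME later\n";
    println!("doc density: {:.2}", doc_density(75, 25));
    println!("TODO/FIXME lines: {}", todo_count(source));
}
```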
## Commands
| Command | Purpose |
| :------------------- | :-------------------------------------------------------------- |
| `tokmd` | Language summary (lines, files, bytes). |
| `tokmd module` | Group stats by directories (`crates/`, `src/`). |
| `tokmd context` | Pack files into an LLM context window. |
| `tokmd export` | File-level dataset (JSONL/CSV/CycloneDX) for downstream tools. |
| `tokmd run` | Full scan with artifact output to a directory. |
| `tokmd analyze` | Derived metrics and enrichments. |
| `tokmd badge` | SVG badge for a metric (lines, tokens, doc%). |
| `tokmd diff` | Compare two runs, receipts, or git refs. |
| `tokmd cockpit` | PR metrics for code review (evidence gates, risk, review plan). |
| `tokmd sensor` | Conforming sensor producing `sensor.report.v1` envelope. |
| `tokmd gate` | Policy-based quality gates with JSON pointer rules. |
| `tokmd baseline` | Capture complexity baseline for trend tracking. |
| `tokmd handoff` | Bundle codebase for LLM handoff with intelligence presets. |
| `tokmd tools` | Generate LLM tool definitions (OpenAI, Anthropic, JSON Schema). |
| `tokmd init` | Generate a `.tokeignore` file (supports templates). |
| `tokmd check-ignore` | Explain why files are being ignored (troubleshooting). |
| `tokmd completions` | Generate shell completions (bash, zsh, fish, powershell). |
## Analysis Presets
`tokmd analyze` provides focused analysis bundles:
| Preset | What You Get |
| :------------- | :-------------------------------------------------------------------------- |
| `receipt` | Totals, doc density, test density, distribution, COCOMO, context window fit |
| `health` | + TODO/FIXME density, complexity, Halstead metrics |
| `risk` | + Git hotspots, coupling, freshness, complexity, Halstead metrics |
| `supply` | + Asset inventory, dependency lockfile summary |
| `architecture` | + Import/dependency graph |
| `topics` | Semantic topic clouds from path analysis |
| `security` | License detection, high-entropy file scanning |
| `identity` | Project archetype, corporate fingerprint |
| `git` | Predictive churn, trend analysis |
| `deep` | Everything (except fun) |
| `fun` | Eco-label, novelty outputs |
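
For readers unfamiliar with COCOMO, the effort estimate in the `receipt` preset can be illustrated with the textbook Basic COCOMO formulas. The source does not say which variant or coefficients tokmd uses, so the organic-mode constants below (2.4, 1.05, 2.5, 0.38) are an assumption, and the function names are hypothetical.

```rust
/// Basic COCOMO, organic mode: effort in person-months from lines of code.
fn cocomo_effort(lines_of_code: u64) -> f64 {
    let kloc = lines_of_code as f64 / 1000.0;
    2.4 * kloc.powf(1.05)
}

/// Basic COCOMO, organic mode: schedule in calendar months from effort.
fn cocomo_schedule(effort_pm: f64) -> f64 {
    2.5 * effort_pm.powf(0.38)
}

fn main() {
    let effort = cocomo_effort(25_000);
    let months = cocomo_schedule(effort);
    println!("effort: {effort:.1} person-months, schedule: {months:.1} months");
}
```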
## Key Features
### Deterministic Output
The same input always produces the same output, which is essential for diffs and CI.
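
Two common ingredients of deterministic output are path normalization and a total sort order. The following is a sketch under those assumptions, with a hypothetical `Row` type; it is not tokmd's actual receipt structure.

```rust
/// Hypothetical receipt row.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Row {
    path: String,
    code: u64,
}

/// Use forward slashes everywhere so Windows and Unix paths compare equal.
fn normalize_path(p: &str) -> String {
    p.replace('\\', "/")
}

/// Normalize paths, then sort by code descending with path as the tie-breaker.
fn canonicalize(mut rows: Vec<Row>) -> Vec<Row> {
    for r in rows.iter_mut() {
        r.path = normalize_path(&r.path);
    }
    rows.sort_by(|a, b| b.code.cmp(&a.code).then_with(|| a.path.cmp(&b.path)));
    rows
}

fn main() {
    let rows = vec![
        Row { path: "src\\b.rs".into(), code: 10 },
        Row { path: "src/a.rs".into(), code: 10 },
        Row { path: "src/lib.rs".into(), code: 42 },
    ];
    for r in canonicalize(rows) {
        println!("{}\t{}", r.code, r.path);
    }
}
```

The tie-breaker matters: without it, rows with equal counts could appear in whatever order the filesystem walk produced them.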
### Schema Versioning
All JSON outputs include `schema_version`. Breaking changes increment the version.
### Token Estimation
Every file row includes estimated tokens for LLM context planning.
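
A token estimate is typically derived from file size rather than from running a real tokenizer. The sketch below assumes roughly 4 bytes per token, a common rule of thumb; the ratio tokmd actually applies is not stated here, and `estimate_tokens` is a hypothetical name.

```rust
/// Assumed heuristic: about 4 bytes per token (not tokmd's documented ratio).
const BYTES_PER_TOKEN: u64 = 4;

/// Estimate tokens from a byte count, rounding up so small files are never 0.
fn estimate_tokens(bytes: u64) -> u64 {
    (bytes + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}

fn main() {
    let source = "fn main() { println!(\"hello\"); }";
    let bytes = source.len() as u64;
    println!("{} bytes -> ~{} tokens", bytes, estimate_tokens(bytes));
}
```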
### Redaction
Share receipts safely without leaking internal paths:
```bash
tokmd export --redact all # Hash paths and module names
```
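The principle behind path redaction is to replace each name with a stable digest, so that receipts remain comparable without exposing the original paths. The sketch below uses 64-bit FNV-1a purely because it fits in a few lines of std-only Rust; it is not a cryptographic hash, and the hash tokmd actually uses is not stated in this README. `redact_path` is a hypothetical helper.

```rust
/// 64-bit FNV-1a. The algorithm is fixed, so digests are stable across runs
/// and platforms (std's `DefaultHasher` does not promise a stable algorithm).
fn fnv1a64(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Replace a path with a fixed-width hex digest.
fn redact_path(path: &str) -> String {
    format!("{:016x}", fnv1a64(path.as_bytes()))
}

fn main() {
    for path in ["src/main.rs", "internal/secret_project/lib.rs"] {
        println!("{} -> {}", path, redact_path(path));
    }
}
```

For anything security-sensitive, a keyed or cryptographic hash would be the safer choice, since short paths hashed with a fast public function can be guessed by brute force.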
### Context Window Analysis
Check if your codebase fits in an LLM's context:
```bash
tokmd analyze --window 128000 # Approximately 128k tokens
```
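The underlying check is simple arithmetic: compare the estimated token total with the window size and report the share used. A minimal sketch, with `window_fit` as a hypothetical helper:

```rust
/// Report whether `total_tokens` fits in `window` and what percentage of the
/// window it would use. Returns `None` for a zero-sized window.
fn window_fit(total_tokens: u64, window: u64) -> Option<(bool, f64)> {
    if window == 0 {
        return None;
    }
    let pct = total_tokens as f64 / window as f64 * 100.0;
    Some((total_tokens <= window, pct))
}

fn main() {
    match window_fit(64_000, 128_000) {
        Some((fits, pct)) => println!("fits: {fits}, uses {pct:.1}% of the window"),
        None => println!("window must be greater than zero"),
    }
}
```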
### Git Integration
Analyze git history for risk signals:
- **Hotspots**: Files with high churn AND high complexity
- **Freshness**: Stale modules that may need attention
- **Coupling**: Files that frequently change together
- **Bus Factor**: Modules with single-author risk
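
As an illustration of the hotspot idea, one simple scoring scheme multiplies churn (commit count) by a complexity measure and ranks files by the product. This is an assumed formula for the sake of example; `FileStat` and `hotspots` are hypothetical, and tokmd's real scoring may weigh the inputs differently.

```rust
/// Hypothetical per-file inputs: commits touching the file and a complexity score.
#[derive(Debug, Clone)]
struct FileStat {
    path: String,
    commits: u32,
    complexity: u32,
}

/// Rank files by churn x complexity and keep the `top` highest scores.
fn hotspots(files: Vec<FileStat>, top: usize) -> Vec<(String, u64)> {
    let mut scored: Vec<(String, u64)> = files
        .into_iter()
        .map(|f| (f.path, f.commits as u64 * f.complexity as u64))
        .collect();
    // Highest score first; path breaks ties so the ranking is deterministic.
    scored.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    scored.truncate(top);
    scored
}

fn main() {
    let files = vec![
        FileStat { path: "a.rs".into(), commits: 10, complexity: 2 },
        FileStat { path: "b.rs".into(), commits: 3, complexity: 30 },
        FileStat { path: "c.rs".into(), commits: 50, complexity: 1 },
    ];
    for (path, score) in hotspots(files, 2) {
        println!("{path}: {score}");
    }
}
```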
## Why tokmd over tokei?
| Feature | `tokei` | `tokmd` |
| :------------------ | :----------------------- | :--------------------------------------- |
| **Core Value** | Fast, accurate counting | Counting + Analysis + Workflow |
| **Output** | Terminal tables | Receipts (Md/JSON/CSV), Badges, Diagrams |
| **Stability** | Output varies by version | Strict schema versioning |
| **LLM Ready** | No | Token estimates, context fit analysis |
| **Git Analysis** | No | Hotspots, freshness, coupling |
| **Derived Metrics** | No | Doc density, COCOMO, distribution |
## Configuration
You can persist settings in a `tokmd.toml` file in your project root or home directory.
```toml
# tokmd.toml
[view.llm]
format = "jsonl"
redact = "paths"
min_code = 10
max_rows = 500
[export]
format = "jsonl"
```
Each `[view.<name>]` table defines a profile. Select one with the `--profile` flag:
```bash
tokmd export --profile llm > context.jsonl
```
See the [CLI Reference](docs/reference-cli.md#configuration-file) for full configuration options.
## Installation
### Nix (recommended)
```bash
# Run without installing
nix run github:EffortlessMetrics/tokmd -- --version
# Install to your profile
nix profile install github:EffortlessMetrics/tokmd
# Build from source
nix build
```
### Homebrew (macOS/Linux)
```bash
brew tap EffortlessMetrics/tap
brew install tokmd
```
### Arch Linux (AUR)
```bash
# With an AUR helper like yay or paru
yay -S tokmd
# Or manually
git clone https://aur.archlinux.org/tokmd.git
cd tokmd
makepkg -si
```
### Scoop (Windows)
```bash
scoop bucket add effortlessmetrics https://github.com/EffortlessMetrics/scoop-bucket
scoop install tokmd
```
### WinGet (Windows)
```bash
winget install EffortlessMetrics.tokmd
```
### From crates.io
```bash
cargo install tokmd
```
### GitHub Action
```yaml
- uses: EffortlessMetrics/tokmd@v1
  with:
    paths: "."
```
### Language Bindings (Build from Source)
Native FFI bindings for CI pipelines and tooling (not yet published to package registries):
- **Python**: `cd crates/tokmd-python && maturin develop` — `tokmd.lang()`, `tokmd.analyze()`, etc.
- **Node.js**: `cd crates/tokmd-node && npm run build` — async API with Promises
## Documentation
- [Tutorial](docs/tutorial.md) — Getting started guide
- [Recipes](docs/recipes.md) — Real-world usage examples
- [CLI Reference](docs/reference-cli.md) — All flags and options
- [Schema](docs/SCHEMA.md) — Receipt format documentation
- [Troubleshooting](docs/troubleshooting.md) — Common issues and solutions
- [Philosophy](docs/explanation.md) — Design principles
- [tokmd responsibilities](tokmd-role.md) - tokmd's position in the sensors -> receipts -> cockpit stack
- [Contributing](CONTRIBUTING.md) — Development setup and guidelines
- [Roadmap](ROADMAP.md) — Project status and future plans
## License
MIT or Apache-2.0.