{"id":50884139,"url":"https://github.com/claylo/colophon","last_synced_at":"2026-06-15T15:03:00.267Z","repository":{"id":345635168,"uuid":"1177317881","full_name":"claylo/colophon","owner":"claylo","description":"Because your book deserves a rear end worth looking at. 🍑","archived":false,"fork":false,"pushed_at":"2026-04-03T01:26:38.000Z","size":2158,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-03T11:24:23.232Z","etag":null,"topics":["book-index","glossary-generator","keyword-extraction","publishing","technical-writing","typst"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/claylo.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null},"funding":{"github":"lovelesslabs","buy_me_a_coffee":"claylo"}},"created_at":"2026-03-09T23:06:38.000Z","updated_at":"2026-04-03T01:26:43.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/claylo/colophon","commit_stats":null,"previous_names":["claylo/colophon"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/claylo/colophon","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claylo%2Fcolophon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claylo%2Fcolophon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claylo%2Fcolophon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claylo%2Fcolophon/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/claylo","download_url":"https://codeload.github.com/claylo/colophon/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/claylo%2Fcolophon/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34367696,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-15T02:00:07.085Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["book-index","glossary-generator","keyword-extraction","publishing","technical-writing","typst"],"created_at":"2026-06-15T15:02:59.276Z","updated_at":"2026-06-15T15:03:00.257Z","avatar_url":"https://github.com/claylo.png","language":"Rust","funding_links":["https://github.com/sponsors/lovelesslabs","https://buymeacoffee.com/claylo"],"categories":[],"sub_categories":[],"readme":"# colophon\n\n[![CI](https://github.com/claylo/colophon/actions/workflows/ci.yml/badge.svg)](https://github.com/claylo/colophon/actions/workflows/ci.yml)\n[![Crates.io](https://img.shields.io/crates/v/colophon.svg)](https://crates.io/crates/colophon)\n[![docs.rs](https://docs.rs/colophon/badge.svg)](https://docs.rs/colophon)\n[![MSRV](https://img.shields.io/badge/MSRV-1.89.0-blue.svg)](https://github.com/claylo/colophon)\n\nGenerate back-of-book indexes and glossaries from Markdown or Typst source files. Three commands take you from raw manuscript to print-ready index:\n\n```\ncolophon extract --dir ./chapters\ncolophon curate\ncolophon render --terms colophon-terms.yaml --glossary -o ./output\n```\n\n**Extract** scans your documents for keyword candidates using YAKE and TF-IDF. **Curate** sends those candidates to Claude for professional indexing judgment — merging synonyms, building hierarchy, writing definitions. **Render** inserts index markers and generates a glossary in your source format.\n\n## How It Works\n\n```\n  ┌─────────┐      ┌────────┐      ┌────────┐\n  │ extract │ ──── │ curate │ ──── │ render │\n  └─────────┘      └────────┘      └────────┘\n    .md/.typ    candidates.yaml   terms.yaml    annotated files\n    ────────\u003e   ────────────────\u003e ──────────\u003e   + glossary.typ\n```\n\n1. **Extract** walks your source directory, parses Markdown (pulldown-cmark) and Typst (typst-syntax AST), runs YAKE keyword extraction per document, then TF-IDF across the corpus. Outputs `colophon-candidates.yaml`.\n\n2. **Curate** sends candidates to Claude via the `claude` CLI. Claude acts as a professional book indexer: keeps terms worth looking up, kills noise, merges synonyms into canonical entries with aliases, builds parent/child hierarchy, writes glossary definitions, and identifies which files discuss each term substantively. Outputs `colophon-terms.yaml`.\n\n3. **Render** reads the curated terms and inserts format-specific index markers into your source files. For Typst, it generates `#index[term]` and `#index-main[term]` markers (compatible with the [in-dexter](https://typst.app/universe/package/in-dexter/) package) and a `glossary.typ` with linked cross-references.\n\n### Incremental Curate\n\nAfter your first full curate run, subsequent runs are incremental by default. Colophon diffs fresh candidates against your existing `colophon-terms.yaml`:\n\n- **No new terms?** Mechanical location refresh only. Zero cost, sub-second.\n- **New terms found?** Sends only the new candidates to Claude with a compact summary of the existing index. Typical cost: ~$0.50 vs ~$8 for a full rebuild.\n\nForce a full rebuild any time with `--full-rebuild`.\n\n## Installation\n\n### From Source\n\n```bash\ncargo install colophon\n```\n\n### Homebrew (macOS and Linux)\n\n```bash\nbrew install claylo/tap/colophon\n```\n\n### Pre-built Binaries\n\nDownload from the [releases page](https://github.com/claylo/colophon/releases). Binaries are available for macOS (Apple Silicon / Intel), Linux (x86_64 / ARM64), and Windows.\n\n### Requirements\n\n- The `curate` command requires the [Claude CLI](https://claude.com/claude-code) (`claude`) in your PATH.\n- Typst rendering uses [in-dexter](https://typst.app/universe/package/in-dexter/) markers. Add `#import \"@preview/in-dexter:0.6.1\": *` to your document.\n\n## Quick Start\n\nCreate a config file in your project root:\n\n```yaml\n# .colophon.yaml\nsource:\n  dir: ./chapters\n  extensions: [typ]\n```\n\nRun the pipeline:\n\n```bash\n# 1. Extract keyword candidates\ncolophon extract\n\n# 2. Curate with Claude (estimate cost first)\ncolophon curate --dry-run\ncolophon curate\n\n# 3. Render index markers and glossary\ncolophon render --glossary -o ./output\n```\n\nAfter editing your manuscript, run it again. The curate step auto-detects the existing terms file and runs incrementally:\n\n```bash\ncolophon extract\ncolophon curate        # incremental — only new terms sent to Claude\ncolophon render --glossary -o ./output\n```\n\n## Commands\n\n### extract\n\nScan source files and produce keyword candidates:\n\n```bash\ncolophon extract --dir ./chapters -o candidates.yaml\n```\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `-d, --dir` | config `source.dir` | Source directory to scan |\n| `-o, --output` | `colophon-candidates.yaml` | Output file |\n| `--json` | | Output as JSON instead of YAML |\n\nExtraction parameters (n-gram range, min score, max candidates, stop words, exclude terms, known terms) are configured in your config file. See `config/colophon.yaml.example` for all options.\n\n### curate\n\nSend candidates to Claude for professional indexing:\n\n```bash\n# Estimate cost without invoking Claude\ncolophon curate --dry-run\n\n# Run with Opus for best quality\ncolophon curate -m opus\n\n# Set a budget ceiling\ncolophon curate --max-budget-usd 10\n```\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--candidates` | `colophon-candidates.yaml` | Path to candidates file |\n| `-m, --model` | config `curate.model` | Claude model (`sonnet`, `opus`, `haiku`) |\n| `--dry-run` | | Estimate cost and exit |\n| `--max-budget-usd` | | Abort if estimated cost exceeds this |\n| `--full-rebuild` | | Force full rebuild (skip incremental) |\n| `--full` | | Send full YAML with context snippets |\n| `-o, --output-dir` | `.` | Where to write `colophon-terms.yaml` |\n\n**Incremental mode** activates automatically when `colophon-terms.yaml` exists in the output directory. It re-extracts candidates locally, diffs against existing terms, and sends only new candidates to Claude. Use `--full-rebuild` to bypass this.\n\nPass extra flags to the `claude` CLI after `--`:\n\n```bash\ncolophon curate -- --max-turns 3\n```\n\n### render\n\nInsert index markers and generate a glossary:\n\n```bash\n# Annotate source files with index markers\ncolophon render --terms colophon-terms.yaml -o ./output\n\n# Also generate a glossary\ncolophon render --terms colophon-terms.yaml --glossary -o ./output\n\n# Only mark substantive discussions (bold page numbers)\ncolophon render --terms colophon-terms.yaml --main-only -o ./output\n```\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `--terms` | `colophon-terms.yaml` | Curated terms file |\n| `-d, --dir` | from terms file | Source directory |\n| `-o, --output-dir` | `.` | Where to write annotated files |\n| `--format` | `typst` | Output format |\n| `--glossary` | | Also emit `glossary.typ` |\n| `--main-only` | | Only mark `main: true` locations |\n| `--glossary-spacing` | | Gap between entries (e.g., `12pt`) |\n\nFor Typst output, render produces:\n- `#index[term]` markers at term locations\n- `#index-main[term]` for substantive discussions\n- `#index(\"parent\", \"child\")` for hierarchical entries\n- A `glossary.typ` with code-mode `#terms()` entries, label anchors, and linked cross-references\n\n## Configuration\n\nConfig files are discovered automatically (highest precedence first):\n\n1. `.colophon.\u003cext\u003e` in current directory or any parent\n2. `colophon.\u003cext\u003e` in current directory or any parent\n3. `~/.config/colophon/config.\u003cext\u003e` (user config)\n\nSupported formats: TOML, YAML, JSON. Search stops at `.git` boundaries.\n\nSee [`config/colophon.yaml.example`](config/colophon.yaml.example) for all available options with documentation.\n\n### Minimal Config\n\n```yaml\nsource:\n  dir: ./chapters\n  extensions: [typ]\n\ncurate:\n  model: opus\n  max_budget_usd: 10.0\n```\n\n## Global Options\n\nEvery command accepts these flags:\n\n| Flag | Description |\n|------|-------------|\n| `-v, --verbose` | More detail (repeatable: `-vv` for trace) |\n| `-q, --quiet` | Only print errors |\n| `--json` | Machine-readable JSON output |\n| `-C, --chdir` | Run as if started in a different directory |\n| `-c, --config` | Explicit config file path |\n\n## Development\n\n```\ncrates/\n  colophon/         # CLI binary (thin shell)\n  colophon-core/    # Shared library (extract, curate, render)\n  xtask/            # Dev automation\n```\n\nPrerequisites: Rust 1.89.0+, [just](https://github.com/casey/just), [cargo-nextest](https://nexte.st/)\n\n```bash\njust check          # fmt + clippy + deny + test + doc-test\njust test           # cargo nextest run\njust clippy         # lint\n```\n\nSee [AGENTS.md](AGENTS.md) for development conventions.\n\n## License\n\nLicensed under either of:\n\n- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))\n- MIT license ([LICENSE-MIT](LICENSE-MIT))\n\nat your option.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclaylo%2Fcolophon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclaylo%2Fcolophon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclaylo%2Fcolophon/lists"}