<div align="center">

# writer -- Local AI that writes in your voice

**A CLI that fine-tunes a small local model on your own writing, so the text it generates sounds like you wrote it. Not like ChatGPT. Not like Claude. You.**

<br />

[![Star this repo](https://img.shields.io/github/stars/199-biotechnologies/writer?style=for-the-badge&logo=github&label=%E2%AD%90%20Star%20this%20repo&color=yellow)](https://github.com/199-biotechnologies/writer/stargazers)
&nbsp;&nbsp;
[![Follow @longevityboris](https://img.shields.io/badge/Follow_%40longevityboris-000000?style=for-the-badge&logo=x&logoColor=white)](https://x.com/longevityboris)

<br />

[![Rust](https://img.shields.io/badge/Rust-000000?style=for-the-badge&logo=rust&logoColor=white)](https://www.rust-lang.org/)
[![MSRV 1.85+](https://img.shields.io/badge/MSRV-1.85%2B-orange?style=for-the-badge)](https://www.rust-lang.org/)
[![MIT License](https://img.shields.io/badge/License-MIT-blue?style=for-the-badge)](LICENSE)
[![Local First](https://img.shields.io/badge/Local-First-brightgreen?style=for-the-badge)](#privacy)
[![PRs Welcome](https://img.shields.io/badge/PRs-Welcome-brightgreen?style=for-the-badge)](CONTRIBUTING.md)

---

Every AI writing tool sounds like every other AI writing tool. Same em-dashes. Same "Moreover, it is important to note that." Same rule-of-three bullet points. Same sycophantic open, same tidy bow at the end. The reason is simple: they all trained on the same pile of web text, so they all fall into the same attractor. `writer` goes the other way. You hand it a folder of your own writing and it fine-tunes a small model on your voice -- your vocabulary, your rhythm, your typos, your habits of punctuation. The output sounds like you because the model learned from you.

The model runs locally via [Ollama](https://ollama.com/) on Apple Silicon. Default base model is **Gemma 4 26B-A4B** (3.8B active MoE, 65-75 tok/s on M4 Max). Your writing samples never leave your laptop. No OpenAI. No Anthropic. No "we use your data to improve our services."

[Why this exists](#why-this-exists) | [Install](#install) | [Quick start](#quick-start) | [How it works](#how-it-works) | [Quality stack](#quality-stack) | [Commands](#commands) | [Privacy](#privacy)

</div>

---

## Why this exists

LLMs made text generation better and language worse. The logic improved. The prose collapsed. Every model trained on the same web-scale corpus converges on the same voice -- the same em-dashes, the same "Moreover, it is important to note that," the same rule-of-three bullet points, the same careful neutrality that offends no one and moves no one. A billion people now write with the same vocabulary, the same rhythm, the same structural habits. Language used to be rich. Now it is uniform.

The humanisation industry exists because the problem is that obvious. But humanising after the fact is a patch. You fight the model's entire training distribution on every call. And if everyone humanises the same way, the humanised output becomes the new slop -- a second layer of sameness on top of the first.

The real fix is not to disguise AI text as human. It is to bring back diversity. To capture the fingerprint of a specific voice -- yours, or a writer you admire -- and generate from *that* distribution instead of the generic one. Douglas Adams did not write like Hemingway. Hemingway did not write like Woolf. The richness of language was never about correctness. It was about difference.

That is what `writer` does. Feed it a corpus of text in a specific voice. It extracts a stylometric fingerprint -- sentence rhythm, vocabulary preferences, punctuation habits, the words the writer reaches for and the words they never use. It fine-tunes a small local model on that fingerprint. From then on, every `writer write "..."` call is a sample from *that* distribution. Not generic. Not humanised-generic. Distinctive.

You can have multiple profiles. One trained on your own blog posts. One trained on Adams. One on your company's house style. Switch between them with `writer profile use <name>`. The model stays small, runs locally, and your writing samples never leave your machine.

## Install

`writer` is a single static Rust binary. Pick one:

```bash
# From source (works today)
cargo install --git https://github.com/199-biotechnologies/writer

# From crates.io (package name is `writer-cli` because `writer` is squatted)
cargo install writer-cli

# Homebrew (planned)
brew install 199-biotechnologies/tap/writer
```

After install, the binary on your PATH is `writer`. You also need [Ollama](https://ollama.com/) and Python with [mlx-lm](https://github.com/ml-explore/mlx-lm) (for training and adapter inference on Apple Silicon):

```bash
brew install ollama && brew services start ollama
ollama pull gemma4:26b  # default base model (17 GB, MoE with 3.8B active params)
pip install mlx-lm      # for LoRA training and adapter-based generation
```

## Quick start

```bash
# 1. Create config + data directories and the default voice profile
writer init

# 2. Feed it your writing — Obsidian vaults, markdown, plain text
writer learn ~/Library/Mobile\ Documents/iCloud~md~obsidian/Documents/MyVault/
writer learn ~/Documents/drafts/*.md

# 3. Inspect the stylometric fingerprint
writer profile show

# 4. Fine-tune a LoRA adapter on your samples
writer train

# 5. Generate text in your voice
writer write "an essay about why I stopped using Twitter"
writer rewrite draft.md > revised.md
```

## How it works

```
┌────────────────────────────────────────────────────────────────────────┐
│                                                                        │
│   your writing    stylometric    LoRA       quality        text that   │
│   samples    -->  fingerprint  + fine   --> decoding   --> sounds      │
│   (md, txt,       (9 feature    tune       pipeline       like you     │
│    obsidian)      categories)   (SFT       (logit bias,                │
│                                 + DPO)     rank-N,                     │
│                                            contrastive)                │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
```

1. **Ingest** -- `writer learn` walks your directories, detects format (Markdown, Obsidian vault, plain text), strips YAML front matter, wikilinks, signatures, tracking params, and normalises the content. Deduplicates by content hash. Chunks long samples respecting paragraph boundaries.

2. **Fingerprint** -- Computes a 9-category stylometric profile grounded in authorship attribution research:
   - Word/sentence/paragraph length distributions
   - Function word frequency (200-word list, top signal per PAN shared tasks 2011-2025)
   - Character n-gram profiles (3-grams, per PAN evaluation data)
   - Punctuation patterns (em-dashes are the strongest AI tell)
   - Readability metrics (Flesch-Kincaid, Coleman-Liau, ARI)
   - Vocabulary richness (Yule's K, hapax legomena, Simpson's D)
   - AI-slop detection (70 banned words + 30 banned phrases)

3. **Train** -- LoRA fine-tuning via [mlx-tune](https://github.com/ARahim3/mlx-tune) on Apple Silicon. Optional DPO against AI-rewrites of your own samples (teaches the model what NOT to do).

4. **Generate** -- Multi-candidate generation via Ollama, ranked by stylometric distance to your fingerprint. Logit bias suppresses AI-tell words. Post-hoc filter rejects outputs that fail structural checks.

## Quality stack

Quality is opt-out, not opt-in. Every `writer write` runs the full stack:

| Layer | What it does | Reference |
|---|---|---|
| **Logit bias** | Suppress AI-tell words, boost user's preferred vocabulary | Writeprints (Abbasi & Chen, 2008) |
| **Generate-N-rank** | Generate 8 candidates, rank by stylometric distance | PAN authorship verification |
| **Post-hoc filter** | Hard-reject outputs with banned words or wrong rhythm | Custom |
| **Contrastive decoding** | Subtract base model logits from fine-tuned | CoPe (EMNLP 2025) |
| **DPO training** | Train against AI rewrites of user's own samples | Rafailov et al. (2023) |
| **Benchmark loop** | VoiceFidelity + SlopScore + CreativeWritingRubric | Custom autoresearch |

Stylometric features validated against:
- PAN Shared Tasks on Authorship Verification (2011-2025) -- function words + char n-grams as top discriminators
- Writeprints (Abbasi & Chen, ACM TOIS 2008) -- hapax legomena, POS patterns
- Yule's K (1944), Simpson's D (1949) -- vocabulary richness measures
- "Layered Insights" (EMNLP 2025) -- hybrid interpretable + neural outperforms either alone
- LUAR (Rivera-Soto et al., EMNLP 2021) -- neural authorship embeddings (planned)

## Commands

```
writer init                          First-time setup
writer learn <files>                 Ingest writing samples (md, txt, Obsidian vaults)
writer profile show                  Show the active profile's fingerprint
writer profile list                  List all profiles
writer profile new <name>            Create a new profile
writer profile use <name>            Switch active profile
writer train [--profile <name>]      Fine-tune a LoRA adapter
writer write "<prompt>"              Generate in the active voice
  --max-tokens 2048                    Control output length
  -n 4                                 Generate N candidates, return best
  -v                                   Include distance/model/timing in JSON
writer rewrite <file> [--in-place]   Rewrite a file in the active voice
writer model list                    List available base models
writer model pull <name>             Download a base model
writer config show                   Show effective configuration
writer config path                   Show config file path
writer agent-info                    Machine-readable capability manifest
writer skill install                 Install SKILL.md to agent platforms
writer update [--check]              Self-update from GitHub Releases
```

Every command accepts `--json` and `--quiet`. Default `--json` output for `writer write` is just `{"text": "..."}` — clean for piping and agent consumption. Use `-v` for detailed metrics.

## Results

Case study: Douglas Adams' complete works (684K words, 9 books, Gemma 4 26B + LoRA 500 steps).

| Condition | Stylometric Distance | What it means |
|-----------|---------------------|---------------|
| Held-out Adams text | 0.121 | Ceiling — real Adams scored against Adams fingerprint |
| **writer** (LoRA + ranked) | 0.241 | Best generated output |
| Base model (no adapter) | 0.306 | Generic Gemma 4 output |

The system closes 35% of the gap between generic LLM output and the target voice. See the [research paper](docs/paper/draft-v1.md) for full methodology and analysis.

## Agent-friendly by default

`writer` is built on the [agent-cli-framework](https://github.com/199-biotechnologies/agent-cli-framework). AI agents discover it with `writer agent-info`, learn your voice with `writer learn`, and draft in your voice from then on.

## Privacy

Everything stays local. That is the whole point.

| What | Where | Leaves your machine? |
|---|---|---|
| Writing samples | `~/Library/Application Support/writer/profiles/<name>/samples/` | No |
| Stylometric fingerprint | Computed in-process | No |
| Base model weights | Managed by Ollama | Only during initial download |
| Fine-tuned adapter | `profiles/<name>/adapters/adapters.safetensors` | No |
| Generated text | stdout | No |
| Telemetry | None | -- |

No API keys. No accounts. No upload step. You can run `writer` on an airplane.

## Configuration

`~/Library/Application Support/writer/config.toml`:

```toml
active_profile = "default"
base_model = "google/gemma-4-26b"

[inference]
backend = "ollama"             # ollama for base, mlx auto-selected when adapter present
temperature = 0.7
max_tokens = 4096
ollama_url = "http://localhost:11434"

[decoding]
n_candidates = 2               # generate N, return stylometrically closest
max_tokens = 4096              # Gemma 4 needs >=4096 (thinking model)
contrastive_enabled = true
contrastive_alpha = 0.3
banned_word_bias = -4.0

[training]
backend = "mlx-tune"
rank = 16
alpha = 32.0
learning_rate = 0.00002        # conservative for 26B models
batch_size = 1
max_steps = 500
max_seq_len = 2048
```

Precedence: compiled defaults < TOML file < env vars (`WRITER_*`).

## Contributing

Pull requests welcome. Read `AGENTS.md` for the rules (it is short). Keep the framework contract, never add interactive prompts, every error needs a `suggestion`, and `cargo test` must pass.

## License

MIT. See [LICENSE](LICENSE).

---

<div align="center">

Built by [Boris Djordjevic](https://github.com/longevityboris) at [199 Biotechnologies](https://github.com/199-biotechnologies)

<br />

**If this is useful to you:**

[![Star this repo](https://img.shields.io/github/stars/199-biotechnologies/writer?style=for-the-badge&logo=github&label=%E2%AD%90%20Star%20this%20repo&color=yellow)](https://github.com/199-biotechnologies/writer/stargazers)
&nbsp;&nbsp;
[![Follow @longevityboris](https://img.shields.io/badge/Follow_%40longevityboris-000000?style=for-the-badge&logo=x&logoColor=white)](https://x.com/longevityboris)

</div>
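
## Appendix: sketch of generate-N-rank

The generate-N-rank layer described under "Quality stack" is simple at heart: score each candidate by the distance between its stylometric feature vector and the target fingerprint, and keep the closest. The Rust sketch below is illustrative only, not the project's code: it uses a toy 3-feature fingerprint (mean word length, words per sentence, commas per word) and Euclidean distance, and every name in it is hypothetical; the real pipeline uses a 9-category profile and candidates generated by Ollama.

```rust
// Hypothetical sketch of the generate-N-rank idea. Not writer's actual code:
// a toy 3-feature fingerprint stands in for the real 9-category profile.

/// Toy fingerprint: [mean word length, words per sentence, commas per word].
fn fingerprint(text: &str) -> [f64; 3] {
    let words: Vec<&str> = text.split_whitespace().collect();
    let n_words = words.len().max(1) as f64;
    let mean_word_len =
        words.iter().map(|w| w.chars().count()).sum::<usize>() as f64 / n_words;
    // Approximate sentence count by terminal punctuation.
    let n_sentences =
        text.chars().filter(|c| matches!(c, '.' | '!' | '?')).count().max(1) as f64;
    let commas = text.matches(',').count() as f64;
    [mean_word_len, n_words / n_sentences, commas / n_words]
}

/// Euclidean distance between two feature vectors.
fn distance(a: &[f64; 3], b: &[f64; 3]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

/// Rank candidates by stylometric distance to the target; return the closest.
fn pick_best<'a>(target: &[f64; 3], candidates: &[&'a str]) -> &'a str {
    candidates
        .iter()
        .copied()
        .min_by(|a, b| {
            distance(target, &fingerprint(a)).total_cmp(&distance(target, &fingerprint(b)))
        })
        .expect("at least one candidate")
}

fn main() {
    // Target voice: short words, short sentences.
    let target = fingerprint("Short words. Short sentences. No fluff.");
    let candidates = [
        "Moreover, it is important to note that brevity is fundamentally transformative.",
        "Short words win. Cut the fluff. Stop there.",
    ];
    // The short, punchy candidate sits closer to the target fingerprint.
    println!("{}", pick_best(&target, &candidates));
}
```

In the real pipeline the candidates come from the model and the fingerprint from `writer learn`; here both are inlined so the sketch runs standalone.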
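
The post-hoc filter layer is even easier to picture: a hard reject on any candidate containing a banned AI-tell word or phrase. Again a hypothetical sketch under stated assumptions, not the project's implementation (the real filter also runs structural and rhythm checks, and the banned list has 70 words and 30 phrases):

```rust
// Hypothetical sketch of the post-hoc filter layer: hard-reject any candidate
// containing a banned word or phrase (case-insensitive substring match).
fn passes_filter(candidate: &str, banned: &[&str]) -> bool {
    let lower = candidate.to_lowercase();
    banned.iter().all(|phrase| !lower.contains(phrase))
}

fn main() {
    // Tiny stand-in for the real banned-word/phrase list.
    let banned = ["moreover", "it is important to note", "delve"];
    assert!(!passes_filter("Moreover, we must delve deeper.", &banned));
    assert!(passes_filter("We dug in deeper.", &banned));
    println!("filter ok");
}
```

Candidates that fail this check are discarded before ranking, so only outputs clear of the banned vocabulary compete on stylometric distance.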