{"id":50335681,"url":"https://github.com/mwmdev/dictr","last_synced_at":"2026-06-10T23:00:45.402Z","repository":{"id":338881378,"uuid":"1158761981","full_name":"mwmdev/dictr","owner":"mwmdev","description":"Push-to-talk voice dictation for Linux / X11","archived":false,"fork":false,"pushed_at":"2026-06-10T21:11:05.000Z","size":157,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-10T22:09:00.472Z","etag":null,"topics":["cli","dictation","dictation-tool","linux","local","push-to-talk","rust","speech-to-text","stt","whisper","x11"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mwmdev.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-15T21:50:07.000Z","updated_at":"2026-06-10T21:11:09.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mwmdev/dictr","commit_stats":null,"previous_names":["mwmdev/dictr"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/mwmdev/dictr","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mwmdev%2Fdictr","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mwmdev%2Fdictr/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mwmdev%2Fdictr/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/mwmdev%2Fdictr/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mwmdev","download_url":"https://codeload.github.com/mwmdev/dictr/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mwmdev%2Fdictr/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34174148,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-10T02:00:07.152Z","response_time":89,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","dictation","dictation-tool","linux","local","push-to-talk","rust","speech-to-text","stt","whisper","x11"],"created_at":"2026-05-29T13:30:38.031Z","updated_at":"2026-06-10T23:00:45.392Z","avatar_url":"https://github.com/mwmdev.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# dictr\n\n[![CI](https://github.com/mwmdev/dictr/actions/workflows/ci.yml/badge.svg)](https://github.com/mwmdev/dictr/actions/workflows/ci.yml)\n[![Crates.io](https://img.shields.io/crates/v/dictr)](https://crates.io/crates/dictr)\n[![License](https://img.shields.io/crates/l/dictr)](LICENSE-MIT)\n\nPush-to-talk voice dictation for Linux.\n\nSingle binary - Private - Fast - Customizable\n\n## Features\n- **Push-to-talk** — hold a hotkey to record, release to transcribe and paste\n- **Local inference** — runs 
[Whisper](https://github.com/ggerganov/whisper.cpp) locally; your audio never leaves your machine\n- **CUDA GPU acceleration** — optional NVIDIA GPU support for sub-second transcription\n- **OpenAI API fallback** — use the OpenAI Whisper API as an alternative backend\n- **Text replacements** — custom substitution rules applied to the transcribed text\n- **File transcription** — transcribe audio files directly via `--file` (any format ffmpeg supports)\n\n## Usage\n\n```sh\ndictr                          # Default: AltGr hotkey, local whisper, safe clipboard paste\ndictr --hotkey F9              # Use F9 instead of AltGr\ndictr --backend api            # Use OpenAI Whisper API (requires OPENAI_API_KEY)\ndictr --api-url http://...     # Custom API endpoint\ndictr --model /path/to/model   # Specific model file\ndictr --paste                  # Force clipboard paste output\ndictr --type                   # Use xdotool typing instead of paste\ndictr --device AT2020          # Select mic by name substring\ndictr --list-devices           # List available input devices\ndictr --language fr            # Transcribe in French\ndictr --initial-prompt '...'   
# Guide transcription with context\ndictr --min-duration 500       # Min recording duration in ms (default: 300)\ndictr --file recording.ogg     # Transcribe an audio file (requires ffmpeg)\ndictr --verbose                # Debug output\n```\n\n## Install\n\n### Interactive installer\n\n```sh\ncurl -fsSL https://raw.githubusercontent.com/mwmdev/dictr/main/install.sh | sh\n```\n\n### Cargo\n\n```sh\ncargo install dictr\n```\n\nThen download a [Whisper model](https://huggingface.co/ggerganov/whisper.cpp/tree/main) to `~/.local/share/dictr/models/`.\n\n### NixOS\n\nFrom a source checkout:\n\n```sh\nnix profile add .#dictr-cpu   # CPU build\nnvidia-smi --query-gpu=compute_cap --format=csv,noheader\nnix profile add .#dictr-cuda-sm120  # CUDA build for compute capability 12.0\n```\n\nCUDA packages are named `dictr-cuda-smXX`, where `XX` is the compute capability\nwithout the dot. For example, compute capability `8.9` uses `dictr-cuda-sm89`.\nRun `nix flake show` to list supported CUDA targets. The CUDA package keeps the\nCUDA runtime libraries in the Nix closure, so it does not need `LD_LIBRARY_PATH`\nwrappers.\n\n### Build from source\n\nRequires Linux with X11, `xdotool`, `xclip`, ALSA or PipeWire. Optional: `ffmpeg` (for `--file`). Build deps: `cmake`, `clang`, `pkg-config`, `libasound2-dev`, `libx11-dev`, `libxi-dev`, `libxtst-dev`, `libxrandr-dev`, `libssl-dev`. 
For CUDA: NVIDIA CUDA toolkit.\n\n```sh\ncargo build --release                  # CPU only\ncargo build --release --features cuda  # With GPU\n```\n\nOn NixOS, use `nix-shell --run \"cargo build --release\"` for CPU builds, or\n`nix-shell --argstr cudaArch 120 --run \"cargo build --release --features cuda\"`\nfor CUDA builds.\n\n## Configuration\n\n`~/.config/dictr/config.toml`:\n\n```toml\nhotkey = \"AltGr\"                 # Supported hotkeys: AltGr, Alt, Ctrl, RCtrl, Shift, RShift, Super, CapsLock, Space, Escape, F1-F12\nbackend = \"local\"                # \"local\" or \"api\"\nmodel_path = \"~/.local/share/dictr/models/ggml-base.bin\"\napi_key = \"\"                     # or set OPENAI_API_KEY env var\napi_url = \"https://api.openai.com/v1/audio/transcriptions\"\noutput_mode = \"paste\"            # \"paste\" or \"type\"; paste is layout-safe\ntyping_delay_ms = 2\nmin_duration_ms = 300\ndevice = \"AT2020USB+\"\nlanguage = \"en\"\ninitial_prompt = \"commit, readme, build, test, deploy, refactor\" # Guide transcription with context (e.g. expected words, domain-specific terms)\n\n[replacements]\n\"slash \" = \"/\"\n\"new line\" = \"\\n\"\n```\n\n### Text replacements\n\nThe `[replacements]` table performs substitution on transcription output. Useful for special cases like \"slash\" → \"/\" or \"new line\" → \"\\n\". Keys are replaced with their corresponding values in the final transcribed text. \n\n### Text insertion\n\nThe default `output_mode = \"paste\"` inserts text through the clipboard and then\nrestores the previous clipboard contents. This avoids keyboard layout issues\nwith simulated typing, such as QWERTY/AZERTY `a`/`q` and `w`/`z` swaps. 
Use\n`--type` or `output_mode = \"type\"` only if you specifically need the older\n`xdotool type` behavior.\n\n## License\n\nLicensed under either of [MIT](LICENSE-MIT) or [Apache-2.0](LICENSE-APACHE) at your option.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmwmdev%2Fdictr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmwmdev%2Fdictr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmwmdev%2Fdictr/lists"}