https://github.com/jguida941/voiceterm
Voice-first HUD overlay for AI CLIs (Codex + Claude) with local Whisper STT, PTY passthrough, and customizable terminal UI.
- Host: GitHub
- URL: https://github.com/jguida941/voiceterm
- Owner: jguida941
- License: mit
- Created: 2025-11-06T05:21:21.000Z (7 months ago)
- Default Branch: master
- Last Pushed: 2026-02-22T14:34:42.000Z (3 months ago)
- Last Synced: 2026-02-22T19:21:42.286Z (3 months ago)
- Topics: ai-cli, claude, codex, developer-tools, hands-free, linux, macos, openai-whisper, productivity, ratatui, rust, speech-to-text, stt, terminal, tui, voice, voice-control, voice-input, voiceterm, whisper
- Language: Rust
- Homepage:
- Size: 9.92 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: .github/SECURITY.md
- Agents: AGENTS.md
VoiceTerm is a voice-first terminal overlay for Codex, Claude, and Gemini.
It runs Whisper on your machine and types what you say into your existing CLI.
Your tools still run in a normal PTY; VoiceTerm just adds a HUD on top.
Use push-to-talk or wake phrases (`hey codex`, `hey claude`), then say
`send` / `submit` for hands-free delivery.
Whisper runs locally by default. No cloud API keys required.
Release history: [dev/CHANGELOG.md](dev/CHANGELOG.md).
## Quick Nav
- [Hands-Free Quick Start](#hands-free-quick-start)
- [Install and Start](#install-and-start)
- [Requirements](#requirements)
- [Features](#features)
- [Supported Backends](#supported-ai-clis)
- [Controls](#controls)
- [Guides Index](guides/README.md)
- [Documentation](#documentation)
- [Support](#support)
## Install and Start
Install one supported AI CLI first:
**Codex:**
```bash
npm install -g @openai/codex
```
**Claude Code:**
```bash
curl -fsSL https://claude.ai/install.sh | bash
```
Then choose one VoiceTerm setup path:
**Homebrew (recommended):**
```bash
brew tap jguida941/voiceterm
brew install voiceterm
cd ~/your-project
voiceterm
```
If needed, authenticate once:
```bash
voiceterm --login --codex
voiceterm --login --claude
```
**PyPI (pipx / pip):**
```bash
pipx install voiceterm
# or: python3 -m pip install --user voiceterm
cd ~/your-project
voiceterm
```
If needed, authenticate once:
```bash
voiceterm --login --codex
voiceterm --login --claude
```
**From source:**
Requires the Rust toolchain. See the [Install Guide](guides/INSTALL.md) for details.
```bash
git clone https://github.com/jguida941/voiceterm.git
cd voiceterm
./scripts/install.sh
```
If you are running from source while developing, run:
```bash
python3 dev/scripts/devctl.py check --profile ci
```
**macOS App:**
Double-click `app/macos/VoiceTerm.app`, pick a folder, and it opens Terminal
with VoiceTerm running.
For model options and startup/IDE tuning:
- [Install Guide](guides/INSTALL.md)
- [Whisper docs](guides/WHISPER.md)
- [Troubleshooting](guides/TROUBLESHOOTING.md)
## How It Works
VoiceTerm listens to your mic, converts speech to text on your machine, and
types the result into your AI CLI input.
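
The flow above can be pictured as "audio in, text out, text written to the CLI". The following is a minimal sketch of that idea, not VoiceTerm's actual code: the `Transcriber` trait, `forward_speech`, and `FixedTranscriber` are hypothetical names, and any `Write` value stands in for the input side of the PTY. It also assumes the text is inserted without pressing Enter.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a local speech-to-text engine such as Whisper.
pub trait Transcriber {
    /// Returns the recognised text for one chunk of audio, if any.
    fn transcribe(&mut self, samples: &[f32]) -> Option<String>;
}

/// Transcribe one chunk of audio and "type" the text into the CLI input.
/// `cli_input` stands in for the write side of the PTY (assumption).
/// Returns Ok(true) when text was written, Ok(false) when there was nothing to type.
pub fn forward_speech<T: Transcriber, W: Write>(
    stt: &mut T,
    samples: &[f32],
    cli_input: &mut W,
) -> io::Result<bool> {
    match stt.transcribe(samples) {
        Some(text) if !text.trim().is_empty() => {
            // Write the trimmed transcript; no newline, so nothing is submitted.
            cli_input.write_all(text.trim().as_bytes())?;
            cli_input.flush()?;
            Ok(true)
        }
        _ => Ok(false),
    }
}

/// Test double that always returns the same transcript.
pub struct FixedTranscriber(pub &'static str);

impl Transcriber for FixedTranscriber {
    fn transcribe(&mut self, _samples: &[f32]) -> Option<String> {
        Some(self.0.to_string())
    }
}
```

Passing a `Vec<u8>` as `cli_input` makes the sketch easy to try without a real terminal.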

## Requirements
- macOS or Linux (Windows needs WSL2)
- Microphone access
- ~0.5 GB disk for the default small model (base is ~142 MB, medium is ~1.5 GB)
## Features
### Main features
| Feature | What it does |
|---------|---------------|
| **Local speech-to-text** | Whisper runs on your machine (no cloud calls) |
| **Fast voice-to-text** | Local Whisper turns speech into text quickly |
| **Keep your CLI as-is** | Your backend CLI layout and behavior stay the same |
| **Auto voice mode** | Keep listening on so you can talk instead of typing |
| **Wake mode + voice send** | Say `hey codex`/`hey claude`, then say `send`/`submit` in insert mode |
| **Image prompts** | Use `Ctrl+X` for one-shot screenshot prompts, or enable persistent image mode for HUD `[rec]` (`IMG` badge) |
| **Transcript queue** | If the CLI is busy, VoiceTerm waits and sends text when ready |
| **Codex + Claude support** | Primary support for Codex and Claude Code |
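
To illustrate the transcript queue row above, here is a small sketch of "hold text while the CLI is busy, release it in order when it is ready". `TranscriptQueue` and its methods are hypothetical names chosen for this example, and how VoiceTerm detects a busy CLI is not covered here; the `cli_busy` flag is simply passed in.

```rust
use std::collections::VecDeque;

/// Holds transcripts while the backend CLI is busy (hypothetical type).
#[derive(Default)]
pub struct TranscriptQueue {
    pending: VecDeque<String>,
}

impl TranscriptQueue {
    /// Queue a transcript behind any that are already waiting.
    pub fn push(&mut self, text: impl Into<String>) {
        self.pending.push_back(text.into());
    }

    /// Returns the oldest transcript only when the CLI is ready;
    /// while the CLI is busy nothing is removed from the queue.
    pub fn next_ready(&mut self, cli_busy: bool) -> Option<String> {
        if cli_busy {
            None
        } else {
            self.pending.pop_front()
        }
    }

    /// Number of transcripts still waiting.
    pub fn len(&self) -> usize {
        self.pending.len()
    }
}
```

A first-in, first-out queue keeps spoken prompts in the order they were said, which is the behavior most people would expect from dictation.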
### Everyday tools
- **Voice macros**: expand phrases from `.voiceterm/macros.yaml` (toggle in Settings)
- **Voice navigation**: spoken `scroll`, `send`, `show last error`, `copy last error`, `explain last error`
- **Dev mode tools**: use `--dev` (`DEV` badge), `Ctrl+D` for Dev panel tools, `--dev-log` for JSONL diagnostics
- **Prompt-safe HUD**: VoiceTerm hides HUD rows during Codex/Claude approval and reply/composer prompts so text stays readable
- **Transcript history**: `Ctrl+H` to search and replay past text
- **Notification history**: `Ctrl+N` to review recent status messages
- **Saved settings**: stored in `~/.config/voiceterm/config.toml`
- **Built-in themes**: 11 themes including ChatGPT, Catppuccin, Dracula, Nord, Tokyo Night, and Gruvbox
- **Style-pack border settings**: `VOICETERM_STYLE_PACK_JSON` supports `components.overlay_border` and `components.hud_border` (HUD applies when border mode is `theme`)
For full behavior details and controls, see [guides/USAGE.md](guides/USAGE.md).
## Supported AI CLIs
VoiceTerm is optimized for Codex and Claude Code.
For full backend status and setup details, see
[Usage Guide -> Backend Support](guides/USAGE.md#backend-support).
### Codex
Use the same workflow and controls documented for backend support in
[guides/USAGE.md](guides/USAGE.md#backend-support).
### Claude Code

## Hands-Free Quick Start
```bash
voiceterm --auto-voice --wake-word --voice-send-mode insert
```
Think of this like Alexa for your terminal:
1. Say the wake phrase (`hey codex` or `hey claude`)
2. Speak your prompt
3. Say `send` (or `submit`)
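
The three steps above behave like a small state machine. The sketch below is illustrative only: `WakeFlow` and `Action` are hypothetical names, phrases are matched by exact lowercase comparison, and going back to sleep after `send` is an assumption. The real matcher may be more forgiving.

```rust
/// What the overlay should do with one recognised utterance (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Action {
    /// Spoken before any wake phrase, so it is dropped.
    Ignore,
    /// A wake phrase was recognised.
    StartListening,
    /// Text to type into the CLI input.
    Insert(String),
    /// Deliver the staged prompt.
    Submit,
}

/// Tracks whether a wake phrase has been heard (hypothetical type).
#[derive(Default)]
pub struct WakeFlow {
    awake: bool,
}

impl WakeFlow {
    pub fn on_utterance(&mut self, utterance: &str) -> Action {
        let phrase = utterance.trim().to_lowercase();
        if !self.awake {
            // Step 1: only a wake phrase moves us out of the idle state.
            if phrase == "hey codex" || phrase == "hey claude" {
                self.awake = true;
                return Action::StartListening;
            }
            return Action::Ignore;
        }
        // Step 3: a send word delivers the prompt and returns to idle (assumption).
        if phrase == "send" || phrase == "submit" {
            self.awake = false;
            return Action::Submit;
        }
        // Step 2: anything else is prompt text.
        Action::Insert(utterance.trim().to_string())
    }
}
```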
## UI Tour
### Theme Picker

Press `Ctrl+Y` to open Theme Studio and choose `Theme picker`.
Use `Ctrl+G` to cycle themes quickly.
Use `Tab` / `Shift+Tab` to move between Theme Studio pages (`Home`, `Colors`,
`Borders`, `Components`, `Preview`, `Export`).
For editor details, see [Themes](guides/USAGE.md#themes).
For theme-file flags/env vars, see [CLI Flags](guides/CLI_FLAGS.md#themes--display).
### Settings Menu

Mouse control is enabled by default. Open Settings with `Ctrl+O`.
Cursor note: when `Mouse` is ON, wheel/touchpad scrolling may not move chat
history, but the scrollbar can still be dragged. If you prefer touchpad/wheel
scrolling, set `Mouse` to `OFF` and use keyboard focus (`Tab`/arrows) + `Enter`
for HUD buttons.
For details, use:
- [Settings Menu](guides/USAGE.md#settings-menu)
- [Themes](guides/USAGE.md#themes)
- [HUD styles](guides/USAGE.md#hud-styles)
### Transcript History
Use `Ctrl+H` to open transcript history, type to filter, and press `Enter` to
replay into the active CLI input. Mouse click selection is also supported.
History rows are labeled by source (`mic`, `you`, `ai`); only `mic` and `you`
rows are replayable, and `ai` rows are output-only.
Detailed behavior: [Transcript History](guides/USAGE.md#transcript-history).
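
As a rough model of the rules above, the sketch below labels rows by source, filters them as you type, and only lets `mic` and `you` rows be replayed. The type and function names are hypothetical, and case-insensitive substring matching is an assumption about how the filter works.

```rust
/// Where a history row came from; labels match the ones shown in the overlay.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Mic,
    You,
    Ai,
}

impl Source {
    pub fn label(self) -> &'static str {
        match self {
            Source::Mic => "mic",
            Source::You => "you",
            Source::Ai => "ai",
        }
    }

    /// Only `mic` and `you` rows can be replayed; `ai` rows are output-only.
    pub fn is_replayable(self) -> bool {
        matches!(self, Source::Mic | Source::You)
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HistoryRow {
    pub source: Source,
    pub text: String,
}

/// Keep rows whose text contains the query, ignoring case (assumed matching rule).
pub fn filter_rows<'a>(rows: &'a [HistoryRow], query: &str) -> Vec<&'a HistoryRow> {
    let needle = query.to_lowercase();
    rows.iter()
        .filter(|row| row.text.to_lowercase().contains(&needle))
        .collect()
}

/// Text to place in the CLI input on `Enter`, or None for output-only rows.
pub fn replay_text(row: &HistoryRow) -> Option<&str> {
    if row.source.is_replayable() {
        Some(row.text.as_str())
    } else {
        None
    }
}
```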
### Help Overlay
Press `?` to open grouped shortcuts (`Recording`, `Mode`, `Appearance`,
`Sensitivity`, `Navigation`) with clickable Docs/Troubleshooting links on
terminals that support clickable links. Details: [Core Controls](guides/USAGE.md#core-controls).

## Controls
For shortcuts and behavior, see:
- [Core Controls](guides/USAGE.md#core-controls)
- [Settings Menu](guides/USAGE.md#settings-menu)
- [Voice Modes](guides/USAGE.md#voice-modes)
For CLI flags and command-line options:
- `voiceterm --help` (or `voiceterm -h`)
- [CLI Flags](guides/CLI_FLAGS.md)
## Voice Macros
Voice macros are project-local shortcuts in `.voiceterm/macros.yaml`.
Turn macros on in Settings when you want phrase expansion.
Setup and examples: [Project Voice Macros](guides/USAGE.md#project-voice-macros).
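
Conceptually, phrase expansion is a lookup from a spoken phrase to replacement text. The sketch below shows that idea only. It does not reflect the real `macros.yaml` schema, which is documented in the guide linked above. `expand_macro` is a hypothetical function, the `HashMap` stands in for whatever the file defines, and the `enabled` flag mirrors the Settings toggle.

```rust
use std::collections::HashMap;

/// Expand a spoken phrase if macros are enabled and the phrase is known.
/// Keys in `macros` are assumed to be lowercase phrases (illustrative convention).
pub fn expand_macro(
    macros: &HashMap<String, String>,
    transcript: &str,
    enabled: bool,
) -> String {
    let spoken = transcript.trim();
    if !enabled {
        // Macros are off: pass the transcript through unchanged.
        return spoken.to_string();
    }
    match macros.get(&spoken.to_lowercase()) {
        Some(expansion) => expansion.clone(),
        None => spoken.to_string(),
    }
}
```

With a map containing `"run tests"` mapped to `"cargo test --all"`, saying "Run Tests" would type `cargo test --all` when macros are on and `Run Tests` when they are off.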
## Documentation
| Audience | Document |
|---|---|
| User | [Quick Start](QUICK_START.md) |
| User | [Guides Index](guides/README.md) |
| User | [Install Guide](guides/INSTALL.md) |
| User | [Usage Guide](guides/USAGE.md) |
| User | [CLI Flags](guides/CLI_FLAGS.md) |
| User | [Troubleshooting](guides/TROUBLESHOOTING.md) |
| Developer | [Developer Index](dev/README.md) |
| Developer | [Project Integrations Playbook](dev/integrations/EXTERNAL_REPOS.md) |
| Developer | [Engineering History](dev/history/ENGINEERING_EVOLUTION.md) |
## Support
- Troubleshooting: [guides/TROUBLESHOOTING.md](guides/TROUBLESHOOTING.md)
- Bug reports and feature requests: [GitHub Issues](https://github.com/jguida941/voiceterm/issues)
- Security concerns: [.github/SECURITY.md](.github/SECURITY.md)
## Contributing
PRs welcome. See [CONTRIBUTING.md](.github/CONTRIBUTING.md).
Before opening a PR, run:
- `python3 dev/scripts/devctl.py check --profile prepush`
- `python3 dev/scripts/devctl.py hygiene`
## License
MIT - [LICENSE](LICENSE)