An open API service indexing awesome lists of open source software.

https://github.com/jguida941/voiceterm

Voice-first HUD overlay for AI CLIs (Codex + Claude) with local Whisper STT, PTY passthrough, and customizable terminal UI.
https://github.com/jguida941/voiceterm

ai-cli claude codex developer-tools hands-free linux macos openai-whisper productivity ratatui rust speech-to-text stt terminal tui voice voice-control voice-input voiceterm whisper

Last synced: 3 months ago
JSON representation

Voice-first HUD overlay for AI CLIs (Codex + Claude) with local Whisper STT, PTY passthrough, and customizable terminal UI.

Awesome Lists containing this project

README

          


VoiceTerm


Rust
macOS
Linux
Whisper
Ratatui


VoiceTerm Version
MIT License
CI
Mutation Score
Coverage

VoiceTerm is a voice-first terminal overlay for Codex, Claude, and Gemini.
It runs Whisper on your machine and types what you say into your existing CLI.
Your tools still run in a normal PTY; VoiceTerm just adds a HUD on top.
Use push-to-talk or wake phrases (`hey codex`, `hey claude`), then say
`send` / `submit` for hands-free delivery.

Whisper runs locally by default. No cloud API keys required.
Release history: [dev/CHANGELOG.md](dev/CHANGELOG.md).

## Quick Nav

- [Hands-Free Quick Start](#hands-free-quick-start)
- [Install and Start](#install-and-start)
- [Requirements](#requirements)
- [Features](#features)
- [Supported Backends](#supported-ai-clis)
- [Controls](#controls)
- [Guides Index](guides/README.md)
- [Documentation](#documentation)
- [Support](#support)

## Install and Start

Install one supported AI CLI first:

**Codex:**

```bash
npm install -g @openai/codex
```

**Claude Code:**

```bash
curl -fsSL https://claude.ai/install.sh | bash
```

Then choose one VoiceTerm setup path:

Homebrew (recommended)

```bash
brew tap jguida941/voiceterm
brew install voiceterm
cd ~/your-project
voiceterm
```

If needed, authenticate once:

```bash
voiceterm --login --codex
voiceterm --login --claude
```

PyPI (pipx / pip)

```bash
pipx install voiceterm
# or: python3 -m pip install --user voiceterm

cd ~/your-project
voiceterm
```

If needed, authenticate once:

```bash
voiceterm --login --codex
voiceterm --login --claude
```

From source

Requires Rust toolchain. See [Install Guide](guides/INSTALL.md) for details.

```bash
git clone https://github.com/jguida941/voiceterm.git
cd voiceterm
./scripts/install.sh
```

If you are running from source while developing, run:

```bash
python3 dev/scripts/devctl.py check --profile ci
```

macOS App

Double-click `app/macos/VoiceTerm.app`, pick a folder, and it opens Terminal
with VoiceTerm running.

For model options and startup/IDE tuning:

- [Install Guide](guides/INSTALL.md)
- [Whisper docs](guides/WHISPER.md)
- [Troubleshooting](guides/TROUBLESHOOTING.md)

## How It Works

VoiceTerm listens to your mic, converts speech to text on your machine, and
types the result into your AI CLI input.

![Recording](img/auto-record.png)

## Requirements

- macOS or Linux (Windows needs WSL2)
- Microphone access
- ~0.5 GB disk for the default small model (base is ~142 MB, medium is ~1.5 GB)

## Features

### Main features

| Feature | What it does |
|---------|---------------|
| **Local speech-to-text** | Whisper runs on your machine (no cloud calls) |
| **Fast voice-to-text** | Local Whisper turns speech into text quickly |
| **Keep your CLI as-is** | Your backend CLI layout and behavior stay the same |
| **Auto voice mode** | Keep listening on so you can talk instead of typing |
| **Wake mode + voice send** | Say `hey codex`/`hey claude`, then say `send`/`submit` in insert mode |
| **Image prompts** | Use `Ctrl+X` for one-shot screenshot prompts, or enable persistent image mode for HUD `[rec]` (`IMG` badge) |
| **Transcript queue** | If the CLI is busy, VoiceTerm waits and sends text when ready |
| **Codex + Claude support** | Primary support for Codex and Claude Code |

### Everyday tools

- **Voice macros**: expand phrases from `.voiceterm/macros.yaml` (toggle in Settings)
- **Voice navigation**: spoken `scroll`, `send`, `show last error`, `copy last error`, `explain last error`
- **Dev mode tools**: use `--dev` (`DEV` badge), `Ctrl+D` for Dev panel tools, `--dev-log` for JSONL diagnostics
- **Prompt-safe HUD**: VoiceTerm hides HUD rows during Codex/Claude approval and reply/composer prompts so text stays readable
- **Transcript history**: `Ctrl+H` to search and replay past text
- **Notification history**: `Ctrl+N` to review recent status messages
- **Saved settings**: stored in `~/.config/voiceterm/config.toml`
- **Built-in themes**: 11 themes including ChatGPT, Catppuccin, Dracula, Nord, Tokyo Night, and Gruvbox
- **Style-pack border settings**: `VOICETERM_STYLE_PACK_JSON` supports `components.overlay_border` and `components.hud_border` (HUD applies when border mode is `theme`)

For full behavior details and controls, see [guides/USAGE.md](guides/USAGE.md).

## Supported AI CLIs

VoiceTerm is optimized for Codex and Claude Code.
For full backend status and setup details, see
[Usage Guide -> Backend Support](guides/USAGE.md#backend-support).

### Codex

Use the same workflow and controls documented for backend support in
[guides/USAGE.md](guides/USAGE.md#backend-support).

### Claude Code

![Claude Backend](img/claude-backend.png)

## Hands-Free Quick Start

```bash
voiceterm --auto-voice --wake-word --voice-send-mode insert
```

Think of this like Alexa for your terminal:

1. Say the wake phrase (`hey codex` or `hey claude`)
2. Speak your prompt
3. Say `send` (or `submit`)

## UI Tour

### Theme Picker

![Theme Picker](img/theme-picker.png)
Press `Ctrl+Y` to open Theme Studio and choose `Theme picker`.
Use `Ctrl+G` to cycle themes quickly.
Use `Tab` / `Shift+Tab` to move between Theme Studio pages (`Home`, `Colors`,
`Borders`, `Components`, `Preview`, `Export`).
For editor details, see [Themes](guides/USAGE.md#themes).
For theme-file flags/env vars, see [CLI Flags](guides/CLI_FLAGS.md#themes--display).

### Settings Menu

![Settings](img/settings.png)

Mouse control is enabled by default. Open Settings with `Ctrl+O`.
Cursor note: when `Mouse` is ON, wheel/touchpad scrolling may not move chat
history, but the scrollbar can still be dragged. If you prefer touchpad/wheel
scrolling, set `Mouse` to `OFF` and use keyboard focus (`Tab`/arrows) + `Enter`
for HUD buttons.
For details, use:

- [Settings Menu](guides/USAGE.md#settings-menu)
- [Themes](guides/USAGE.md#themes)
- [HUD styles](guides/USAGE.md#hud-styles)

### Transcript History

Use `Ctrl+H` to open transcript history, type to filter, and press `Enter` to
replay into the active CLI input. Mouse click selection is also supported.
History rows are labeled by source (`mic`, `you`, `ai`); only `mic` and `you`
rows are replayable, and `ai` rows are output-only.
Detailed behavior: [Transcript History](guides/USAGE.md#transcript-history).

### Help Overlay

Press `?` to open grouped shortcuts (`Recording`, `Mode`, `Appearance`,
`Sensitivity`, `Navigation`) with clickable Docs/Troubleshooting links on
terminals that support clickable links. Details: [Core Controls](guides/USAGE.md#core-controls).

![Shortcuts Overlay](img/shortcuts.png)

## Controls

For shortcuts and behavior, see:

- [Core Controls](guides/USAGE.md#core-controls)
- [Settings Menu](guides/USAGE.md#settings-menu)
- [Voice Modes](guides/USAGE.md#voice-modes)

For CLI flags and command-line options:

- `voiceterm --help` (or `voiceterm -h`)
- [CLI Flags](guides/CLI_FLAGS.md)

## Voice Macros

Voice macros are project-local shortcuts in `.voiceterm/macros.yaml`.
Turn macros on in Settings when you want phrase expansion.
Setup and examples: [Project Voice Macros](guides/USAGE.md#project-voice-macros).

## Documentation

| Audience | Document |
|---|---|
| User | [Quick Start](QUICK_START.md) |
| User | [Guides Index](guides/README.md) |
| User | [Install Guide](guides/INSTALL.md) |
| User | [Usage Guide](guides/USAGE.md) |
| User | [CLI Flags](guides/CLI_FLAGS.md) |
| User | [Troubleshooting](guides/TROUBLESHOOTING.md) |
| Developer | [Developer Index](dev/README.md) |
| Developer | [Project Integrations Playbook](dev/integrations/EXTERNAL_REPOS.md) |
| Developer | [Engineering History](dev/history/ENGINEERING_EVOLUTION.md) |

## Support

- Troubleshooting:
[guides/TROUBLESHOOTING.md](guides/TROUBLESHOOTING.md)
- Bug reports and feature requests:
[GitHub Issues](https://github.com/jguida941/voiceterm/issues)
- Security concerns:
[.github/SECURITY.md](.github/SECURITY.md)

## Contributing

PRs welcome. See [CONTRIBUTING.md](.github/CONTRIBUTING.md).
Before opening a PR, run:

- `python3 dev/scripts/devctl.py check --profile prepush`
- `python3 dev/scripts/devctl.py hygiene`

## License

MIT - [LICENSE](LICENSE)