https://github.com/evgeniyarbatov/vibing
Local voice-to-clipboard transcription on macOS — tap a key to record, get clean text on your clipboard, no cloud needed.
https://github.com/evgeniyarbatov/vibing
clipboard faster-whisper local-ai macos offline ollama privacy speech-to-text transcription voice-recorder
Last synced: about 7 hours ago
JSON representation
Local voice-to-clipboard transcription on macOS — tap a key to record, get clean text on your clipboard, no cloud needed.
- Host: GitHub
- URL: https://github.com/evgeniyarbatov/vibing
- Owner: evgeniyarbatov
- License: mit
- Created: 2026-05-08T02:54:47.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-06-25T11:43:01.000Z (7 days ago)
- Last Synced: 2026-06-25T13:19:37.539Z (7 days ago)
- Topics: clipboard, faster-whisper, local-ai, macos, offline, ollama, privacy, speech-to-text, transcription, voice-recorder
- Language: Python
- Homepage:
- Size: 59.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Roadmap: ROADMAP.md
Awesome Lists containing this project
README
# vibing
Press the **right Option key** to start recording. Press it again to stop. Your speech is transcribed, cleaned up, and copied to the clipboard — all locally, no cloud needed.
Inspired by [Vibing](https://vibingjustspeakit.github.io/Vibing/).
## Pipeline
```
mic → raw WAV → ffmpeg loudnorm → whisper small → ollama mistral-nemo → clipboard
```
| Step | Tool | Output |
|------|------|--------|
| Record | sounddevice (Python) | `data/raw-audio/TIMESTAMP.wav` |
| Normalize | ffmpeg `loudnorm` | `data/normalized-audio/TIMESTAMP.wav` |
| Transcribe | faster-whisper `small` | `data/raw-transcript/TIMESTAMP.txt` |
| Clean up | ollama `mistral-nemo` | `data/clean-transcript/TIMESTAMP.txt` |
| Deliver | pbcopy | clipboard |
---
## Prerequisites
### 1. ffmpeg
```bash
brew install ffmpeg
```
### 2. faster-whisper
```bash
pip install faster-whisper
```
The first transcription downloads the `small` model (~244 MB) automatically.
### 3. Ollama + mistral-nemo
```bash
# Install ollama
brew install ollama
# Pull the model
ollama pull mistral-nemo
# Start the server
ollama serve
```
### 4. Python 3.10+
```bash
brew install python
```
---
## Configuration
Edit **`config.json`** in the project root. Changes take effect on the next restart.
```json
{
"ollama_model": "mistral-nemo",
"ollama_prompt": "Clean up this voice transcription. ... {transcription}",
"output_dir": "~/Documents/vibing"
}
```
| Key | Default | Description |
|-----|---------|-------------|
| `ollama_model` | `mistral-nemo` | Any model available in your local ollama (`ollama list`) |
| `ollama_prompt` | *(see file)* | Prompt sent to ollama. Must contain `{transcription}` — that placeholder is replaced with the raw whisper output |
| `output_dir` | `~/Documents/vibing` | Root for all four output subdirectories. Relative paths are resolved from the project root; absolute paths are used as-is |
## Quick start
```bash
make run
```
Runs the recorder in your terminal. Log output appears in real time and is also written to `logs/recorder.log`. Press **right Option** to start/stop recording. Stop with `Ctrl-C`.
### Grant Accessibility access (required)
`pynput` needs the Accessibility permission to read global keyboard events.
1. Open **System Settings → Privacy & Security → Accessibility**.
2. Click **+** and add `.venv/bin/python3` inside this project directory.
- If running from Terminal, add **Terminal** (or iTerm2) instead.
Without this step the key listener silently does nothing.
### Makefile targets
| Command | What it does |
|---------|-------------|
| `make setup` | Create virtualenv and install Python dependencies |
| `make run` | Start the recorder in the foreground |
| `make test` | Run the test suite |
| `make clean` | Remove virtualenv and all captured data |
## Usage
| Action | Gesture |
|--------|---------|
| Start recording | Tap right Option key once |
| Stop recording | Tap right Option key again |
The cleaned transcript is always saved to `data/clean-transcript/` even if you use the clipboard version.
## Troubleshooting
### No key events detected
→ Accessibility permission not granted. See [Grant Accessibility access](#grant-accessibility-access-required).
### faster-whisper import error
→ Run `pip install faster-whisper` (or `make setup`) and make sure your virtualenv is active.
### ollama connection refused
→ Start ollama: `ollama serve` or install it as a service via `brew services start ollama`.
### First transcription is slow
→ Whisper downloads the `small` model (~244 MB) on first use. Subsequent runs are fast.