https://github.com/panda850819/murmur-voice

Privacy-first voice-to-text for macOS and Windows. Local Whisper (Metal/CUDA) or Groq cloud, with LLM post-processing. Built with Rust + Tauri 2.
https://github.com/panda850819/murmur-voice

dictation macos privacy rust speech-to-text tauri transcription voice-to-text whisper windows

Last synced: 4 months ago
JSON representation

Privacy-first voice-to-text for macOS and Windows. Local Whisper (Metal/CUDA) or Groq cloud, with LLM post-processing. Built with Rust + Tauri 2.

Host: GitHub
URL: https://github.com/panda850819/murmur-voice
Owner: panda850819
License: mit
Created: 2026-02-13T06:08:36.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-02-28T02:14:07.000Z (4 months ago)
Last Synced: 2026-02-28T11:49:01.936Z (4 months ago)
Topics: dictation, macos, privacy, rust, speech-to-text, tauri, transcription, voice-to-text, whisper, windows
Language: Rust
Homepage: https://github.com/panda850819/murmur-voice/releases
Size: 3.41 MB
Stars: 12
Watchers: 0
Forks: 2
Open Issues: 11
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Roadmap: ROADMAP.md

Awesome Lists containing this project

README

          # Murmur

[![Release](https://img.shields.io/github/v/release/panda850819/murmur-voice?include_prereleases&style=flat-square)](https://github.com/panda850819/murmur-voice/releases)

[![CI](https://img.shields.io/github/actions/workflow/status/panda850819/murmur-voice/ci.yml?branch=main&style=flat-square&label=CI)](https://github.com/panda850819/murmur-voice/actions/workflows/ci.yml)

[![License](https://img.shields.io/github/license/panda850819/murmur-voice?style=flat-square)](LICENSE)

[![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Windows-blue?style=flat-square)]()

**[English](README.md)** | **[繁體中文](README.zh-TW.md)**



> Your voice, unheard by others.

Privacy-first voice-to-text for macOS and Windows, built with Rust.



  

  



## What is Murmur?

Murmur is a voice dictation tool that transcribes your speech and inserts polished text at your cursor position -- in any app. It supports both local (on-device) and cloud transcription, with optional LLM post-processing to clean up filler words, fix punctuation, and convert Simplified Chinese to Traditional Chinese.

## Features

- **Push-to-Talk** -- Hold a modifier key to speak, release to insert text

- **Toggle Mode** -- Press once to start recording, press again to stop (with 5-min auto-stop and debounce protection)

- **Custom Hotkey** -- Single modifier key or combo (e.g. Option+Z, Control+Space) with two-phase recording

- **Dual Engine** -- Local Whisper (Metal GPU) or Groq cloud API

- **Multi-Provider LLM** -- Groq (cloud), Ollama (local), or any OpenAI-compatible endpoint for text enhancement

- **Fully Offline Mode** -- Local Whisper + Ollama for complete privacy (no data leaves your machine)

- **LLM Post-Processing** -- Clean up filler words, add punctuation, Simplified-to-Traditional Chinese conversion

- **Smart Clipboard** -- Auto-pastes when a text field is focused; copies to clipboard only when no text input is detected (e.g. on Desktop)

- **App-Aware Style** -- Automatically adjusts output tone based on the active app (e.g. formal in Slack, technical in VS Code)

- **Personal Dictionary** -- Add custom terms to improve transcription accuracy; inline dictionary chips appear in real-time while editing

- **Transcription Preview** -- Floating preview window with copy button, editable text, character count, and detected app name

- **Live Preview** -- See partial transcription while you speak (local engine only)

- **Mixed-Language Support** -- English words in mixed CJK-English speech are preserved as-is (never translated)

- **15 Languages** -- Auto-detect or manually select from 15 supported languages

- **Cross-Platform** -- macOS and Windows support with platform-native hotkey and app detection

- **System-wide** -- Works in any text field across all apps

- **Lightweight** -- Tauri-based, ~30-50MB vs 200MB+ Electron apps

- **Open Source** -- Fully auditable, no telemetry, no tracking

## Download

Download the latest release from the [Releases page](https://github.com/panda850819/murmur-voice/releases).

| Platform | File | Notes |

|----------|------|-------|

| macOS (Apple Silicon) | `.dmg` | Requires [quarantine removal](#macos-murmur-voice-is-damaged-and-cant-be-opened) |

| Windows | `.exe` / `.msi` | CPU-only, works on all hardware |

| Windows (NVIDIA GPU) | `-cuda.exe` / `-cuda.msi` | GPU-accelerated via CUDA |

## How It Works

```

Hotkey -> Record (cpal) -> Transcribe (Whisper) -> LLM Clean-up (optional) -> Smart Clipboard (paste or copy-only)

```

**Each recording triggers at most 2 API calls** (when using Groq): one for Whisper transcription, one for LLM post-processing.

## Setup Guide

### 1. Install & Run

```bash

git clone https://github.com/panda850819/murmur-voice.git

cd murmur-voice

pnpm install

pnpm tauri dev

```

### 2. First Launch

On first launch, Murmur will guide you through:

1. Granting **Microphone** and **Accessibility** permissions

2. Choosing a transcription engine (Local or Groq)

3. Setting your Push-to-Talk key

If you choose the local engine, the Whisper model (~1.5GB) will download automatically on your first recording.

### 3. Transcription Engine

| Engine | Speed | Quality | Privacy | Setup |

|--------|-------|---------|---------|-------|

| **Local (Whisper)** | ~1-3s | Good | Audio stays on device | Model auto-downloads on first use (~1.5GB) |

| **Groq API** | <1s | Good | Audio sent to Groq servers | Free API key ([get one below](#getting-a-groq-api-key)) |

To switch engines: **Settings > Transcription > Engine**

#### Getting a Groq API Key

1. Go to [console.groq.com](https://console.groq.com) and sign up (Google/GitHub login supported)

2. Navigate to **API Keys** in the left sidebar

3. Click **Create API Key**, give it a name (e.g. "murmur")

4. Copy the key (starts with `gsk_`) and paste it into Murmur's settings

Groq's free tier includes generous rate limits for personal use. The same API key is used for both Whisper transcription and LLM post-processing.

### 4. LLM Post-Processing (Recommended)

Choose a provider for AI text enhancement:

| Provider | Speed | Privacy | Setup |

|----------|-------|---------|-------|

| **Groq** | Fast | Cloud | Free API key from [console.groq.com](https://console.groq.com) |

| **Ollama** | Varies | Local | Install [Ollama](https://ollama.com), pull a model |

| **Custom** | Varies | Varies | Any OpenAI-compatible endpoint |

What it does:

- Removes filler words (um, uh, etc.)

- Removes false starts and self-corrections

- Adds proper punctuation (full-width for Chinese, half-width for English)

- Converts Simplified Chinese to Traditional Chinese (Taiwan standard)

- Adds spaces between Chinese and English text

- Formats lists and paragraphs when appropriate

To enable: **Settings > AI Processing > LLM Post-Processing**

### 5. Personal Dictionary

Add frequently used terms (names, jargon, acronyms) to improve transcription accuracy. These are injected into Whisper's initial prompt.

To configure: **Settings > Transcription > Dictionary** (type a term, press Enter to add)

### 6. App-Aware Style

When enabled, Murmur detects the foreground app and adjusts the LLM output tone:

| App | Style |

|-----|-------|

| Slack, Discord, LINE, Telegram | Casual |

| VS Code, Terminal, Cursor | Technical |

| Pages, Word, Google Docs | Formal |

| Others | Default (natural) |

To enable: **Settings > AI Processing > App-Aware Style**

## Recommended Settings

For the best experience with Chinese dictation:

| Setting | Value | Why |

|---------|-------|-----|

| Engine | **Groq** | Fastest transcription (<1s) |

| Language | **Mandarin Chinese** | More accurate than Auto for Chinese |

| LLM Post-Processing | **On** | Cleans up filler words + Traditional Chinese |

| LLM Model | **Llama 3.3 70B** | Best quality for Chinese text processing |

| App-Aware Style | **On** | Adapts tone to context |

## Tech Stack

| Component | Technology | Purpose |

|-----------|-----------|---------|

| App Framework | Tauri 2 | Lightweight desktop app |

| Audio Capture | cpal | Microphone input -> 16kHz mono |

| Speech-to-Text | whisper-rs / Groq API | Local or cloud transcription |

| LLM Processing | Groq / Ollama / Custom | Text cleanup and formatting |

| Hotkey Detection | CGEventTap / SetWindowsHookEx | Global hotkey listener (modifier or modifier+key combo) |

| Text Insertion | arboard + rdev | Clipboard write + Cmd+V / Ctrl+V simulation |

| App Detection | NSWorkspace / Win32 API | Foreground app detection (per-platform) |

## Requirements

### macOS

- macOS 12.0+ (Apple Silicon recommended for local Whisper)

- Microphone permission

- Accessibility permission (for global hotkey + text insertion)

### Windows

- Windows 10+

- Microphone permission

### Both Platforms

- Groq API key (free, for cloud engine and Groq LLM) or Ollama (for local LLM)

## FAQ

### macOS: "Murmur Voice is damaged and can't be opened"

This happens because the app is not signed with an Apple Developer certificate. macOS Gatekeeper quarantines unsigned apps by default. To fix:

1. Move Murmur Voice to `/Applications`

2. Open Terminal and run:

   ```bash

   xattr -d com.apple.quarantine /Applications/Murmur\ Voice.app

   ```

3. Open the app normally

### Windows: Which version should I download?

| Your GPU | Download | Why |

|----------|----------|-----|

| NVIDIA (with CUDA drivers) | `-cuda` version | GPU-accelerated transcription, much faster |

| AMD / Intel / integrated | Standard version | CPU transcription, works on all hardware |

| Not sure | Standard version | Always works, just slower for local engine |

### Why is the app unsigned?

Murmur is a free, open-source project. Apple Developer Program costs $99/year. Code signing may be added in the future, but for now the workaround above is required on macOS.

## Privacy

Murmur was born from a security audit of a commercial voice-to-text app that was found to:

- Capture browser URLs and window titles

- Monitor all keystrokes via CGEventTap

- Send application context to remote servers

- Include session recording analytics (Microsoft Clarity)

Murmur does none of this. When using the **local engine**, your audio never leaves your machine. When using **Groq**, audio is sent only to Groq's API for transcription -- no other data is collected or transmitted.

## Donate

If you find Murmur useful, consider supporting the project:



**Crypto:**

| Network | Address |

|---------|---------|

| EVM (Ethereum, Base, etc.) | `0x9ae8954201b2fce97b124887e415df02e8e06a8d` |

| Solana | `Eod4VqvMmmMnY3EinN6Zo5xzt9Wq5S2dFZutob1VBvMf` |

## License

MIT

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/panda850819/murmur-voice

Awesome Lists containing this project

README