https://github.com/panda850819/murmur-voice

Privacy-first voice-to-text for macOS and Windows. Local Whisper (Metal/CUDA) or Groq cloud, with LLM post-processing. Built with Rust + Tauri 2.

# Murmur

[![Release](https://img.shields.io/github/v/release/panda850819/murmur-voice?include_prereleases&style=flat-square)](https://github.com/panda850819/murmur-voice/releases)
[![CI](https://img.shields.io/github/actions/workflow/status/panda850819/murmur-voice/ci.yml?branch=main&style=flat-square&label=CI)](https://github.com/panda850819/murmur-voice/actions/workflows/ci.yml)
[![License](https://img.shields.io/github/license/panda850819/murmur-voice?style=flat-square)](LICENSE)
[![Platform](https://img.shields.io/badge/platform-macOS%20%7C%20Windows-blue?style=flat-square)]()

**[English](README.md)** | **[繁體中文](README.zh-TW.md)**

> Your voice, unheard by others.

Privacy-first voice-to-text for macOS and Windows, built with Rust.


## What is Murmur?

Murmur is a voice dictation tool that transcribes your speech and inserts polished text at your cursor position -- in any app. It supports both local (on-device) and cloud transcription, with optional LLM post-processing to clean up filler words, fix punctuation, and convert Simplified Chinese to Traditional Chinese.

## Features

- **Push-to-Talk** -- Hold a modifier key to speak, release to insert text
- **Toggle Mode** -- Press once to start recording, press again to stop (with 5-min auto-stop and debounce protection)
- **Custom Hotkey** -- Single modifier key or combo (e.g. Option+Z, Control+Space) with two-phase recording
- **Dual Engine** -- Local Whisper (Metal GPU on macOS, CUDA on Windows with an NVIDIA GPU) or Groq cloud API
- **Multi-Provider LLM** -- Groq (cloud), Ollama (local), or any OpenAI-compatible endpoint for text enhancement
- **Fully Offline Mode** -- Local Whisper + Ollama for complete privacy (no data leaves your machine)
- **LLM Post-Processing** -- Clean up filler words, add punctuation, Simplified-to-Traditional Chinese conversion
- **Smart Clipboard** -- Auto-pastes when a text field is focused; copies to clipboard only when no text input is detected (e.g. on Desktop)
- **App-Aware Style** -- Automatically adjusts output tone based on the active app (e.g. formal in Slack, technical in VS Code)
- **Personal Dictionary** -- Add custom terms to improve transcription accuracy; inline dictionary chips appear in real-time while editing
- **Transcription Preview** -- Floating preview window with copy button, editable text, character count, and detected app name
- **Live Preview** -- See partial transcription while you speak (local engine only)
- **Mixed-Language Support** -- English words in mixed CJK-English speech are preserved as-is (never translated)
- **15 Languages** -- Auto-detect or manually select from 15 supported languages
- **Cross-Platform** -- macOS and Windows support with platform-native hotkey and app detection
- **System-wide** -- Works in any text field across all apps
- **Lightweight** -- Tauri-based, roughly 30-50 MB versus 200 MB+ for typical Electron apps
- **Open Source** -- Fully auditable, no telemetry, no tracking
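
The Toggle Mode entry above mentions a 5-minute auto-stop and debounce protection. The following is a minimal sketch of how such a toggle could be modeled, not Murmur's actual implementation. The names `ToggleRecorder`, `on_hotkey`, and `tick` are hypothetical, and the 300 ms debounce window is an assumption (the README only states the 5-minute limit).

```rust
use std::time::{Duration, Instant};

/// The 5-minute limit comes from the README.
const AUTO_STOP: Duration = Duration::from_secs(5 * 60);
/// Assumed value: the README does not state the debounce window.
const DEBOUNCE: Duration = Duration::from_millis(300);

/// Hypothetical toggle-mode state: tracks whether a recording is active.
#[derive(Debug, Default)]
pub struct ToggleRecorder {
    started_at: Option<Instant>, // Some(..) while recording
    last_press: Option<Instant>, // time of the last accepted hotkey press
}

impl ToggleRecorder {
    /// Handle a hotkey press at `now`. Returns true if recording afterwards.
    pub fn on_hotkey(&mut self, now: Instant) -> bool {
        // Debounce: ignore presses that follow an accepted press too closely.
        if let Some(prev) = self.last_press {
            if now.duration_since(prev) < DEBOUNCE {
                return self.started_at.is_some();
            }
        }
        self.last_press = Some(now);
        // Flip the state: start if idle, stop if recording.
        self.started_at = match self.started_at {
            Some(_) => None,
            None => Some(now),
        };
        self.started_at.is_some()
    }

    /// Call periodically. Returns true if this call stopped a recording
    /// because it reached the auto-stop limit.
    pub fn tick(&mut self, now: Instant) -> bool {
        match self.started_at {
            Some(start) if now.duration_since(start) >= AUTO_STOP => {
                self.started_at = None;
                true
            }
            _ => false,
        }
    }
}
```

Passing `now` in explicitly, rather than calling `Instant::now()` inside the methods, keeps the logic testable without waiting five minutes.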

## Download

Download the latest release from the [Releases page](https://github.com/panda850819/murmur-voice/releases).

| Platform | File | Notes |
|----------|------|-------|
| macOS (Apple Silicon) | `.dmg` | Requires [quarantine removal](#macos-murmur-voice-is-damaged-and-cant-be-opened) |
| Windows | `.exe` / `.msi` | CPU-only, works on all hardware |
| Windows (NVIDIA GPU) | `-cuda.exe` / `-cuda.msi` | GPU-accelerated via CUDA |

## How It Works

```
Hotkey -> Record (cpal) -> Transcribe (Whisper) -> LLM Clean-up (optional) -> Smart Clipboard (paste or copy-only)
```

**Each recording triggers at most 2 API calls** (when using Groq): one for Whisper transcription, one for LLM post-processing.
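
The "at most 2" figure follows directly from the configuration: one call if Groq handles transcription, plus one if Groq handles post-processing. A small sketch of that arithmetic, together with the final Smart Clipboard decision, is shown here. All type and function names are hypothetical and chosen for illustration only.

```rust
/// Hypothetical configuration types; not taken from Murmur's source.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Engine {
    LocalWhisper,
    Groq,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LlmProvider {
    Off,
    Groq,
    Ollama,
    Custom,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum OutputAction {
    Paste,
    CopyOnly,
}

/// Number of requests sent to Groq for a single recording (0, 1, or 2).
pub fn groq_calls_per_recording(engine: Engine, llm: LlmProvider) -> u32 {
    let transcription = u32::from(engine == Engine::Groq);
    let cleanup = u32::from(llm == LlmProvider::Groq);
    transcription + cleanup
}

/// Smart Clipboard: paste when a text field is focused, otherwise copy only.
pub fn output_action(text_field_focused: bool) -> OutputAction {
    if text_field_focused {
        OutputAction::Paste
    } else {
        OutputAction::CopyOnly
    }
}
```

With the local engine and Ollama (or no post-processing), `groq_calls_per_recording` returns 0, which matches the Fully Offline Mode described in the feature list.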

## Setup Guide

### 1. Install & Run

```bash
git clone https://github.com/panda850819/murmur-voice.git
cd murmur-voice
pnpm install
pnpm tauri dev
```

### 2. First Launch

On first launch, Murmur will guide you through:
1. Granting **Microphone** and **Accessibility** permissions
2. Choosing a transcription engine (Local or Groq)
3. Setting your Push-to-Talk key

If you choose the local engine, the Whisper model (~1.5GB) will download automatically on your first recording.

### 3. Transcription Engine

| Engine | Speed | Quality | Privacy | Setup |
|--------|-------|---------|---------|-------|
| **Local (Whisper)** | ~1-3s | Good | Audio stays on device | Model auto-downloads on first use (~1.5GB) |
| **Groq API** | <1s | Good | Audio sent to Groq servers | Free API key ([get one below](#getting-a-groq-api-key)) |

To switch engines: **Settings > Transcription > Engine**

#### Getting a Groq API Key

1. Go to [console.groq.com](https://console.groq.com) and sign up (Google/GitHub login supported)
2. Navigate to **API Keys** in the left sidebar
3. Click **Create API Key**, give it a name (e.g. "murmur")
4. Copy the key (starts with `gsk_`) and paste it into Murmur's settings

Groq's free tier includes generous rate limits for personal use. The same API key is used for both Whisper transcription and LLM post-processing.

### 4. LLM Post-Processing (Recommended)

Choose a provider for AI text enhancement:

| Provider | Speed | Privacy | Setup |
|----------|-------|---------|-------|
| **Groq** | Fast | Cloud | Free API key from [console.groq.com](https://console.groq.com) |
| **Ollama** | Varies | Local | Install [Ollama](https://ollama.com), pull a model |
| **Custom** | Varies | Varies | Any OpenAI-compatible endpoint |

What it does:
- Removes filler words (um, uh, etc.)
- Removes false starts and self-corrections
- Adds proper punctuation (full-width for Chinese, half-width for English)
- Converts Simplified Chinese to Traditional Chinese (Taiwan standard)
- Adds spaces between Chinese and English text
- Formats lists and paragraphs when appropriate

To enable: **Settings > AI Processing > LLM Post-Processing**
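
Most of these clean-ups depend on the LLM, but the spacing rule between Chinese and English is deterministic enough to illustrate. The sketch below only shows the expected result of that rule; in Murmur the LLM performs it. The function name `add_cjk_latin_spacing` is hypothetical, and the CJK check covers only the basic Unified Ideographs block, which is a simplifying assumption.

```rust
/// Rough CJK check: basic CJK Unified Ideographs block only (an assumption
/// that is sufficient for illustration, not a complete definition).
fn is_cjk(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// Insert a single space wherever a CJK character directly touches an
/// ASCII letter or digit. Existing spaces are left untouched.
pub fn add_cjk_latin_spacing(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 8);
    let mut prev: Option<char> = None;
    for c in input.chars() {
        if let Some(p) = prev {
            let boundary = (is_cjk(p) && c.is_ascii_alphanumeric())
                || (p.is_ascii_alphanumeric() && is_cjk(c));
            if boundary {
                out.push(' ');
            }
        }
        out.push(c);
        prev = Some(c);
    }
    out
}
```

For example, `add_cjk_latin_spacing("我用Rust寫程式")` yields `"我用 Rust 寫程式"`.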

### 5. Personal Dictionary

Add frequently used terms (names, jargon, acronyms) to improve transcription accuracy. These are injected into Whisper's initial prompt.

To configure: **Settings > Transcription > Dictionary** (type a term, press Enter to add)
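
As a rough illustration of what "injected into Whisper's initial prompt" can mean, the sketch below joins dictionary terms into one prompt string. The separator, the case-insensitive de-duplication, and the `max_chars` budget are assumptions, and `build_initial_prompt` is a hypothetical name; Murmur's real prompt format may differ.

```rust
/// Join dictionary terms into a single prompt string.
/// Blank entries and case-insensitive duplicates are skipped, and terms
/// stop being added once the `max_chars` budget would be exceeded.
pub fn build_initial_prompt(terms: &[&str], max_chars: usize) -> String {
    let mut seen: Vec<String> = Vec::new();
    let mut prompt = String::new();
    for term in terms {
        let term = term.trim();
        let key = term.to_lowercase();
        if term.is_empty() || seen.contains(&key) {
            continue;
        }
        let sep = if prompt.is_empty() { "" } else { ", " };
        let extra = sep.chars().count() + term.chars().count();
        // Whisper's prompt length is limited, so keep within a budget.
        if prompt.chars().count() + extra > max_chars {
            break;
        }
        prompt.push_str(sep);
        prompt.push_str(term);
        seen.push(key);
    }
    prompt
}
```

The resulting string would then be handed to the transcription engine's initial-prompt setting before each recording is transcribed.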

### 6. App-Aware Style

When enabled, Murmur detects the foreground app and adjusts the LLM output tone:

| App | Style |
|-----|-------|
| Slack, Discord, LINE, Telegram | Casual |
| VS Code, Terminal, Cursor | Technical |
| Pages, Word, Google Docs | Formal |
| Others | Default (natural) |

To enable: **Settings > AI Processing > App-Aware Style**
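
The table above is essentially a lookup from app name to style. A minimal sketch of that lookup follows. The `Style` enum and `style_for_app` are hypothetical names, and matching on a lower-cased display name is an assumption: the names the OS actually reports (for example a bundle identifier or executable name) may differ.

```rust
/// Output tone selected for the LLM clean-up step (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Style {
    Casual,
    Technical,
    Formal,
    Natural, // the "Default (natural)" row
}

/// Map a foreground app's display name to a style, following the table.
pub fn style_for_app(app_name: &str) -> Style {
    match app_name.trim().to_lowercase().as_str() {
        "slack" | "discord" | "line" | "telegram" => Style::Casual,
        "vs code" | "terminal" | "cursor" => Style::Technical,
        "pages" | "word" | "google docs" => Style::Formal,
        _ => Style::Natural,
    }
}
```

Any app that is not listed falls through to `Style::Natural`, so unknown apps never fail the lookup.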

## Recommended Settings

For the best experience with Chinese dictation:

| Setting | Value | Why |
|---------|-------|-----|
| Engine | **Groq** | Fastest transcription (<1s) |
| Language | **Mandarin Chinese** | More accurate than Auto for Chinese |
| LLM Post-Processing | **On** | Cleans up filler words + Traditional Chinese |
| LLM Model | **Llama 3.3 70B** | Best quality for Chinese text processing |
| App-Aware Style | **On** | Adapts tone to context |

## Tech Stack

| Component | Technology | Purpose |
|-----------|-----------|---------|
| App Framework | Tauri 2 | Lightweight desktop app |
| Audio Capture | cpal | Microphone input -> 16kHz mono |
| Speech-to-Text | whisper-rs / Groq API | Local or cloud transcription |
| LLM Processing | Groq / Ollama / Custom | Text cleanup and formatting |
| Hotkey Detection | CGEventTap / SetWindowsHookEx | Global hotkey listener (modifier or modifier+key combo) |
| Text Insertion | arboard + rdev | Clipboard write + Cmd+V / Ctrl+V simulation |
| App Detection | NSWorkspace / Win32 API | Foreground app detection (per-platform) |
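
The audio row notes that microphone input is converted to 16 kHz mono. A simplified sketch of that conversion is shown here: interleaved channels are averaged into one, then resampled with linear interpolation. Both function names are hypothetical, and linear interpolation is an assumption made for brevity; a production pipeline would normally low-pass filter before downsampling.

```rust
/// Average interleaved frames (e.g. L, R, L, R, ...) into a mono signal.
/// Trailing samples that do not form a full frame are dropped.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resample a mono signal from `from_hz` to `to_hz` by linear interpolation.
pub fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    let step = from_hz as f64 / to_hz as f64;
    (0..out_len)
        .map(|i| {
            // Position of output sample `i` on the input timeline.
            let pos = i as f64 * step;
            let idx = pos.floor() as usize;
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(input.len() - 1)];
            a + (b - a) * frac
        })
        .collect()
}
```

A typical call chain for a 48 kHz stereo device would be `resample_linear(&downmix_to_mono(&samples, 2), 48_000, 16_000)`.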

## Requirements

### macOS
- macOS 12.0+ (Apple Silicon recommended for local Whisper)
- Microphone permission
- Accessibility permission (for global hotkey + text insertion)

### Windows
- Windows 10+
- Microphone permission

### Both Platforms
- Optional: a Groq API key (free; needed for the cloud engine and the Groq LLM provider) or Ollama (for local LLM post-processing)

## FAQ

### macOS: "Murmur Voice is damaged and can't be opened"

This happens because the app is not signed with an Apple Developer certificate. macOS marks downloaded files with a quarantine attribute, and Gatekeeper refuses to open quarantined apps that are unsigned. To fix:

1. Move Murmur Voice to `/Applications`
2. Open Terminal and run:

   ```bash
   xattr -d com.apple.quarantine /Applications/Murmur\ Voice.app
   ```

3. Open the app normally

### Windows: Which version should I download?

| Your GPU | Download | Why |
|----------|----------|-----|
| NVIDIA (with CUDA drivers) | `-cuda` version | GPU-accelerated transcription, much faster |
| AMD / Intel / integrated | Standard version | CPU transcription, works on all hardware |
| Not sure | Standard version | Always works, just slower for local engine |

### Why is the app unsigned?

Murmur is a free, open-source project. Apple Developer Program costs $99/year. Code signing may be added in the future, but for now the workaround above is required on macOS.

## Privacy

Murmur was born from a security audit of a commercial voice-to-text app that was found to:
- Capture browser URLs and window titles
- Monitor all keystrokes via CGEventTap
- Send application context to remote servers
- Include session recording analytics (Microsoft Clarity)

Murmur does none of this. When using the **local engine**, your audio never leaves your machine. When using **Groq**, audio is sent only to Groq's API for transcription, and if LLM post-processing is enabled, the transcribed text is sent to the LLM provider you selected -- no other data is collected or transmitted.

## Donate

If you find Murmur useful, consider supporting the project:

Buy Me A Coffee

**Crypto:**

| Network | Address |
|---------|---------|
| EVM (Ethereum, Base, etc.) | `0x9ae8954201b2fce97b124887e415df02e8e06a8d` |
| Solana | `Eod4VqvMmmMnY3EinN6Zo5xzt9Wq5S2dFZutob1VBvMf` |

## License

MIT