https://github.com/bebsworthy/voicetype

Privacy-first dictation for macOS using local AI models

# VoiceType


**Privacy-first dictation for macOS**

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![macOS 12.0+](https://img.shields.io/badge/macOS-12.0+-blue.svg)](https://www.apple.com/macos/)
[![Swift 5.9+](https://img.shields.io/badge/Swift-5.9+-orange.svg)](https://swift.org)
[![Build Status](https://github.com/yourusername/voicetype/workflows/CI/badge.svg)](https://github.com/yourusername/voicetype/actions)

## 🎯 What is VoiceType?

VoiceType is an open-source, privacy-first dictation tool for macOS that converts speech to text using local AI models. Unlike cloud-based solutions, VoiceType processes everything on your device, ensuring your voice never leaves your computer.

### ✨ Key Features

- **🔒 100% Privacy**: All processing happens on-device. No cloud, no data collection, no internet required
- **🚀 Fast & Accurate**: Real-time transcription with <5 second latency using OpenAI Whisper models
- **🌍 30+ Languages**: Built-in support for multiple languages with auto-detection
- **⌨️ Universal Compatibility**: Works with any macOS application that accepts text input
- **🎛️ Flexible Models**: Choose between speed and accuracy with multiple model sizes
- **🔌 Extensible**: Plugin system for custom audio processors and text injectors
- **📖 Open Source**: MIT licensed, community-driven development

## 🚀 Quick Start

### Download

Download the latest release from the [Releases](https://github.com/yourusername/voicetype/releases) page.

### First Launch

1. **Open VoiceType** - Look for the microphone icon in your menu bar
2. **Grant Permissions** - Allow microphone access when prompted
3. **Choose Your Model** - Select Fast (default) for quick results or Accurate for better quality
4. **Set Your Hotkey** - Default is `Ctrl+Shift+V`
5. **Start Dictating** - Press your hotkey in any app and start speaking!

## 📋 System Requirements

- macOS 12.0 (Monterey) or later
- 8GB RAM minimum (16GB recommended for larger models)
- Apple Silicon (M1/M2/M3) or Intel processor
- ~200MB disk space (plus model downloads)

## 🎮 How to Use

1. **Position your cursor** where you want to insert text
2. **Press your hotkey** (default: `Ctrl+Shift+V`)
3. **Speak clearly** for up to 5 seconds
4. **Watch your words appear** - VoiceType automatically inserts the text

### Pro Tips

- Speak naturally at a normal pace
- Minimize background noise for best results
- Use the Accurate model for technical terms
- Customize your hotkey in Settings

## 🛠️ Building from Source

### Prerequisites

- Xcode 15.0 or later
- macOS 13.0+ (for development)
- Apple Developer account (for code signing)

### Build Instructions

```bash
# Clone the repository
git clone https://github.com/yourusername/voicetype.git
cd voicetype/VoiceType

# Setup development environment
./Scripts/setup.sh

# Build the app
./Scripts/build.sh

# Run tests
./Scripts/test.sh

# Create release build
./Scripts/release.sh
```

### Development

```bash
# Open in Xcode
open VoiceType.xcodeproj

# Or use Swift Package Manager
swift build
swift test
```

## 🔧 Configuration

VoiceType can be customized through its settings panel or by editing the configuration file:

`~/Library/Application Support/VoiceType/config.json`

### Available Settings

- **Hotkey**: Customize your recording trigger
- **Model Selection**: Choose between Tiny (fast), Base (balanced), or Small (accurate)
- **Language**: Select from 30+ languages or use auto-detection
- **Audio Device**: Choose your preferred microphone
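
The schema of `config.json` is not documented here. As a rough sketch, assuming hypothetical keys (`hotkey`, `model`, `language`, `audioDevice`) that mirror the settings listed above, such a file could be decoded with `Codable` like this:

```swift
import Foundation

// Hypothetical schema: the key names and value spellings below are
// assumptions that mirror the settings list, not the app's documented format.
enum ModelSize: String, Codable {
    case tiny, base, small
}

struct VoiceTypeConfig: Codable, Equatable {
    var hotkey: String
    var model: ModelSize
    var language: String      // a language code, or "auto" for auto-detection
    var audioDevice: String?  // nil here means "use the system default input"
}

// Decode a config from raw JSON data.
func loadConfig(from data: Data) throws -> VoiceTypeConfig {
    try JSONDecoder().decode(VoiceTypeConfig.self, from: data)
}

// Example document using the assumed keys; "audioDevice" is left out on purpose.
let sampleJSON = """
{
  "hotkey": "Ctrl+Shift+V",
  "model": "tiny",
  "language": "auto"
}
"""

do {
    let config = try loadConfig(from: Data(sampleJSON.utf8))
    print("model: \(config.model.rawValue), language: \(config.language)")
} catch {
    print("Could not read config: \(error)")
}
```

For the real file, the data would come from `Data(contentsOf:)` pointed at the path above. Check the actual key names against a config file written by the app before relying on this layout.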

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](Documentation/DeveloperGuide/Contributing.md) for details.

### Areas for Contribution

- 🔌 **App-specific text injectors** - Add support for more applications
- 🎤 **Audio preprocessors** - Improve noise reduction and audio quality
- 🌍 **Translations** - Help translate the UI to more languages
- 📚 **Documentation** - Improve guides and tutorials
- 🐛 **Bug fixes** - Help us squash bugs

## 🏗️ Architecture

VoiceType uses a modular, protocol-first architecture:

```
┌─────────────┐     ┌──────────────┐     ┌───────────────┐
│  Menu Bar   │────▶│ Coordinator  │────▶│Audio Processor│
└─────────────┘     └──────────────┘     └───────────────┘
                           │                     │
                           ▼                     ▼
                    ┌──────────────┐     ┌───────────────┐
                    │ Transcriber  │────▶│ Text Injector │
                    └──────────────┘     └───────────────┘
```

See our [Architecture Guide](Documentation/DeveloperGuide/Architecture.md) for details.
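
One way to read "protocol-first" in code is to give each box in the diagram its own protocol and let the coordinator depend only on those. The sketch below is illustrative: every protocol, type, and method name is hypothetical and chosen to match the diagram, and the stage order (process, transcribe, inject) is an assumption. The project's real interfaces may differ.

```swift
import Foundation

// Hypothetical interfaces named after the boxes in the diagram.
protocol AudioProcessor {
    func process(_ samples: [Float]) -> [Float]
}

protocol Transcriber {
    func transcribe(_ samples: [Float]) throws -> String
}

protocol TextInjector {
    func inject(_ text: String) throws
}

// The coordinator wires the stages together and knows nothing about
// their concrete implementations.
struct Coordinator {
    let processor: any AudioProcessor
    let transcriber: any Transcriber
    let injector: any TextInjector

    func handleRecording(_ samples: [Float]) throws {
        let cleaned = processor.process(samples)
        let text = try transcriber.transcribe(cleaned)
        try injector.inject(text)
    }
}

// Stub implementations so the sketch runs without a microphone or a model.
struct PassthroughProcessor: AudioProcessor {
    func process(_ samples: [Float]) -> [Float] { samples }
}

struct FixedTranscriber: Transcriber {
    let output: String
    func transcribe(_ samples: [Float]) throws -> String { output }
}

final class BufferInjector: TextInjector {
    private(set) var received: [String] = []
    func inject(_ text: String) throws { received.append(text) }
}

let sink = BufferInjector()
let coordinator = Coordinator(
    processor: PassthroughProcessor(),
    transcriber: FixedTranscriber(output: "hello world"),
    injector: sink
)

do {
    try coordinator.handleRecording([0.0, 0.1, -0.1])
    print(sink.received)
} catch {
    print("Pipeline failed: \(error)")
}
```

Because the coordinator only sees protocols, a plugin could supply a different processor or injector without any change to the pipeline code, which is the kind of extensibility the feature list describes.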

## 🐛 Troubleshooting

### Common Issues

**VoiceType doesn't appear in menu bar**
- Check if the app is running in Activity Monitor
- Try launching from Applications folder

**Hotkey doesn't work**
- Grant Input Monitoring permission in System Settings → Privacy & Security
- Check for conflicts with other apps

**No text appears after speaking**
- Verify microphone permission is granted
- Check audio input levels in Settings
- Try the clipboard fallback mode

See our [Troubleshooting Guide](Documentation/UserGuide/Troubleshooting.md) for more solutions.
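
This README does not describe how the clipboard fallback works internally. As a loose illustration of the general idea, the sketch below tries a list of text injectors in order and falls back to the next one when an attempt throws. All names here are hypothetical, and the "clipboard" injector is a stand-in that records text rather than touching the real pasteboard.

```swift
import Foundation

// Hypothetical names throughout; this is not the app's actual injector API.
protocol TextInjector {
    var name: String { get }
    func inject(_ text: String) throws
}

struct InjectionError: Error {
    let reason: String
}

// Tries each injector in order and returns the name of the first that succeeds.
// If every injector fails, the last error encountered is rethrown.
func injectWithFallback(_ text: String, using injectors: [any TextInjector]) throws -> String {
    var lastError: any Error = InjectionError(reason: "no injectors configured")
    for injector in injectors {
        do {
            try injector.inject(text)
            return injector.name
        } catch {
            lastError = error  // remember why this one failed, then try the next
        }
    }
    throw lastError
}

// Stand-in for a direct injector that the target app rejects.
struct FailingDirectInjector: TextInjector {
    let name = "direct"
    func inject(_ text: String) throws {
        throw InjectionError(reason: "target app rejected synthetic input")
    }
}

// Stand-in for a clipboard-based injector; it only stores the text in memory.
final class FakeClipboardInjector: TextInjector {
    let name = "clipboard"
    private(set) var contents = ""
    func inject(_ text: String) throws { contents = text }
}

let clipboard = FakeClipboardInjector()
do {
    let used = try injectWithFallback("hello", using: [FailingDirectInjector(), clipboard])
    print("Inserted via \(used)")
} catch {
    print("All injectors failed: \(error)")
}
```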

## 📜 License

VoiceType is released under the MIT License. See the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [OpenAI Whisper](https://github.com/openai/whisper) for the amazing speech recognition models
- [Apple CoreML](https://developer.apple.com/machine-learning/core-ml/) for on-device inference
- The Swift and macOS developer communities

## 🔗 Links

- [Documentation](Documentation/UserGuide/README.md)
- [Report Issues](https://github.com/yourusername/voicetype/issues)
- [Discussions](https://github.com/yourusername/voicetype/discussions)
- [Changelog](CHANGELOG.md)

---


Made with ❤️ for privacy-conscious users everywhere