https://github.com/codebysonu95/voxsherpa-tts

🎙️ VoxSherpa TTS: Offline Neural Text-to-Speech Engine for Android ⚡ Sherpa-ONNX powered 🔊 Natural voice synthesis 📱 Fully offline processing 🚀 No cloud • No limits

android android-ai android-app hindi-tts kokoro-82m kokoro-onnx kokoro-tts local-ai local-first offline-tts on-device-ai piper-tts sherpa-onnx text-to-speech tts-kokoro-android

[![Get it on Google Play](https://img.shields.io/badge/Google_Play-Available_Now-brightgreen?style=for-the-badge&logo=google-play&logoColor=white)](https://play.google.com/store/apps/details?id=com.CodeBySonu.VoxSherpa)
[![Support](https://img.shields.io/badge/💙_Support-This%20Project-FF5E5B?style=for-the-badge)](https://codebysonu95.github.io/VoxSherpa-TTS/assets/support.html)
[![Android](https://img.shields.io/badge/Android-11%2B-brightgreen?style=for-the-badge&logo=android&logoColor=white)](https://android.com)
[![License](https://img.shields.io/badge/License-GPL%20v3.0-blue?style=for-the-badge)](LICENSE)
[![Sherpa-ONNX](https://img.shields.io/badge/Powered%20by-Sherpa--ONNX-orange?style=for-the-badge)](https://github.com/k2-fsa/sherpa-onnx)
[![Downloads](https://img.shields.io/github/downloads/CodeBySonu95/VoxSherpa-TTS/total?style=for-the-badge&logo=android&logoColor=white&label=Downloads&color=blue)](https://github.com/CodeBySonu95/VoxSherpa-TTS/releases)

VoxSherpa TTS


Studio-quality offline neural text-to-speech for Android.
Hindi · English · British · Japanese · Chinese · and more. No cloud. No limits. No compromise.

---

## 🏆 Featured In

> VoxSherpa TTS is listed in the **official README** of [k2-fsa/sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx), the core inference library powering this app.

[![Sherpa-ONNX](https://img.shields.io/badge/Featured%20in-Sherpa--ONNX%20Official%20README-orange?style=for-the-badge)](https://github.com/k2-fsa/sherpa-onnx#voxsherpa-tts)
[![HuggingFace](https://img.shields.io/badge/Models%20on-HuggingFace-FFD21E?style=for-the-badge&logo=huggingface&logoColor=black)](https://huggingface.co/CodeBySonu95/VoxSherpa-TTS)

---

## Why VoxSherpa?

Most TTS apps make you choose between **quality** and **privacy**. Cloud-based tools like ElevenLabs sound incredible, but they require internet, send your text to remote servers, and charge per character.

**VoxSherpa breaks that tradeoff.**

It runs two professional-grade neural engines entirely on your device:

| Engine | Quality | Speed | Best For |
|--------|---------|-------|----------|
| 🧠 **Kokoro-82M** | Studio-grade · rivals ElevenLabs | Slower on budget hardware | Audiobooks, voiceovers, professional content |
| ⚡ **Piper / VITS** | Natural · clear | Fast on any device | Daily use, quick synthesis |

---

## Screenshots

Screenshots of the Generate, Models, Library, and Settings screens.

---

## Features

### 🎙️ Dual Neural Engine
- **Kokoro-82M**: an 82-million-parameter neural model with multilingual support, including Hindi, English, British English, French, Spanish, Chinese, Japanese and 50+ more languages. Same architecture used by top-tier commercial TTS services.
- **Piper / VITS**: fast, lightweight, natural. Generates speech in seconds on any Android device.

### 🔒 100% Offline & Private
- All processing happens on your device
- No internet required after model download
- No account, no telemetry, no data collection
- Your text never leaves your phone

### 📦 Model Management
- Download models directly from the app
- Import your own `.onnx` models from local storage
- Multiple models installed simultaneously
- Smart storage tracking

### 🎧 Audio Controls
- Real-time waveform visualization
- Adjustable speed and pitch
- Interactive audio seeking with mini player controls
- MediaStyle notification with full playback controls
- Play, pause, and replay generated audio
- Export as WAV with correct sample rate per model
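
To make the "correct sample rate per model" point concrete, here is a minimal sketch of WAV export in plain Java. It is not taken from the VoxSherpa source: the class name `WavExport` and its method are hypothetical, and it assumes mono 16-bit PCM input. The sample rate is written into the header, so a file exported with the wrong rate plays back at the wrong speed and pitch. A 24000 Hz Kokoro voice and a 22050 Hz Piper voice are typical values, but the rate should always be read from the model in use.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the VoxSherpa code base. */
final class WavExport {
    private WavExport() {}

    /** Wraps mono 16-bit PCM samples in a canonical 44-byte RIFF/WAVE header. */
    static byte[] toWav(short[] samples, int sampleRate) {
        final int channels = 1;
        final int bitsPerSample = 16;
        final int blockAlign = channels * bitsPerSample / 8; // bytes per frame
        final int byteRate = sampleRate * blockAlign;        // bytes per second
        final int dataSize = samples.length * blockAlign;

        ByteBuffer buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(ascii("RIFF"));
        buf.putInt(36 + dataSize);           // remaining file size after this field
        buf.put(ascii("WAVE"));
        buf.put(ascii("fmt "));
        buf.putInt(16);                      // size of the PCM fmt chunk
        buf.putShort((short) 1);             // audio format 1 = uncompressed PCM
        buf.putShort((short) channels);
        buf.putInt(sampleRate);              // must match the model's output rate
        buf.putInt(byteRate);
        buf.putShort((short) blockAlign);
        buf.putShort((short) bitsPerSample);
        buf.put(ascii("data"));
        buf.putInt(dataSize);
        for (short s : samples) {
            buf.putShort(s);
        }
        return buf.array();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

Calling `WavExport.toWav(samples, 24000)` returns the complete file contents, which can then be written to disk with any byte-oriented output stream.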

### 📚 Speech Library
- Save all generated audio locally
- Favorites system for quick access
- View generation history with timestamps
- Voice model attribution per recording

### ⚙️ Smart Settings
- **Smart Punctuation**: natural pauses after sentence breaks
- **Emotion Tags**: `[whisper]`, `[angry]`, `[happy]` support
- Per-model voice selection (Kokoro supports 100+ speakers)
- System-wide TTS engine with pitch & speed control
- Theme-aware UI
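
As an illustration of how inline emotion tags could be handled, the sketch below splits input text into segments, each labeled with the most recent tag. This is an assumption about the behavior, not the app's actual parser: the class `EmotionTags`, the `"neutral"` default, and the rule that a tag applies until the next tag are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical tag splitter for illustration; not the app's real parser. */
final class EmotionTags {
    private EmotionTags() {}

    /** One run of text and the emotion assumed to apply to it. */
    static final class Segment {
        final String emotion;
        final String text;

        Segment(String emotion, String text) {
            this.emotion = emotion;
            this.text = text;
        }
    }

    // Only the three tags named in the README are recognized here.
    private static final Pattern TAG = Pattern.compile("\\[(whisper|angry|happy)\\]");

    /** Splits input at each tag; text before the first tag is labeled "neutral". */
    static List<Segment> parse(String input) {
        List<Segment> out = new ArrayList<>();
        Matcher m = TAG.matcher(input);
        String current = "neutral";
        int last = 0;
        while (m.find()) {
            addIfNotBlank(out, current, input.substring(last, m.start()));
            current = m.group(1);
            last = m.end();
        }
        addIfNotBlank(out, current, input.substring(last));
        return out;
    }

    private static void addIfNotBlank(List<Segment> out, String emotion, String text) {
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {
            out.add(new Segment(emotion, trimmed));
        }
    }
}
```

Under these assumptions, `EmotionTags.parse("Hello. [whisper] Keep it down.")` yields a neutral segment followed by a whisper segment, which a synthesis loop could then render one at a time.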

---

## Technical Architecture

```
User Text
    │
    ├─── Kokoro Engine (KokoroEngine.java)
    │         └── Sherpa-ONNX JNI → ONNX Runtime → CPU/NNAPI
    │               └── kokoro-multi-lang-v1_0 (82M params, FP32)
    │
    └─── Piper / VITS Engine (VoiceEngine.java)
              └── Sherpa-ONNX JNI → ONNX Runtime → CPU
                    └── VITS model (language-specific)
```

**Built with:**
- [Sherpa-ONNX](https://github.com/k2-fsa/sherpa-onnx): on-device neural inference
- [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M): multilingual neural TTS model
- [Piper](https://github.com/rhasspy/piper): fast local TTS
- Android AudioTrack API: low-latency PCM playback
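
For the PCM playback step, a common pattern is to convert the engine's output into 16-bit signed samples before handing it to the audio sink. The sketch below assumes the engine returns float samples normalized to the range [-1, 1]. The class `PcmConvert` is hypothetical, and the app may instead use a float output mode and skip this conversion entirely.

```java
/** Hypothetical conversion helper for illustration; not from the VoxSherpa source. */
final class PcmConvert {
    private PcmConvert() {}

    /**
     * Converts normalized float samples to 16-bit signed PCM.
     * Values outside [-1, 1] are clamped so they cannot wrap around
     * and produce loud clicks.
     */
    static short[] floatToPcm16(float[] samples) {
        short[] out = new short[samples.length];
        for (int i = 0; i < samples.length; i++) {
            float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
            out[i] = (short) Math.round(clamped * 32767.0f);
        }
        return out;
    }
}
```

Scaling by 32767 rather than 32768 keeps the output symmetric around zero, at the cost of never using the single most negative 16-bit value.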

---

## Performance

Generation speed depends on your device's processor. The figures below are approximate synthesis times per minute of generated audio:

| Device Tier | Kokoro | Piper |
|-------------|--------|-------|
| 🟢 Flagship (Snapdragon 8 Gen 3) | ~20–40 sec/min audio | ~5 sec/min audio |
| 🟡 Mid-range (8-core) | ~60–90 sec/min audio | ~10 sec/min audio |
| 🔴 Budget (6-core) | ~2–3 min/min audio | ~20 sec/min audio |

> Kokoro prioritizes **quality over speed** by design. It uses the same 82M parameter architecture that powers premium commercial TTS, and running it entirely offline on a mobile CPU is genuinely pushing the hardware limits.
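
The figures in the table can be read as a real-time factor (RTF): synthesis time divided by audio duration. Flagship Kokoro at 20–40 seconds per 60 seconds of audio is an RTF of roughly 0.33–0.67, while budget Kokoro at 2–3 minutes per minute is an RTF of 2–3, meaning slower than real time. The small sketch below, with a hypothetical `RealTimeFactor` class, shows the arithmetic and how it might be used to estimate the wait for a longer text.

```java
/** Hypothetical helper for illustration; names are not from the VoxSherpa source. */
final class RealTimeFactor {
    private RealTimeFactor() {}

    /** RTF below 1.0 means synthesis runs faster than playback. */
    static double rtf(double synthesisSeconds, double audioSeconds) {
        if (audioSeconds <= 0) {
            throw new IllegalArgumentException("audioSeconds must be positive");
        }
        return synthesisSeconds / audioSeconds;
    }

    /** Estimated synthesis time for a clip of the given length at a known RTF. */
    static double estimateSynthesisSeconds(double rtf, double audioSeconds) {
        return rtf * audioSeconds;
    }
}
```

For example, at an RTF of 0.5 a ten-minute audiobook chapter would take about five minutes to generate. Real results will vary with text content, thermal throttling, and background load.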

---

## Installation

### 🎉 Now Live on Google Play!

VoxSherpa TTS is officially available on the **Google Play Store**. No forms, no waitlists: just tap and install.

[Get it on Google Play](https://play.google.com/store/apps/details?id=com.CodeBySonu.VoxSherpa)

**Requirements:** Android 11+ · ARM64

---

## Changelog

### V2.6: Media Notification *(Latest)*
- 🔔 MediaStyle notification with full playback controls
- 🎚️ Pitch control in System TTS
- ⚡ Speed control in System TTS
- Improved performance and stability
- Bug fixes and optimizations
- Minor UI improvements

### V2.5: Stability
- Bug fixes and stability improvements
- Improved overall performance

### V2.4: Bug Fixes
- Improved System TTS support with better language detection
- Enhanced UI & overall app experience
- Improved compatibility for large screen devices
- Various bug fixes

### V2.3: Playback Upgrade
- Interactive audio seeking
- New mini player controls
- Smoother and faster UI performance
- Fixed cancel generation delay issue

### V2.2: Core Improvements
- Regenerate audio on voice change
- Improved smart punctuation
- Improved emotion tags
- Pitch control added
- Send feedback feature
- UI/UX improvements

### V1.0: Foundation
- Text to Audio
- Piper (fast models) + Kokoro (high-quality voices)
- Save audio (.wav) · Favorites support
- Speed control · Model download · Import Custom Model
- Chunk-based playback · Smart pause handling
- System TTS integration · PDF to Audio · TXT to Audio

---

## Model Import (Technical Users)

VoxSherpa supports importing custom `.onnx` models without any server:

1. Place your `.onnx` model + `tokens.txt` on device storage
2. Open **Models tab** → tap **+** → **Import Local Model**
3. Select your files

Works with any Sherpa-ONNX-compatible TTS model.
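
Before importing, it can help to confirm that both files from step 1 are actually in place. The sketch below is a hypothetical pre-flight check in plain Java. The class `ModelImportCheck` and its messages are illustrative only and do not reflect how the app validates imports.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical pre-flight check for illustration; not the app's import logic. */
final class ModelImportCheck {
    private ModelImportCheck() {}

    /** Returns a list of human-readable problems; an empty list means the pair looks usable. */
    static List<String> problems(Path model, Path tokens) {
        List<String> issues = new ArrayList<>();

        String name = String.valueOf(model.getFileName()).toLowerCase(Locale.ROOT);
        if (!name.endsWith(".onnx")) {
            issues.add("Model file should have an .onnx extension: " + model);
        }
        if (!Files.isRegularFile(model)) {
            issues.add("Model file not found: " + model);
        }

        if (!Files.isRegularFile(tokens)) {
            issues.add("tokens.txt not found: " + tokens);
        } else {
            try {
                if (Files.size(tokens) == 0) {
                    issues.add("tokens.txt is empty: " + tokens);
                }
            } catch (IOException e) {
                issues.add("Could not read tokens.txt: " + e.getMessage());
            }
        }
        return issues;
    }
}
```

A check like this only catches missing or misnamed files. Whether a given model actually loads still depends on it matching what Sherpa-ONNX expects.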

---

## Contributing

VoxSherpa is open source. Contributions are welcome:

- 🐛 Bug reports via [Issues](../../issues)
- 💡 Feature requests via [Discussions](../../discussions)
- 🔧 Pull requests for fixes and improvements

---

## License

```
Copyright (C) 2025 CodeBySonu95

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

https://www.gnu.org/licenses/gpl-3.0.html
```

---

## Acknowledgements

- [k2-fsa/sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx): the inference engine that makes this possible
- [hexgrad/Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M): the neural model behind studio-quality synthesis
- [rhasspy/piper](https://github.com/rhasspy/piper): fast local TTS engine

---

**Built with obsession. Runs without internet.**

*VoxSherpa: because your voice deserves to stay yours.*