{"id":47608615,"url":"https://github.com/codebysonu95/voxsherpa-tts","last_synced_at":"2026-04-26T04:00:59.371Z","repository":{"id":345714259,"uuid":"1170023059","full_name":"CodeBySonu95/VoxSherpa-TTS","owner":"CodeBySonu95","description":"🎙️ VoxSherpa TTS   Offline Neural Text-to-Speech Engine for Android  ⚡ Sherpa-ONNX powered   🔊 Natural voice synthesis   📱 Fully offline processing   🚀 No cloud • No limits","archived":false,"fork":false,"pushed_at":"2026-04-25T08:32:04.000Z","size":49603,"stargazers_count":58,"open_issues_count":6,"forks_count":9,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-04-25T09:28:40.580Z","etag":null,"topics":["android","android-ai","android-app","hindi-tts","kokoro-82m","kokoro-onnx","kokoro-tts","local-ai","local-first","offline-tts","on-device-ai","piper-tts","sherpa-onnx","text-to-speech","tts-kokoro-android"],"latest_commit_sha":null,"homepage":"https://codebysonu95.github.io/VoxSherpa-TTS/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CodeBySonu95.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-01T15:34:09.000Z","updated_at":"2026-04-25T07:45:06.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/CodeBySonu95/VoxSherpa-TTS","commit_stats":null,"previous_names":["codebysonu95/voxsherpa-tts"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/CodeBySonu95/VoxSherpa-TTS","repository_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeBySonu95%2FVoxSherpa-TTS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeBySonu95%2FVoxSherpa-TTS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeBySonu95%2FVoxSherpa-TTS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeBySonu95%2FVoxSherpa-TTS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/CodeBySonu95","download_url":"https://codeload.github.com/CodeBySonu95/VoxSherpa-TTS/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CodeBySonu95%2FVoxSherpa-TTS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32285283,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T18:29:39.964Z","status":"online","status_checked_at":"2026-04-26T02:00:05.962Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["android","android-ai","android-app","hindi-tts","kokoro-82m","kokoro-onnx","kokoro-tts","local-ai","local-first","offline-tts","on-device-ai","piper-tts","sherpa-onnx","text-to-speech","tts-kokoro-android"],"created_at":"2026-04-01T19:45:12.367Z","updated_at":"2026-04-26T04:00:59.364Z","avatar_url":"https://github.com/CodeBySonu95.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv 
align=\"center\"\u003e\n\n\u003cimg src=\"https://raw.githubusercontent.com/CodeBySonu95/VoxSherpa-TTS/main/fastlane/metadata/android/en-US/images/featureGraphic.png\" width=\"100%\" alt=\"VoxSherpa TTS Banner\"/\u003e\n\n\u003cbr/\u003e\n\u003cbr/\u003e\n\n[![Get it on Google Play](https://img.shields.io/badge/Google_Play-Available_Now-brightgreen?style=for-the-badge\u0026logo=google-play\u0026logoColor=white)](https://play.google.com/store/apps/details?id=com.CodeBySonu.VoxSherpa)\n[![Support](https://img.shields.io/badge/💙_Support-This%20Project-FF5E5B?style=for-the-badge)](https://codebysonu95.github.io/VoxSherpa-TTS/assets/support.html)\n[![Android](https://img.shields.io/badge/Android-11%2B-brightgreen?style=for-the-badge\u0026logo=android\u0026logoColor=white)](https://android.com)\n[![License](https://img.shields.io/badge/License-GPL%20v3.0-blue?style=for-the-badge)](LICENSE)\n[![Sherpa-ONNX](https://img.shields.io/badge/Powered%20by-Sherpa--ONNX-orange?style=for-the-badge)](https://github.com/k2-fsa/sherpa-onnx)\n[![Downloads](https://img.shields.io/github/downloads/CodeBySonu95/VoxSherpa-TTS/total?style=for-the-badge\u0026logo=android\u0026logoColor=white\u0026label=Downloads\u0026color=blue)](https://github.com/CodeBySonu95/VoxSherpa-TTS/releases)\n\n\u003ch1\u003eVoxSherpa TTS\u003c/h1\u003e\n\u003ch3\u003eStudio-quality offline neural text-to-speech for Android.\u003cbr/\u003eHindi · English · British · Japanese · Chinese · and more — No cloud. No limits. 
No compromise.\u003c/h3\u003e\n\n\u003c/div\u003e\n\n---\n\n## 🏆 Featured In\n\n\u003e VoxSherpa TTS is listed in the **official README** of [k2-fsa/sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) — the core inference library powering this app.\n\n[![Sherpa-ONNX](https://img.shields.io/badge/Featured%20in-Sherpa--ONNX%20Official%20README-orange?style=for-the-badge)](https://github.com/k2-fsa/sherpa-onnx#voxsherpa-tts)\n[![HuggingFace](https://img.shields.io/badge/Models%20on-HuggingFace-FFD21E?style=for-the-badge\u0026logo=huggingface\u0026logoColor=black)](https://huggingface.co/CodeBySonu95/VoxSherpa-TTS)\n\n---\n\n## Why VoxSherpa?\n\nMost TTS apps make you choose between **quality** and **privacy**. Cloud-based tools like ElevenLabs sound incredible — but they require internet, send your text to remote servers, and charge per character.\n\n**VoxSherpa breaks that tradeoff.**\n\nIt runs two professional-grade neural engines entirely on your device:\n\n| Engine | Quality | Speed | Best For |\n|--------|---------|-------|----------|\n| 🧠 **Kokoro-82M** | Studio-grade · rivals ElevenLabs | Slower on budget hardware | Audiobooks, voiceovers, professional content |\n| ⚡ **Piper / VITS** | Natural · clear | Fast on any device | Daily use, quick synthesis |\n\n---\n\n## Screenshots\n\n\u003cdiv align=\"center\"\u003e\n\n| Generate | Models | Library | Settings |\n|:---:|:---:|:---:|:---:|\n| \u003cimg src=\"https://raw.githubusercontent.com/CodeBySonu95/VoxSherpa-TTS/main/fastlane/metadata/android/en-US/images/phoneScreenshots/1.jpg\" width=\"180\"/\u003e | \u003cimg src=\"https://raw.githubusercontent.com/CodeBySonu95/VoxSherpa-TTS/main/fastlane/metadata/android/en-US/images/phoneScreenshots/2.jpg\" width=\"180\"/\u003e | \u003cimg src=\"https://raw.githubusercontent.com/CodeBySonu95/VoxSherpa-TTS/main/fastlane/metadata/android/en-US/images/phoneScreenshots/3.jpg\" width=\"180\"/\u003e | \u003cimg 
src=\"https://raw.githubusercontent.com/CodeBySonu95/VoxSherpa-TTS/main/fastlane/metadata/android/en-US/images/phoneScreenshots/4.jpg\" width=\"180\"/\u003e |\n\n\u003c/div\u003e\n\n---\n\n## Features\n\n### 🎙️ Dual Neural Engine\n- **Kokoro-82M** — 82 million parameter neural model. Multilingual support including Hindi, English, British English, French, Spanish, Chinese, Japanese and 50+ more languages. Same architecture used by top-tier commercial TTS services.\n- **Piper / VITS** — Fast, lightweight, natural. Generates speech in seconds on any Android device.\n\n### 🔒 100% Offline \u0026 Private\n- All processing happens on your device\n- No internet required after model download\n- No account, no telemetry, no data collection\n- Your text never leaves your phone\n\n### 📦 Model Management\n- Download models directly from the app\n- Import your own `.onnx` models from local storage\n- Multiple models installed simultaneously\n- Smart storage tracking\n\n### 🎧 Audio Controls\n- Real-time waveform visualization\n- Adjustable speed and pitch\n- Interactive audio seeking with mini player controls\n- MediaStyle notification with full playback controls\n- Play, pause, and replay generated audio\n- Export as WAV with correct sample rate per model\n\n### 📚 Speech Library\n- Save all generated audio locally\n- Favorites system for quick access\n- View generation history with timestamps\n- Voice model attribution per recording\n\n### ⚙️ Smart Settings\n- **Smart Punctuation** — natural pauses after sentence breaks\n- **Emotion Tags** — `[whisper]`, `[angry]`, `[happy]` support\n- Per-model voice selection (Kokoro supports 100+ speakers)\n- System-wide TTS engine with pitch \u0026 speed control\n- Theme-aware UI\n\n---\n\n## Technical Architecture\n\n```\nUser Text\n    │\n    ├─── Kokoro Engine (KokoroEngine.java)\n    │         └── Sherpa-ONNX JNI → ONNX Runtime → CPU/NNAPI\n    │                   └── kokoro-multi-lang-v1_0 (82M params, FP32)\n    │\n    └─── Piper / 
VITS Engine (VoiceEngine.java)\n              └── Sherpa-ONNX JNI → ONNX Runtime → CPU\n                        └── VITS model (language-specific)\n```\n\n**Built with:**\n- [Sherpa-ONNX](https://github.com/k2-fsa/sherpa-onnx) — on-device neural inference\n- [Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) — multilingual neural TTS model\n- [Piper](https://github.com/rhasspy/piper) — fast local TTS\n- Android AudioTrack API — low-latency PCM playback\n\n---\n\n## Performance\n\nGeneration speed depends entirely on your device's processor:\n\n| Device Tier | Kokoro | Piper |\n|-------------|--------|-------|\n| 🟢 Flagship (Snapdragon 8 Gen 3) | ~20–40 sec/min audio | ~5 sec/min audio |\n| 🟡 Mid-range (8-core) | ~60–90 sec/min audio | ~10 sec/min audio |\n| 🔴 Budget (6-core) | ~2–3 min/min audio | ~20 sec/min audio |\n\n\u003e Kokoro prioritizes **quality over speed** by design. It uses the same 82M parameter architecture that powers premium commercial TTS — running it entirely offline on a mobile CPU is genuinely pushing the hardware limits.\n\n---\n\n## Installation\n\n### 🎉 Now Live on Google Play!\n\nVoxSherpa TTS is officially available on the **Google Play Store**. 
No forms, no waitlists — just tap and install.\n\n\u003cdiv align=\"center\"\u003e\n\n\u003ca href=\"https://play.google.com/store/apps/details?id=com.CodeBySonu.VoxSherpa\"\u003e\n  \u003cimg src=\"https://play.google.com/intl/en_us/badges/static/images/badges/en_badge_web_generic.png\" alt=\"Get it on Google Play\" height=\"80\"/\u003e\n\u003c/a\u003e\n\n\u003c/div\u003e\n\n**Requirements:** Android 11+ · ARM64\n\n---\n\n## Changelog\n\n### V2.6 — Media Notification *(Latest)*\n- 🔔 MediaStyle notification with full playback controls\n- 🎚️ Pitch control in System TTS\n- ⚡ Speed control in System TTS\n- Improved performance and stability\n- Bug fixes and optimizations\n- Minor UI improvements\n\n### V2.5 — Stability\n- Bug fixes and stability improvements\n- Improved overall performance\n\n### V2.4 — Bug Fixes\n- Improved System TTS support with better language detection\n- Enhanced UI \u0026 overall app experience\n- Improved compatibility for large screen devices\n- Various bug fixes\n\n### V2.3 — Playback Upgrade\n- Interactive audio seeking\n- New mini player controls\n- Smoother and faster UI performance\n- Fixed cancel generation delay issue\n\n### V2.2 — Core Improvements\n- Regenerate audio on voice change\n- Improved smart punctuation\n- Improved emotion tags\n- Pitch control added\n- Send feedback feature\n- UI/UX improvements\n\n### V1.0 — Foundation\n- Text to Audio\n- Piper (fast models) + Kokoro (high-quality voices)\n- Save audio (.wav) · Favorites support\n- Speed control · Models download · Import Custom Model\n- Chunk-based playback · Smart pause handling\n- System TTS integration · PDF to Audio · TXT to Audio\n\n---\n\n## Model Import (Technical Users)\n\nVoxSherpa supports importing custom `.onnx` models without any server:\n\n1. Place your `.onnx` model + `tokens.txt` on device storage\n2. Open **Models tab** → tap **+** → **Import Local Model**\n3. 
Select your files\n\nWorks with any Sherpa-ONNX-compatible TTS model.\n\n---\n\n## Contributing\n\nVoxSherpa is open source. Contributions are welcome:\n\n- 🐛 Bug reports via [Issues](../../issues)\n- 💡 Feature requests via [Discussions](../../discussions)\n- 🔧 Pull requests for fixes and improvements\n\n---\n\n## License\n\n```\nCopyright (C) 2025 CodeBySonu95\n\nThis program is free software: you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation, either version 3 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the\nGNU General Public License for more details.\n\nhttps://www.gnu.org/licenses/gpl-3.0.html\n```\n\n---\n\n## Acknowledgements\n\n- [k2-fsa/sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) — the inference engine that makes this possible\n- [hexgrad/Kokoro-82M](https://huggingface.co/hexgrad/Kokoro-82M) — the neural model behind studio-quality synthesis\n- [rhasspy/piper](https://github.com/rhasspy/piper) — fast local TTS engine\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Built with obsession. Runs without internet.**\n\n*VoxSherpa — Because your voice deserves to stay yours.*\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebysonu95%2Fvoxsherpa-tts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcodebysonu95%2Fvoxsherpa-tts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcodebysonu95%2Fvoxsherpa-tts/lists"}