{"id":45517860,"url":"https://github.com/elbruno/elbruno.qwentts","last_synced_at":"2026-04-15T18:01:31.411Z","repository":{"id":339880561,"uuid":"1163437974","full_name":"elbruno/ElBruno.QwenTTS","owner":"elbruno","description":"Qwen3-TTS ONNX export pipeline + C# .NET 10 console app for local voice generation","archived":false,"fork":false,"pushed_at":"2026-04-12T19:55:29.000Z","size":2973,"stargazers_count":17,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-12T21:34:22.763Z","etag":null,"topics":["ai","csharp","dotnet","machine-learning","nuget","onnx","onnx-runtime","qwen","qwen3-tts","speech-synthesis","text-to-speech","tts","voice-cloning"],"latest_commit_sha":null,"homepage":null,"language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elbruno.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-21T16:27:42.000Z","updated_at":"2026-04-12T19:55:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/elbruno/ElBruno.QwenTTS","commit_stats":null,"previous_names":["elbruno/elbruno.qwentts"],"tags_count":20,"template":false,"template_full_name":null,"purl":"pkg:github/elbruno/ElBruno.QwenTTS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbruno%2FElBruno.QwenTTS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbruno%2FElBruno.QwenTTS/tags","releases_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbruno%2FElBruno.QwenTTS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbruno%2FElBruno.QwenTTS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elbruno","download_url":"https://codeload.github.com/elbruno/ElBruno.QwenTTS/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbruno%2FElBruno.QwenTTS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31853279,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","csharp","dotnet","machine-learning","nuget","onnx","onnx-runtime","qwen","qwen3-tts","speech-synthesis","text-to-speech","tts","voice-cloning"],"created_at":"2026-02-22T21:24:17.139Z","updated_at":"2026-04-15T18:01:31.385Z","avatar_url":"https://github.com/elbruno.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Qwen3-TTS ONNX Pipeline + C# .NET\n\n[![NuGet](https://img.shields.io/nuget/v/ElBruno.QwenTTS.svg?style=flat-square\u0026logo=nuget)](https://www.nuget.org/packages/ElBruno.QwenTTS)\n[![NuGet 
Downloads](https://img.shields.io/nuget/dt/ElBruno.QwenTTS.svg?style=flat-square\u0026logo=nuget)](https://www.nuget.org/packages/ElBruno.QwenTTS)\n[![Build Status](https://github.com/elbruno/ElBruno.QwenTTS/actions/workflows/publish.yml/badge.svg)](https://github.com/elbruno/ElBruno.QwenTTS/actions/workflows/publish.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=flat-square)](LICENSE)\n[![GitHub stars](https://img.shields.io/github/stars/elbruno/ElBruno.QwenTTS?style=social)](https://github.com/elbruno/ElBruno.QwenTTS)\n[![Twitter Follow](https://img.shields.io/twitter/follow/elbruno?style=social)](https://twitter.com/elbruno)\n\nRun **Qwen3-TTS** text-to-speech locally from C# using ONNX Runtime — no Python needed at inference time. Models are downloaded automatically on first run.\n\nPre-exported ONNX models are hosted on HuggingFace:\n[**elbruno/Qwen3-TTS-12Hz-0.6B-CustomVoice-ONNX**](https://huggingface.co/elbruno/Qwen3-TTS-12Hz-0.6B-CustomVoice-ONNX) (0.6B preset voices) |\n[**elbruno/Qwen3-TTS-12Hz-1.7B-CustomVoice-ONNX**](https://huggingface.co/elbruno/Qwen3-TTS-12Hz-1.7B-CustomVoice-ONNX) (1.7B preset voices + instruct) |\n[**elbruno/Qwen3-TTS-12Hz-0.6B-Base-ONNX**](https://huggingface.co/elbruno/Qwen3-TTS-12Hz-0.6B-Base-ONNX) (voice cloning)\n\n## Features\n\n- **Local TTS Inference** — Run Qwen3-TTS entirely on your machine using ONNX Runtime\n- **Multi-Model Support** — Choose between 0.6B (lightweight) and 1.7B (advanced instruct control) variants\n- **Automatic Model Download** — Models download from HuggingFace on first run (~5.5 GB for 0.6B, ~10 GB for 1.7B)\n- **Instruct Control** — Natural-language style control with 1.7B model (e.g., \"speak with excitement\", \"whisper softly\")\n- **Multi-Speaker** — 9 built-in voices: ryan, serena, vivian, aiden, eric, dylan, uncle_fu, ono_anna, sohee\n- **Voice Cloning** — Clone any voice from a 3-second audio sample ([docs](docs/voice-cloning.md))\n- **Web UI** — Blazor app 
with TTS generation and voice cloning pages ([docs](docs/web-app.md))\n- **GPU Acceleration** — Optional CUDA or DirectML support via SessionOptions injection ([docs](docs/gpu-acceleration.md))\n- **Multi-Language** — English, Spanish, Chinese, Japanese, Korean\n- **Shared Model Cache** — Models stored once in `%LOCALAPPDATA%/ElBruno/QwenTTS`, shared across all apps\n- **24 kHz WAV Output** — High-quality mono audio\n\n---\n\n## Quick Start\n\n### Install via NuGet\n\n```bash\ndotnet add package ElBruno.QwenTTS\n```\n\n### Generate speech in C#\n\n```csharp\nusing ElBruno.QwenTTS.Pipeline;\n\n// 0.6B model (default) — models download automatically (~5.5 GB)\nusing var pipeline = await TtsPipeline.CreateAsync(\"models\");\nawait pipeline.SynthesizeAsync(\"Hello world!\", \"ryan\", \"hello.wav\", \"english\");\n\n// 1.7B model — supports instruct control (~10 GB)\nusing var pipeline17 = await TtsPipeline.CreateAsync(\"models\", variant: QwenModelVariant.Qwen17B);\nawait pipeline17.SynthesizeAsync(\"Hello world!\", \"ryan\", \"hello.wav\", \"english\",\n    instruct: \"speak with warmth and excitement\");\n```\n\n### CLI\n\n```bash\n# Default (0.6B model)\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"Hello, this is a test.\" --speaker ryan --language english --output hello.wav\n\n# 1.7B model with instruct control\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --variant 1.7b --text \"Hello, this is a test.\" --speaker ryan --instruct \"speak with excitement\" --output hello.wav\n```\n\nModels are downloaded automatically if not present in the `--model-dir` directory.\n\n### Voice Cloning\n\nClone any voice from a 3-second audio sample using the `ElBruno.QwenTTS.VoiceCloning` package:\n\n```bash\ndotnet add package ElBruno.QwenTTS.VoiceCloning\n```\n\n```csharp\nusing ElBruno.QwenTTS.VoiceCloning.Pipeline;\n\nvar cloner = await VoiceClonePipeline.CreateAsync();\nawait cloner.SynthesizeAsync(\"Hello world!\", 
\"reference_speaker.wav\", \"output.wav\", \"english\");\n```\n\nSee [docs/voice-cloning.md](docs/voice-cloning.md) for full documentation.\n\n### GPU Acceleration\n\nPass a `sessionOptionsFactory` to use CUDA or DirectML instead of CPU:\n\n```csharp\nusing ElBruno.QwenTTS.Pipeline;\n\n// CUDA (NVIDIA) — requires Microsoft.ML.OnnxRuntime.Gpu NuGet package\nvar ttsCuda = await TtsPipeline.CreateAsync(\n    sessionOptionsFactory: OrtSessionHelper.CreateCudaOptions);\n\n// DirectML (any GPU on Windows) — requires Microsoft.ML.OnnxRuntime.DirectML NuGet package\n// Uses GPU for language model, CPU for vocoder (hybrid mode)\nvar ttsDirectMl = await TtsPipeline.CreateAsync(\n    sessionOptionsFactory: OrtSessionHelper.CreateDirectMlOptions,\n    vocoderSessionOptionsFactory: OrtSessionHelper.CreateCpuOptions);\n```\n\nSee [docs/gpu-acceleration.md](docs/gpu-acceleration.md) for full setup instructions.\n\n## More Examples\n\n```bash\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"Welcome to the future of speech synthesis.\" --speaker serena --output welcome.wav\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"Speaking with excitement and energy!\" --speaker aiden --variant 1.7b --instruct \"speak with excitement\" --output excited.wav\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"A calm and gentle narration.\" --speaker ryan --variant 1.7b --instruct \"speak slowly and calmly\" --output calm.wav\n```\n\n### Spanish Examples\n\n```bash\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"Hola, esta es una prueba de texto a voz.\" --speaker ryan --language spanish --output hola.wav\ndotnet run --project src/ElBruno.QwenTTS -- --model-dir models --text \"Bienvenidos al futuro de la sintesis de voz.\" --speaker serena --language spanish --output bienvenidos.wav\n```\n\n### File Reader (batch audio from text/SRT files)\n\n```bash\ndotnet run --project src/ElBruno.QwenTTS.FileReader -- 
--model-dir models --input samples/hello_demo.txt --speaker ryan --language english --output-dir output/hello\ndotnet run --project src/ElBruno.QwenTTS.FileReader -- --model-dir models --input samples/demo_subtitles.srt --speaker serena --output-dir output/subtitles\n```\n\n### Web App (browser UI)\n\n```bash\ndotnet run --project src/ElBruno.QwenTTS.Web\n```\n\nOpen [http://localhost:5153](http://localhost:5153) — two pages:\n- **🔊 TTS** — type text or upload files, pick a voice, and generate speech\n- **🎭 Voice Clone** — record your voice or upload a WAV, then synthesize with your cloned voice\n\n---\n\n## Documentation\n\n| Document | Description |\n|----------|-------------|\n| [Prerequisites](docs/prerequisites.md) | System requirements (.NET 8+/10, disk space) |\n| [Getting Started](docs/getting-started.md) | Setup, auto-download, and first run |\n| [Core Library](docs/core-library.md) | ElBruno.QwenTTS API reference and usage examples |\n| [CLI Reference](docs/cli-reference.md) | All command options, speakers, and examples |\n| [File Reader](docs/file-reader.md) | Batch audio generation from text and SRT files |\n| [Web App](docs/web-app.md) | Blazor web UI for speech generation |\n| [Architecture](docs/architecture.md) | Pipeline design, model components, project structure |\n| [Exporting Models](docs/exporting-models.md) | Re-exporting ONNX models from PyTorch weights |\n| [Voice Cloning](docs/voice-cloning.md) | Clone any voice from a 3-second reference audio |\n| [GPU Acceleration](docs/gpu-acceleration.md) | CUDA, DirectML, and CPU configuration |\n| [Troubleshooting](docs/troubleshooting.md) | Common issues and fixes |\n| [Detailed Architecture](python/ARCHITECTURE.md) | Full tensor shapes, KV-cache, codebook structure |\n| [Changelog](CHANGELOG.md) | Versioned summary of notable changes |\n\n## Python Tools\n\nThe `python/` directory contains tools for **exporting ONNX models from PyTorch weights** and **downloading models from HuggingFace**. 
These are only needed if you want to re-export or customize models — they are not required for running the C# pipeline.\n\n---\n\n## Building from Source\n\n```bash\ngit clone https://github.com/elbruno/ElBruno.QwenTTS.git\ncd ElBruno.QwenTTS\ndotnet build\ndotnet test\n```\n\n## Requirements\n\n- .NET 8.0 or .NET 10.0 SDK\n- ONNX Runtime compatible platform (Windows, Linux, macOS)\n- ~5.5 GB disk space for model files\n\n---\n\n## Contributing\n\nContributions are welcome! Here's how to get started:\n\n1. **Fork** the repository\n2. **Create a branch** for your feature or fix: `git checkout -b feature/my-feature`\n3. **Make your changes** and ensure the solution builds: `dotnet build`\n4. **Run tests**: `dotnet test`\n5. **Submit a pull request** with a clear description of the changes\n\nPlease open an issue first for major changes or new features to discuss the approach.\n\n---\n\n## Related Projects\n\n- [**ElBruno.PersonaPlex**](https://github.com/elbruno/ElBruno.PersonaPlex) — NVIDIA PersonaPlex-7B full-duplex speech-to-speech for local C# inference via ONNX Runtime. Pre-exported ONNX models: [elbruno/personaplex-7b-v1-onnx](https://huggingface.co/elbruno/personaplex-7b-v1-onnx)\n\n## References\n\n- [Qwen3-TTS GitHub](https://github.com/QwenLM/Qwen3-TTS)\n- [Original model (PyTorch)](https://huggingface.co/Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice)\n- [Pre-exported ONNX models](https://huggingface.co/elbruno/Qwen3-TTS-12Hz-0.6B-CustomVoice-ONNX)\n\n---\n\n## 👋 About the Author\n\nHi! 
I'm **ElBruno** 🧡, a passionate developer and content creator exploring AI, .NET, and modern development practices.\n\n**Made with ❤️ by [ElBruno](https://github.com/elbruno)**\n\nIf you like this project, consider following my work across platforms:\n\n- 📻 **Podcast**: [No Tienen Nombre](https://notienenombre.com) — Spanish-language episodes on AI, development, and tech culture\n- 💻 **Blog**: [ElBruno.com](https://elbruno.com) — Deep dives on embeddings, RAG, .NET, and local AI\n- 📺 **YouTube**: [youtube.com/elbruno](https://www.youtube.com/elbruno) — Demos, tutorials, and live coding\n- 🔗 **LinkedIn**: [@elbruno](https://www.linkedin.com/in/elbruno/) — Professional updates and insights\n- 𝕏 **Twitter**: [@elbruno](https://www.x.com/in/elbruno/) — Quick tips, releases, and tech news\n\n## License\n\nThis project is licensed under the MIT License — see the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felbruno%2Felbruno.qwentts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felbruno%2Felbruno.qwentts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felbruno%2Felbruno.qwentts/lists"}