{"id":49028899,"url":"https://github.com/PowerBeef/QwenVoice","last_synced_at":"2026-05-01T03:01:01.488Z","repository":{"id":339845911,"uuid":"1163563766","full_name":"PowerBeef/QwenVoice","owner":"PowerBeef","description":"Vocello is a native macOS app for offline Qwen3-TTS on Apple Silicon, with Custom Voice, Voice Design, and Voice Cloning.","archived":false,"fork":false,"pushed_at":"2026-04-29T21:03:47.000Z","size":14314,"stargazers_count":251,"open_issues_count":20,"forks_count":28,"subscribers_count":4,"default_branch":"main","last_synced_at":"2026-04-29T23:12:48.452Z","etag":null,"topics":["apple-silicon","ios","iphone","macos","mlx","offline","qwen","qwenvoice","swiftui","text-to-speech","tts","vocello","voice-cloning","voice-synthesis"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PowerBeef.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-02-21T20:22:38.000Z","updated_at":"2026-04-29T22:00:40.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/PowerBeef/QwenVoice","commit_stats":null,"previous_names":["powerbeef/qwenvoice"],"tags_count":17,"template":false,"template_full_name":null,"purl":"pkg:github/PowerBeef/QwenVoice","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerBeef%2FQwenVoice","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerBeef%2FQwenVoice/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerBeef%2FQwenVoice/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerBeef%2FQwenVoice/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PowerBeef","download_url":"https://codeload.github.com/PowerBeef/QwenVoice/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerBeef%2FQwenVoice/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32483406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apple-silicon","ios","iphone","macos","mlx","offline","qwen","qwenvoice","swiftui","text-to-speech","tts","vocello","voice-cloning","voice-synthesis"],"created_at":"2026-04-19T09:00:36.930Z","updated_at":"2026-05-01T03:01:01.481Z","avatar_url":"https://github.com/PowerBeef.png","language":"Swift","funding_links":[],"categories":["LLM \u0026 Inference","Swift"],"sub_categories":[],"readme":"## Screenshots\n\n\u003cimg width=\"1868\" height=\"1676\" alt=\"QwenVoice screenshot\" src=\"https://github.com/user-attachments/assets/311ea30b-9196-4f36-93f4-5db439c5a2ba\" /\u003e\n\n## Overview\n\nQwenVoice is the public repository for an offline Apple-platform Qwen3-TTS app with Custom Voice, Voice Design, and Voice Cloning.\n\nThe currently shipped public release is **QwenVoice v1.2.3** for macOS. The next macOS release ships under the **Vocello** app name while this repository, shared core, and many internal modules keep the QwenVoice identity for continuity.\n\n## Version Matrix\n\n| Surface | Public name | Artifact | Minimum OS / tools | Runtime | Status |\n|---|---|---|---|---|---|\n| Shipped release | QwenVoice v1.2.3 | Assets attached to the v1.2.3 GitHub Release | See the v1.2.3 release notes | SwiftUI app with the shipped legacy runtime | Current public download |\n| Current `main` | QwenVoice repo / Vocello app | Local builds produce `Vocello.app` | macOS 26.0+, iOS 26.0+, Xcode 26.0 | Native Swift/MLX shared core with macOS XPC isolation and iPhone extension isolation | Active development |\n| Next macOS release | Vocello | `Vocello-macos26.dmg` | macOS 26.0+ | Native Swift/MLX macOS runtime hosted out of process | Current release target |\n| iPhone track | Vocello for iPhone | App Store / TestFlight only | iOS 26.0+, iPhone 15 Pro minimum target | 4-bit Speed variants in an engine extension | Maintained but deferred |\n\nThe public landing page remains QwenVoice-led until the Vocello-branded macOS release ships. iPhone is in active development, but it is not a public release surface for the current macOS-first milestone.\n\n## Shipped Modes\n\n### Custom Voice\n\nGenerate speech with the app's built-in English speakers:\n\n- Ryan\n- Aiden\n- Serena\n- Vivian\n\n### Voice Design\n\nVoice Design is a standalone destination. Describe the voice you want, then shape tone before generating.\n\n### Voice Cloning\n\nClone a voice from a short reference clip. The app accepts WAV, MP3, AIFF, M4A, FLAC, and OGG input and can also use an optional transcript for better cloning accuracy. Only clone voices you own or have permission to use.\n\n## What the App Does Not Expose\n\n- no temperature or max-token controls\n- no streaming batch UI\n\nSingle-generation flows use live streaming preview and sidebar playback. Batch generation remains sequential and final-file-based.\n\n## Features\n\n- Native model downloads from Hugging Face\n- Live streaming preview for single generations\n- Local generation history stored in SQLite via GRDB\n- Batch generation for multi-line jobs\n- Sidebar waveform playback UI\n- Configurable output directory and autoplay preference\n- macOS XPC process isolation for native generation on current `main`\n- iPhone engine-extension isolation for the deferred iPhone track\n\n## Requirements\n\n### Current `main`\n\n| Requirement | Detail |\n|---|---|\n| macOS | 26.0+ |\n| iOS | 26.0+ for the maintained iPhone targets |\n| Chip | Apple Silicon |\n| RAM | 8 GB+ on macOS; iPhone 15 Pro is the stated iPhone floor |\n| Tools | Xcode 26.0 and XcodeGen |\n\n### Shipped QwenVoice v1.2.3\n\nThe shipped v1.2.3 build predates the current native `main` release track. Use the v1.2.3 GitHub Release notes and attached assets as the source of truth for that historical build.\n\n## Install from GitHub Releases\n\nDownload the current public release from [Releases](https://github.com/PowerBeef/QwenVoice/releases).\n\nFor the next macOS release, the public artifact is expected to be:\n\n- `Vocello-macos26.dmg`\n\nThen:\n\n1. Open the DMG.\n2. Drag `Vocello.app` to `/Applications`.\n3. Open the app, go to **Models**, download a model, and generate speech.\n\n## Models\n\nStatic model metadata comes from [`Sources/Resources/qwenvoice_contract.json`](Sources/Resources/qwenvoice_contract.json).\n\n| Mode | 8-bit Quality folder | 4-bit Speed folder | Hugging Face repos |\n|---|---|---|---|\n| Custom Voice | `Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit` | `Qwen3-TTS-12Hz-1.7B-CustomVoice-4bit` | `mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-*` |\n| Voice Design | `Qwen3-TTS-12Hz-1.7B-VoiceDesign-8bit` | `Qwen3-TTS-12Hz-1.7B-VoiceDesign-4bit` | `mlx-community/Qwen3-TTS-12Hz-1.7B-VoiceDesign-*` |\n| Voice Cloning | `Qwen3-TTS-12Hz-1.7B-Base-8bit` | `Qwen3-TTS-12Hz-1.7B-Base-4bit` | `mlx-community/Qwen3-TTS-12Hz-1.7B-Base-*` |\n\nmacOS can expose 8-bit Quality where runtime admission allows it and uses 4-bit Speed for constrained hardware. iPhone uses the 4-bit Speed variants only.\n\n## Building from Source\n\nSource-build prerequisites for current `main`:\n\n- macOS 26.0+\n- Apple Silicon\n- Xcode 26.0\n- XcodeGen\n\n```sh\ngit clone https://github.com/PowerBeef/QwenVoice.git\ncd QwenVoice\n./scripts/regenerate_project.sh\nopen QwenVoice.xcodeproj\n```\n\nBuild the `QwenVoice` scheme from Xcode, or use:\n\n```sh\nxcodebuild -project QwenVoice.xcodeproj -scheme QwenVoice build\n```\n\nUseful local checks:\n\n```sh\n./scripts/check_project_inputs.sh\npython3 scripts/harness.py validate\npython3 scripts/harness.py test --layer contract\npython3 scripts/harness.py test --layer swift\npython3 scripts/harness.py test --layer native\n./scripts/build_foundation_targets.sh macos\n./scripts/build_foundation_targets.sh ios\n```\n\nThe harness stores isolated build products and `.xcresult` bundles under `build/harness/`. The macOS UI smoke lane is available with `python3 scripts/harness.py test --layer e2e`; for release signoff on a controlled Mac, run it strictly with `QWENVOICE_E2E_STRICT=1`.\n\nBenchmarks are opt-in:\n\n```sh\npython3 scripts/harness.py bench --category latency --runs 3\npython3 scripts/harness.py bench --category load --runs 3\n```\n\n## Local Release Packaging\n\nFor a local unsigned macOS release build and DMG:\n\n```sh\n./scripts/release.sh --preflight full\n./scripts/verify_release_bundle.sh build/Vocello.app\n./scripts/verify_packaged_dmg.sh build/Vocello-macos26.dmg build/release-metadata.txt\n```\n\n## Tone and Emotion Control\n\nCustom Voice and Voice Design are guided by natural-language instructions rather than SSML-style sliders or markup.\n\nSee [`docs/qwen_tone.md`](docs/qwen_tone.md) for app-oriented guidance on tone and prompt writing.\n\n## Architecture\n\nCurrent `main` uses a native Apple-platform architecture:\n\n- `Sources/` contains the macOS app shell, shared app models/services/views, and the shipping Mac target.\n- `Sources/QwenVoiceCore/` contains shared Apple-platform runtime semantics, contract types, model variants, and iOS extension transport.\n- `Sources/QwenVoiceNative/` contains the macOS app-facing engine proxy/store/client layer.\n- `Sources/QwenVoiceEngineSupport/` contains shared macOS engine IPC and transport types.\n- `Sources/QwenVoiceEngineService/` contains the bundled macOS XPC helper.\n- `Sources/QwenVoiceNativeRuntime/` remains as retained compatibility and regression coverage.\n- `Sources/iOS/`, `Sources/iOSSupport/`, and `Sources/iOSEngineExtension/` contain the deferred iPhone app, support layer, and isolated engine extension.\n- `Sources/SharedSupport/` contains shared playback and generation-persistence surfaces.\n\nThe current codebase does not maintain a repo-owned Python backend, Python setup path, or standalone CLI surface.\n\nDefault macOS runtime data layout:\n\n```text\n~/Library/Application Support/QwenVoice/\n  models/\n  outputs/\n    CustomVoice/\n    VoiceDesign/\n    Clones/\n  voices/\n  history.sqlite\n```\n\nSee [`docs/reference/privacy-storage.md`](docs/reference/privacy-storage.md) for local storage, privacy, and deletion details.\n\n## More Docs\n\n- [`docs/README.md`](docs/README.md) - documentation index\n- [`docs/reference/current-state.md`](docs/reference/current-state.md) - current repo facts\n- [`docs/reference/engineering-status.md`](docs/reference/engineering-status.md) - current strengths and caveats\n- [`docs/reference/release-readiness.md`](docs/reference/release-readiness.md) - macOS-first release policy and signoff gates\n- [`docs/reference/privacy-storage.md`](docs/reference/privacy-storage.md) - local storage, privacy, and deletion details\n- [`CONTRIBUTING.md`](CONTRIBUTING.md) - contributor workflow\n\n## License\n\nQwenVoice is available under the [MIT License](LICENSE).\n\n## Credits\n\nQwenVoice builds on:\n\n- [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS)\n- [mlx-audio](https://github.com/Blaizzy/mlx-audio)\n- [MLX](https://github.com/ml-explore/mlx)\n- [GRDB.swift](https://github.com/groue/GRDB.swift)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPowerBeef%2FQwenVoice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FPowerBeef%2FQwenVoice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FPowerBeef%2FQwenVoice/lists"}