{"id":25235751,"url":"https://github.com/willwade/convert2applepvoice","last_synced_at":"2025-06-25T23:37:51.878Z","repository":{"id":276692029,"uuid":"929725511","full_name":"willwade/Convert2ApplePVoice","owner":"willwade","description":"Convert/Make a TTS voice with Apple Personal voice from another Voice","archived":false,"fork":false,"pushed_at":"2025-02-10T01:04:09.000Z","size":59,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-05T18:13:00.366Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/willwade.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-09T08:36:54.000Z","updated_at":"2025-02-10T01:04:13.000Z","dependencies_parsed_at":"2025-02-09T22:34:21.906Z","dependency_job_id":null,"html_url":"https://github.com/willwade/Convert2ApplePVoice","commit_stats":null,"previous_names":["willwade/convert2applepvoice"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/willwade/Convert2ApplePVoice","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willwade%2FConvert2ApplePVoice","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willwade%2FConvert2ApplePVoice/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willwade%2FConvert2ApplePVoice/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willwade%2FConvert2ApplePVoice/manifes
ts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/willwade","download_url":"https://codeload.github.com/willwade/Convert2ApplePVoice/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willwade%2FConvert2ApplePVoice/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261973305,"owners_count":23238548,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-11T14:59:22.459Z","updated_at":"2025-06-25T23:37:51.821Z","avatar_url":"https://github.com/willwade.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Convert2ApplePVoice\n\n\n**Warning: I'm not going to pretend here - if you use this, it may break licence terms with TTS systems. Not with Apple, but with any other provider. So why have we made this? Because we had to. Simply put, if a person voice banked with one provider and that provider doesn't provide an Apple Synth Engine (most of them don't, apart from CereProc), then the client is forced to use software they can't access in order to use their own voice. This way we make an Apple Personal Voice, and all AAC apps can then play this voice. Well, that's the idea. Your mileage may vary.**\n\n\nA macOS automation tool that facilitates the creation of an Apple Personal Voice using TTS output from another system. 
This tool automates the Personal Voice training process by extracting text via OCR and playing it back using TTS.\n\n## Features\n\n- OCR-based text extraction using Apple's Vision framework\n- Extensive TTS engine support (local and cloud-based)\n- Works with Personal Voice's Continuous Recording mode\n- Audio routing to system input via virtual audio device\n- Privacy-focused: runs entirely on-device with local TTS engines (cloud engines send text to their provider)\n- Configurable OCR region and TTS settings\n\n## Requirements\n\n- macOS (tested on macOS Sonoma)\n- Python 3.10+\n- Screen Recording permission for OCR functionality\n- BlackHole 2ch or similar virtual audio device\n- For eSpeak support: `brew install espeak-ng`\n- For cloud-based TTS: valid API credentials\n\n## Installation\n\n1. Install BlackHole for audio routing:\n```bash\nbrew install blackhole-2ch\n```\n\n2. Set up audio routing:\n   - Open System Settings \u003e Sound\n   - Under Input, select \"BlackHole 2ch\"\n   - Under Output, select your regular speakers\n\n3. Clone this repository:\n```bash\ngit clone https://github.com/willwade/Convert2ApplePVoice.git\ncd Convert2ApplePVoice\n```\n\n4. Create and activate a virtual environment using uv:\n```bash\nuv venv\nsource .venv/bin/activate\n```\n\n5. Install the package in development mode:\n```bash\nuv pip install -e .\n```\n\n6. Grant necessary permissions:\n   - Open System Settings \u003e Privacy \u0026 Security \u003e Screen Recording\n   - Enable permissions for your terminal application\n\n## Audio Setup\n\nThe tool needs to route TTS audio to Personal Voice's input. Here's how to set it up:\n\n1. **Install Virtual Audio Device**:\n   ```bash\n   brew install blackhole-2ch\n   ```\n\n2. **Configure System Audio**:\n   - Open System Settings \u003e Sound\n   - Set Input to \"BlackHole 2ch\"\n   - Set Output to your regular speakers\n\n3. 
**Optional: Audio Monitoring**\n   To hear the TTS output while it's being recorded:\n   - Install Audio MIDI Setup (if not already installed)\n   - Create a Multi-Output Device:\n     1. Open Audio MIDI Setup\n     2. Click the + button \u003e Create Multi-Output Device\n     3. Check both your speakers and \"BlackHole 2ch\"\n     4. Use this as your system output to hear the TTS\n\n### (Optional) Switching Audio Input Devices\n\nYou can programmatically switch between audio input devices using the `switchaudio-osx` command-line tool:\n\n1. Install the tool:\n```bash\nbrew install switchaudio-osx\n```\n\n2. Switch to BlackHole:\n```bash\nSwitchAudioSource -s \"BlackHole 2ch\" -t input\n```\n\n3. Switch back to built-in microphone:\n```bash\nSwitchAudioSource -s \"MacBook Air Microphone\" -t input\n```\n\nYou can also list all available audio devices:\n```bash\nSwitchAudioSource -a\n```\n\n## Usage\n\n1. Open Personal Voice setup in System Settings\n2. Enable \"Continuous Recording\" mode in Personal Voice\n3. Run the automation script:\n```bash\nPYTHONPATH=src uv run -m convert2applevoice\n```\n\n4. 
The script will:\n   - Continuously monitor the screen for new phrases\n   - Automatically speak each phrase using the configured TTS engine\n   - Move to the next phrase when speech is detected\n\nPress Ctrl+C to stop the automation.\n\n## Configuration\n\nThe configuration is split into two files in `~/.config/convert2applevoice/`:\n\n### Main Configuration (config.json)\n\n```json\n{\n    \"ocr_region_x\": 100,\n    \"ocr_region_y\": 300,\n    \"ocr_region_width\": 800,\n    \"ocr_region_height\": 100,\n    \"tts_engine\": \"macos\",\n    \"tts_voice\": null,\n    \"tts_rate\": 175,\n    \"tts_volume\": 1.0,\n    \"tts_pitch\": 1.0,\n    \"tts_extra_options\": {},\n    \"ocr_interval\": 0.2,\n    \"retry_delay\": 0.5\n}\n```\n\n### Credentials Configuration (credentials.json)\n\n```json\n{\n    \"aws_key_id\": \"your_aws_key\",\n    \"aws_secret_key\": \"your_aws_secret\",\n    \"aws_region\": \"us-east-1\",\n    \"azure_key\": \"your_azure_key\",\n    \"azure_region\": \"eastus\",\n    \"watson_api_key\": \"your_watson_key\",\n    \"watson_url\": \"your_watson_url\",\n    \"elevenlabs_api_key\": \"your_elevenlabs_key\"\n}\n```\n\n## Supported TTS Engines\n\nThe tool supports multiple TTS engines through py3-tts-wrapper:\n\n### Local Engines\n- `macos`: Built-in macOS TTS (default)\n- `espeak`: Open-source speech synthesizer\n\n### Cloud-based Engines (requires credentials)\n- `polly`: Amazon AWS Polly\n- `azure`: Microsoft Azure TTS\n- `watson`: IBM Watson TTS\n- `elevenlabs`: ElevenLabs TTS\n\n### Engine Features\n\n| Engine | Online/Offline | SSML | Rate/Volume/Pitch | Word Events |\n|--------|---------------|------|-------------------|-------------|\n| macos | Offline | Yes | Yes | No |\n| espeak | Offline | Yes | Yes | Yes |\n| polly | Online | Yes | Yes | Yes |\n| azure | Online | Yes | Yes | Yes |\n| watson | Online | Yes | No | Yes |\n| elevenlabs | Online | No | Yes | Yes |\n\n### Selecting an Engine\n\nTo use a specific engine, set `tts_engine` in your 
config.json to one of:\n- `\"macos\"` (default)\n- `\"espeak\"`\n- `\"polly\"`\n- `\"azure\"`\n- `\"watson\"`\n- `\"elevenlabs\"`\n\nFor cloud-based engines, make sure to add your credentials to credentials.json.\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the MIT License - see the LICENSE file for details.\n\n## Acknowledgments\n\n- Apple Vision framework for OCR capabilities\n- py3-tts-wrapper for TTS engine support\n- Various TTS service providers\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwillwade%2Fconvert2applepvoice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwillwade%2Fconvert2applepvoice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwillwade%2Fconvert2applepvoice/lists"}