{"id":27498292,"url":"https://github.com/ngpepin/tts","last_synced_at":"2025-04-17T08:30:22.149Z","repository":{"id":288117855,"uuid":"966880496","full_name":"ngpepin/TTS","owner":"ngpepin","description":"Convert a markdown file into an audio narration using VCTK and Coqui-TT","archived":false,"fork":false,"pushed_at":"2025-04-15T16:29:43.000Z","size":11,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-15T17:24:10.755Z","etag":null,"topics":["bash","coqui-ai","coqui-tts","docker","python3","tts","voice-synthesis"],"latest_commit_sha":null,"homepage":"https://github.com/ngpepin/TTS","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ngpepin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-15T15:36:38.000Z","updated_at":"2025-04-15T16:33:11.000Z","dependencies_parsed_at":"2025-04-15T17:24:14.769Z","dependency_job_id":"4c8b27bd-70c8-43d2-9467-7f6d31b3af1d","html_url":"https://github.com/ngpepin/TTS","commit_stats":null,"previous_names":["ngpepin/tts"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngpepin%2FTTS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngpepin%2FTTS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngpepin%2FTTS/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ngpepin%2FTTS/manifests","owner_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/owners/ngpepin","download_url":"https://codeload.github.com/ngpepin/TTS/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249326181,"owners_count":21251735,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bash","coqui-ai","coqui-tts","docker","python3","tts","voice-synthesis"],"created_at":"2025-04-17T08:30:21.491Z","updated_at":"2025-04-17T08:30:22.132Z","avatar_url":"https://github.com/ngpepin.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# TTS \n#### Convert a markdown file into an audio narration using VCTK and Coqui-TT\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n\nA Bash script that converts Markdown files into narrated audio using **Text-to-Speech (TTS)** via Docker and the **Coqui-TTS** framework.\n\n## Features\n\n- Converts a Markdown (`.md`) file to **MP3 audio**.\n- Uses the VCTK **multi-speaker TTS** model with customizable voices.\n- Cleans Markdown syntax (headings, links, code blocks, etc.) for more natural-sounding speech.\n- Splits large files into manageable chunks for processing.\n- Optional **playback** of the generated audio.\n\n## Prerequisites\n\n- **Docker** (with GPU support recommended for faster synthesis)\n- **Docker Compose**\n- **Python 3** (for model invocation and Markdown cleaning)\n- **ffmpeg** (for audio processing)\n- **mp3wrap** (for merging audio chunks)\n- **str** (for string manipulation)\n\n## Installation\n\n1. 
Clone the repository:\n\n   ``` bash\n   git clone https://github.com/ngpepin/TTS.git\n   cd TTS\n   ```\n\n2. Make the script executable:\n\n   ``` bash\n   chmod +x narrate-md.sh\n   ```\n\n3. (Optional) Place the script in your `PATH` for global access:\n\n   ``` bash\n   sudo ln -s $(pwd)/narrate-md.sh /usr/local/bin/narrate-md\n   ```\n\n## Usage\n\n``` bash\n./narrate-md.sh [-p] \u003cinput.md\u003e\n```\n\n### Options\n\n``` text\n| Flag | Description                          |\n| ---- | ------------------------------------ |\n| `-p` | Play the generated audio immediately |\n```\n\n### Example\n\n``` bash\n./narrate-md.sh -p README.md  # Converts README.md to README.mp3 and plays it\n```\n\n## Configuration\n\nEdit these variables in the script to customize behavior:\n\n``` bash\nMODEL=\"tts_models/en/vctk/vits\"  # TTS model (default: VCTK multi-speaker)\nSPEAKER=\"p230\"                   # Speaker ID (e.g., p230-p260 for VCTK)\nMAX_LINES_PER_CHUNK=20           # Split large files into chunks of this size\n```\n\n## Technical Details\n\n### Pipeline\n\n1. **Markdown Cleaning**:\n\n   * Strips headings, links, code blocks, and formatting\n\n   * Adds pauses for punctuation (e.g., commas → \",,\")\n\n2. **Audio Generation**:\n\n   * Uses Coqui-TTS in a Docker container\n\n   * Converts text to WAV → MP3 with tempo adjustment\n\n3. **Chunk Handling**:\n\n   * Splits large files (\u003e20 lines by default)\n\n   * Merges chunks into a single MP3\n\n### Supported Models\n\n* **VCTK**: Multi-speaker English (113 speakers, various accents)\n\n* **LJSpeech**: High-quality single-speaker English (via Tacotron2)\n\n## License\n\nMIT License. See [LICENSE](https://LICENSE) for details.\n\n- - -\n\n\u003e **Warning**\\\n\u003e This is a work in progress. 
Error handling is minimal, and results may vary with complex Markdown.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fngpepin%2Ftts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fngpepin%2Ftts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fngpepin%2Ftts/lists"}