https://github.com/ngpepin/tts
Convert a markdown file into an audio narration using VCTK and Coqui-TT
https://github.com/ngpepin/tts
bash coqui-ai coqui-tts docker python3 tts voice-synthesis
Last synced: 6 months ago
JSON representation
Convert a markdown file into an audio narration using VCTK and Coqui-TT
- Host: GitHub
- URL: https://github.com/ngpepin/tts
- Owner: ngpepin
- Created: 2025-04-15T15:36:38.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-04-15T16:29:43.000Z (6 months ago)
- Last Synced: 2025-04-15T17:24:10.755Z (6 months ago)
- Topics: bash, coqui-ai, coqui-tts, docker, python3, tts, voice-synthesis
- Language: Shell
- Homepage: https://github.com/ngpepin/TTS
- Size: 10.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# TTS
#### Convert a markdown file into an audio narration using VCTK and Coqui-TT[](https://opensource.org/licenses/MIT)
A Bash script that converts Markdown files into narrated audio using **Text-to-Speech (TTS)** via Docker and the **Coqui-TTS** framework.
## Features
- Converts a Markdown (`.md`) file to **MP3 audio**.
- Uses the VCTK **multi-speaker TTS** model with customizable voices.
- Cleans Markdown syntax (headings, links, code blocks, etc.) for more natural-sounding speech.
- Splits large files into manageable chunks for processing.
- Optional **playback** of the generated audio.## Prerequisites
- **Docker** (with GPU support recommended for faster synthesis)
- **Docker Compose**
- **Python 3** (for model invocation and Markdown cleaning)
- **ffmpeg** (for audio processing)
- **mp3wrap** (for merging audio chunks)
- **str** (for string manipulation)## Installation
1. Clone the repository:
``` bash
git clone https://github.com/ngpepin/TTS.git
cd narrate-md2. Make the script executable:
``` bash
chmod +x narrate-md.sh3. (Optional) Place the script in your `PATH` for global access:
``` bash
sudo ln -s $(pwd)/narrate-md.sh /usr/local/bin/narrate-md
```## Usage
``` bash
./narrate-md.sh [-p]
```### Options
``` text
| Flag | Description |
| ---- | ------------------------------------ |
| `-p` | Play the generated audio immediately |
```### Example
``` bash
./narrate-md.sh -p README.md # Converts README.md to README.mp3 and plays it
```## Configuration
Edit these variables in the script to customize behavior:
``` bash
MODEL="tts_models/en/vctk/vits" # TTS model (default: VCTK multi-speaker)
SPEAKER="p230" # Speaker ID (e.g., p230-p260 for VCTK)
MAX_LINES_PER_CHUNK=20 # Split large files into chunks of this size
```## Technical Details
### Pipeline
1. **Markdown Cleaning**:
* Strips headings, links, code blocks, and formatting
* Adds pauses for punctuation (e.g., commas → ",,")
2. **Audio Generation**:
* Uses Coqui-TTS in a Docker container
* Converts text to WAV → MP3 with tempo adjustment
3. **Chunk Handling**:
* Splits large files (>20 lines by default)
* Merges chunks into a single MP3
### Supported Models
* **VCTK**: Multi-speaker English (113 speakers, various accents)
* **LJSpeech**: High-quality single-speaker English (via Tacotron2)
## License
MIT License. See [LICENSE](https://LICENSE) for details.
- - -
> **Warning**\
> This is a work in progress. Error handling is minimal, and results may vary with complex Markdown.