https://github.com/marcusrprojects/audiobook-generator
Generate audiobooks from PDFs, EPUBs, and text files using local Piper TTS models.
https://github.com/marcusrprojects/audiobook-generator
audio audiobooks ebook open-source piper python text-to-speech tts
Last synced: about 1 year ago
JSON representation
Generate audiobooks from PDFs, EPUBs, and text files using local Piper TTS models.
- Host: GitHub
- URL: https://github.com/marcusrprojects/audiobook-generator
- Owner: marcusrprojects
- Created: 2025-04-27T00:07:34.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-04T23:00:06.000Z (about 1 year ago)
- Last Synced: 2025-05-17T06:11:22.517Z (about 1 year ago)
- Topics: audio, audiobooks, ebook, open-source, piper, python, text-to-speech, tts
- Language: Python
- Homepage:
- Size: 40 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# π Audiobook Generator
[](https://github.com/marcusrprojects/audiobook-generator/releases)
[](https://www.python.org/)
Turn any **PDF**, **EPUB**, or **TXT** file into a natural-sounding **audiobook** using **Piper TTS**.
Everything runs **locally** β no internet connection or server required after setup.
---
## π TL;DR Quick Start
For those familiar with Python development:
```bash
# 1. Clone the repository
git clone https://github.com/marcusrprojects/audiobook-generator.git
cd audiobook-generator
# 2. Create & activate virtual environment (macOS/Linux example)
python3 -m venv .venv && source .venv/bin/activate
# 3. Install Piper TTS packages manually first
pip install piper-tts --no-deps
pip install piper-phonemize-cross
# 4. Install the package and its other dependencies
pip install . # Installs audiobook-generator and requirements.txt deps
# 5. Download default English voice models (requires bash/curl)
bash download_voices.sh
# 6. Generate an audiobook!
audiobook-gen path/to/your_book.epub path/to/output/audio.mp3
```
Windows Users: Use PowerShell/CMD for venv activation (see detailed steps below). For `download_voices.sh`, use WSL2, Git Bash, or download models manually.
---
## π Features
- π Supports **PDF**, **EPUB**, and **plain text** formats
- π£οΈ Multiple **Piper TTS** voices (American and British English)
- βΈοΈ Smart pauses after sentences, commas, and paragraphs
- π οΈ Text normalization (expand numbers, abbreviations)
- π΅ Outputs standard **WAV** or **MP3** files
- π·οΈ Add metadata (title, artist) to MP3 files
- π» 100% offline β no server, no data collection
- βοΈ Simple command-line interface
---
## π Installation
Follow these steps to set up the project and its dependencies.
1. **Clone the repository**:
```bash
git clone https://github.com/marcusrprojects/audiobook-generator.git
cd audiobook-generator
```
2. **Create and activate a virtual environment** (recommended):
```bash
# macOS / Linux
python3 -m venv .venv
source .venv/bin/activate
# Windows (Command Prompt)
python -m venv .venv
.venv\Scripts\activate.bat
# Windows (PowerShell)
python -m venv .venv
.venv\Scripts\Activate.ps1
```
3. **Install Piper TTS Packages Manually**:
*These specific packages often need manual installation first due to their dependencies or naming conventions.*
```bash
pip install piper-tts --no-deps
pip install piper-phonemize-cross
```
4. **Install the remaining dependencies**:
```bash
pip install -r requirements.txt
```
5. **Install the `audiobook-generator` package**:
*This step reads `setup.py` and makes the `audiobook-gen` command available in your active environment.*
```bash
pip install .
```
6. **Download voice models**:
*See the "Downloading Voice Models" section below for details. The easiest way is using the provided script (requires `bash` and `curl`):*
```bash
bash download_voices.sh
```
*(If `bash` is unavailable, see manual download instructions below).*
> **Note:** `requirements.txt` includes all non-Piper dependencies. The two Piper packages must be installed manually due to naming differences.
---
## π₯ Downloading Voice Models
This tool uses **Piper TTS models** (ONNX format) for speech synthesis.
You can download free voice models from [HuggingFace Piper Voices](https://huggingface.co/rhasspy/piper-voices).
### Easy method (recommended)
Run the included script:
```bash
bash download_voices.sh
```
### Manual method (example for Joe)
```bash
mkdir -p voices/en_US/joe-medium
cd voices/en_US/joe-medium
curl -L -o en_US-joe-medium.onnx \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx
curl -L -o en_US-joe-medium.onnx.json \
https://huggingface.co/rhasspy/piper-voices/resolve/main/en/en_US/joe/medium/en_US-joe-medium.onnx.json
cd ../../../..
```
Recommended Voices:
| Voice | Path | Notes |
|:-------------------------------|:--------------------------------------------------|:--------------------------------|
| **LibriTTS R (US, medium)** | `voices/en_US/libritts_r-medium/` | Neutral American narrator |
| **Joe (US, medium)** | `voices/en_US/joe-medium/` | Clear American male |
| **Cori (UK, high quality)** | `voices/en_GB/cori-high/` | High-quality British female |
| **Jenny Dioco (UK, medium)** | `voices/en_GB/jenny_dioco-medium/` | Warm British voice |
---
## π Usage (After Installation via `pip install .`)
Ensure your virtual environment is active (`source .venv/bin/activate` or `.venv\Scripts\activate`). You can then use the audiobook-gen command:
```bash
audiobook-gen input_file.pdf output_file.mp3
```
By default, the **LibriTTS R (US, medium)** voice is used. To specify a different model:
*(Use the path to the desired `.onnx` model file)*
```bash
audiobook-gen input_file.epub output_file.wav \
--model voices/en_GB/cori-high/en_GB-cori-high.onnx
```
Specify input format (EPUB) and output format (WAV):
```bash
audiobook-gen path/to/novel.epub final_audio.wav
```
Add metadata (MP3 only):
```bash
audiobook-gen book.txt audiobook.mp3 \
--title "My Audiobook" --artist "Author Name"
```
List available bundled voice models:
*(Lists voices defined in the script's `AVAILABLE_VOICES` list)*
```bash
audiobook-gen --list-models
```
Show help message:
*(Displays all available command-line options)*
```bash
audiobook-gen --help
```
---
## π Supported Formats
- `.pdf` (scanned PDFs with selectable text)
- `.epub` (standard e-book format)
- `.txt` (plain text files, UTF-8 encoding preferred)
---
## π§ How It Works
- Extracts and normalizes text from your document
- Splits text intelligently by sentences and commas
- Synthesizes natural-sounding speech with pauses
- Outputs ready-to-listen WAV or MP3 files
---
## π Requirements
- Python 3.8 or higher
- ONNX Runtime
- Piper TTS models
- Listed in [`requirements.txt`](requirements.txt) (non-Piper dependencies)
- Manual install for `piper-tts` and `piper-phonemize-cross`
---
## π¦ Creating a Standalone Executable (Optional Alternative)
If you want to distribute this tool to users who might not have Python installed, you can bundle it into a single executable file using [PyInstaller](https://pyinstaller.org/). This is typically done by the developer for distribution.
**Steps to Create the Executable:**
1. Make sure you have followed the Installation steps 1-4 (you need the dependencies installed).
2. Install PyInstaller in your virtual environment:
```bash
pip install pyinstaller
```
3. Run PyInstaller from the project's root directory (`audiobook-generator/`):
```bash
pyinstaller --onefile audiobook_generator.py
```
*This process can take a significant amount of time and requires substantial disk space. It analyzes dependencies and bundles everything. The final executable will be placed inside a new `dist/` folder.*
**Running the Standalone Executable:**
Once built, the executable in the `dist` folder can be run directly without needing Python or the virtual environment.
```bash
# Example on macOS/Linux:
cd dist
./audiobook_generator ../path/to/book.pdf output.mp3 --model ../voices/en_US/joe-medium/en_US-joe-medium.onnx
# Example on Windows:
cd dist
.\audiobook_generator.exe ..\path\to\book.pdf output.mp3 --model ..\voices\en_US\joe-medium\en_US-joe-medium.onnx
```
*Note: Paths to input files and models are relative to where you run the command.*
---
## β¨ Future Plans
- Batch conversion
- Desktop GUI version (Tauri or Electron)
- Additional language support
---
## π© Contact
Open an issue on [GitHub](https://github.com/marcusrprojects/audiobook-generator/issues).
---
## π Let's turn your books into audiobooks!