https://github.com/soobinseo/Tacotron-pytorch
Pytorch implementation of Tacotron
- Host: GitHub
- URL: https://github.com/soobinseo/Tacotron-pytorch
- Owner: soobinseo
- License: apache-2.0
- Created: 2017-11-23T07:03:44.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-11-01T14:58:32.000Z (over 6 years ago)
- Last Synced: 2024-08-04T00:11:32.600Z (8 months ago)
- Topics: pytorch, tacotron, text-to-speech, tts
- Language: Python
- Size: 1.02 MB
- Stars: 205
- Watchers: 9
- Forks: 42
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-pytorch-list-CNVersion - Tacotron-pytorch
- Awesome-pytorch-list - Tacotron-pytorch - Tacotron: Towards End-to-End Speech Synthesis. (Pytorch & related libraries / NLP & Speech Processing)
README
# Tacotron-pytorch
A PyTorch implementation of [Tacotron: Towards End-to-End Speech Synthesis](https://arxiv.org/abs/1703.10135).
## Requirements
* Install Python 3
* Install PyTorch == 0.2.0
* Install requirements:
```
pip install -r requirements.txt
```

## Data
I used the LJSpeech dataset, which consists of pairs of text transcripts and wav files. The complete dataset (13,100 pairs) can be downloaded [here](https://keithito.com/LJ-Speech-Dataset/). I referred to https://github.com/keithito/tacotron for the preprocessing code.
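For orientation, here is a minimal sketch of loading one LJSpeech text/wav pair and turning the audio into a mel spectrogram with librosa. It assumes the standard LJSpeech layout (`metadata.csv` plus a `wavs/` directory); the actual preprocessing in `data.py` and the frame parameters in `hyperparams.py` may differ.

```python
import os
import librosa

DATA_PATH = "/path/to/LJSpeech-1.1"  # assumed extraction directory

# metadata.csv is pipe-delimited: file id | raw text | normalized text
with open(os.path.join(DATA_PATH, "metadata.csv"), encoding="utf-8") as f:
    file_id, _, text = f.readline().strip().split("|")

# LJSpeech audio is recorded at 22.05 kHz
wav, sr = librosa.load(os.path.join(DATA_PATH, "wavs", file_id + ".wav"), sr=22050)

# 80-band mel spectrogram; these frame sizes are common Tacotron-style values,
# not necessarily the ones this repository uses
mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_fft=1024, hop_length=256, n_mels=80)
print(text, mel.shape)
```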
## File description
* `hyperparams.py` includes all hyperparameters that are needed.
* `data.py` loads the training data, converting text to index sequences and wav files to spectrograms. The text preprocessing code is in the `text/` directory.
* `module.py` contains the building-block modules, including CBHG, highway network, prenet, and so on (see the prenet sketch after this list).
* `network.py` contains the networks: encoder, decoder, and post-processing network.
* `train.py` is for training.
* `synthesis.py` is for generating a TTS sample.
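As a rough illustration of the kind of module defined in `module.py`, here is a minimal prenet sketch following the Tacotron paper (two fully connected layers with ReLU and dropout). It is written for a recent PyTorch; the actual implementation in this repository, which targets PyTorch 0.2.0, may differ in detail.

```python
import torch.nn as nn
import torch.nn.functional as F


class Prenet(nn.Module):
    """Two FC layers with ReLU and dropout, as described in the Tacotron paper."""

    def __init__(self, in_dim=256, hidden_dim=256, out_dim=128, dropout=0.5):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, out_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        return x
```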
## Training the network
* STEP 1. Download and extract the LJSpeech data to any directory you want.
* STEP 2. Adjust the hyperparameters in `hyperparams.py`, especially `data_path`, which should point to the directory where you extracted the files; change the others if necessary (see the sketch after this list).
* STEP 3. Run `train.py`.
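The `data_path` adjustment from STEP 2 might look like the following. This is illustrative only: apart from `data_path`, which the README names, the variable names and values are assumptions, not the actual contents of `hyperparams.py`.

```python
# hyperparams.py (illustrative values)
data_path = '/home/user/data/LJSpeech-1.1'  # directory where LJSpeech was extracted
sample_rate = 22050                          # assumed; LJSpeech audio is 22.05 kHz
batch_size = 32                              # assumed default; adjust to GPU memory
```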
## Generate TTS wav file
* STEP 1. Run `synthesis.py`. Make sure to set the restore step (the training checkpoint to load), as sketched below.
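A hedged sketch of what saving and restoring a checkpoint generally involves in PyTorch; the checkpoint file names, the stand-in model, and the loading code are assumptions, not the repository's actual API.

```python
import torch
import torch.nn as nn

# Stand-in model; the real network comes from network.py.
model = nn.Linear(80, 80)

# train.py would save checkpoints along these (hypothetical) lines...
torch.save(model.state_dict(), 'model_step_60000.pth')

# ...and synthesis.py restores the chosen step before generating audio.
model.load_state_dict(torch.load('model_step_60000.pth', map_location='cpu'))
model.eval()  # disable dropout for inference
```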
## Samples
* You can check the generated samples in the `samples/` directory. Training ran for only 60K steps, so the quality is not good yet.
## Reference
* Keith Ito: https://github.com/keithito/tacotron
## Comments
* Any comments on the code are always welcome.