https://github.com/kyubyong/deepvoice3
Tensorflow Implementation of Deep Voice 3
- Host: GitHub
- URL: https://github.com/kyubyong/deepvoice3
- Owner: Kyubyong
- Created: 2017-10-25T00:00:32.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2018-03-14T13:53:43.000Z (about 7 years ago)
- Last Synced: 2025-03-30T11:07:14.467Z (27 days ago)
- Language: Python
- Homepage:
- Size: 77.1 KB
- Stars: 452
- Watchers: 51
- Forks: 113
- Open Issues: 21
Metadata Files:
- Readme: README.md
README
# Deep Voice 3
## **Work In Progress**
To check the current status, see [this](https://github.com/Kyubyong/deepvoice3/issues/9).

This is a TensorFlow implementation of [DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH](https://arxiv.org/pdf/1710.07654.pdf). For now I'm focusing on single-speaker synthesis.
### Data
I'm experimenting with [Nick Offerman's audiobook files](https://www.audible.com/pd/Fiction/The-Adventures-of-Tom-Sawyer-Audiobook/B01HQMQLWK?source_code=AUDORWS0628169HI5use) for fun and with [The LJ Speech Dataset](https://keithito.com/LJ-Speech-Dataset), which is in the public domain.
## File Description
* hyperparams.py: hyper parameters
* prepro.py: creates inputs and targets, i.e., mel spectrogram, magnitude, and dones.
* data_load.py
* utils.py: several custom operational functions.
* modules.py: building blocks for the networks.
* networks.py: encoder, decoder, and converter
* train.py: train
* synthesize.py: inference
* test_sents.txt: some test sentences in the paper.
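
For readers unfamiliar with the three targets listed for `prepro.py`, here is a rough NumPy-only sketch of what a mel spectrogram, a magnitude spectrogram, and done flags could look like for one utterance. It is not the code in this repo: the function names, the parameter values (`sr`, `n_fft`, `hop`, `n_mels`), and the done-flag convention (0 everywhere except the final frame) are all assumptions made for illustration, and the actual settings live in `hyperparams.py`.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters with shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def make_targets(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    """Return (mel, mag, dones) for a 1-D waveform with len(wav) >= n_fft.

    All default values are illustrative assumptions, not the repo's settings.
    """
    # Slice the waveform into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(wav) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wav[idx] * np.hanning(n_fft)
    # Magnitude spectrogram: (n_frames, n_fft // 2 + 1).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Mel spectrogram: (n_frames, n_mels).
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T
    # Done flags (assumed convention): 1 marks the last frame, 0 otherwise.
    dones = np.zeros(n_frames, dtype=np.int32)
    dones[-1] = 1
    return mel, mag, dones

if __name__ == "__main__":
    # One second of a 440 Hz tone as a stand-in for real speech.
    t = np.arange(22050) / 22050.0
    mel, mag, dones = make_targets(np.sin(2 * np.pi * 440.0 * t))
    print(mel.shape, mag.shape, dones.shape)
```

In a sketch like this, the mel frames would serve as decoder targets, the magnitude frames as converter targets, and the done flags as the signal for when synthesis should stop.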
## Papers that referenced this repo
* [Fitting New Speakers Based on a Short Untranscribed Sample](https://arxiv.org/abs/1802.06984)