https://github.com/kyubyong/deepvoice3

Tensorflow Implementation of Deep Voice 3

Host: GitHub
URL: https://github.com/kyubyong/deepvoice3
Owner: Kyubyong
Created: 2017-10-25T00:00:32.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-03-14T13:53:43.000Z (over 7 years ago)
Last Synced: 2025-03-30T11:07:14.467Z (4 months ago)
Language: Python
Homepage:
Size: 77.1 KB
Stars: 452
Watchers: 51
Forks: 113
Open Issues: 21
Metadata Files:
- Readme: README.md

# Deep Voice 3

## **Work In Progress**
To check the current status, see [issue #9](https://github.com/Kyubyong/deepvoice3/issues/9).

This is a TensorFlow implementation of [DEEP VOICE 3: 2000-SPEAKER NEURAL TEXT-TO-SPEECH](https://arxiv.org/pdf/1710.07654.pdf). For now I'm focusing on single-speaker synthesis.

### Data

I'm experimenting with [Nick Offerman's audiobook files](https://www.audible.com/pd/Fiction/The-Adventures-of-Tom-Sawyer-Audiobook/B01HQMQLWK?source_code=AUDORWS0628169HI5use) for fun, and with [The LJ Speech Dataset](https://keithito.com/LJ-Speech-Dataset), which is in the public domain.

## File Description

* hyperparams.py: hyperparameters.
* prepro.py: creates inputs and targets, i.e., mel spectrograms, magnitude spectrograms, and dones.
* data_load.py
* utils.py: several custom operational functions.
* modules.py: building blocks for the networks.
* networks.py: encoder, decoder, and converter.
* train.py: training.
* synthesize.py: inference.
* test_sents.txt: some test sentences from the paper.
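
To give a rough idea of the three targets listed for prepro.py, here is a minimal NumPy-only sketch. It is not the code in this repo: the function names (`make_features`, `mel_filterbank`), the default values (`sr=22050`, `n_fft=1024`, `hop=256`, `n_mels=80`), and the convention that `dones` is 1 only on the last frame are all assumptions made for illustration.

```python
import numpy as np


def hz_to_mel(f):
    # Common HTK-style mel formula (an assumption; the repo may differ).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, 1 + n_fft // 2)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, 1 + n_fft // 2)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        left, center, right = hz_points[i:i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def make_features(wav, sr=22050, n_fft=1024, hop=256, n_mels=80):
    """Return (mel, mag, dones) for a 1-D waveform with len(wav) >= n_fft."""
    # Slice the waveform into overlapping frames and apply a Hann window.
    n_frames = 1 + (len(wav) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = wav[idx] * np.hanning(n_fft)
    # Magnitude spectrogram, shape (n_frames, 1 + n_fft // 2).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Mel spectrogram, shape (n_frames, n_mels).
    mel = mag @ mel_filterbank(sr, n_fft, n_mels).T
    # Hypothetical "done" targets: 0 everywhere except the final frame.
    dones = np.zeros(n_frames, dtype=np.int32)
    dones[-1] = 1
    return mel, mag, dones
```

For a one-second waveform at 22050 Hz, `make_features(wav)` returns arrays with one row per frame: 80 mel bins, 513 linear-frequency bins, and a single done flag per frame.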

## Papers that referenced this repo

* [Fitting New Speakers Based on a Short Untranscribed Sample](https://arxiv.org/abs/1802.06984)
