https://github.com/seungwonpark/awesome-tts-samples

Awesome list of TTS papers with audio samples

Host: GitHub
URL: https://github.com/seungwonpark/awesome-tts-samples
Owner: seungwonpark
License: cc0-1.0
Archived: true
Created: 2020-05-26T10:40:19.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2020-08-18T08:09:23.000Z (over 5 years ago)
Last Synced: 2025-04-20T05:09:17.203Z (8 months ago)
Topics: awesome, tts
Homepage:
Size: 11.7 KB
Stars: 60
Watchers: 8
Forks: 5
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

awesome-ai-list-guide - awesome-tts-samples

README

# awesome-tts-samples

List of TTS papers **with audio samples** provided by the authors. The last row under each paper names the spectrogram inversion method (vocoder) used.

For a more comprehensive list of important TTS papers, I recommend reading [xcmyz/speech-synthesis-paper](https://github.com/xcmyz/speech-synthesis-paper) written by Zhengxi Liu.

## 2020

- [FastPitch](https://arxiv.org/abs/2006.06873) - FastPitch: Parallel Text-to-speech with Pitch Prediction
  - https://fastpitch.github.io/
  - WaveGlow
- [EATS](https://arxiv.org/abs/2006.03575) - End-to-End Adversarial Text-to-Speech
  - https://deepmind.com/research/publications/End-to-End-Adversarial-Text-to-Speech
  - End-to-end model
- [Glow-TTS](https://arxiv.org/abs/2005.11129) - Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
  - https://jaywalnut310.github.io/glow-tts-demo
  - WaveGlow
- [Flowtron](https://arxiv.org/abs/2005.05957) - Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
  - https://nv-adlr.github.io/Flowtron
  - WaveGlow

## 2019

- [Tacotron2+DCA](https://arxiv.org/abs/1910.10288) - Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
  - https://google.github.io/tacotron/publications/location_relative_attention
  - WaveRNN
- [GAN-TTS](https://openreview.net/forum?id=r1gfQgSFDr) - High Fidelity Speech Synthesis with Adversarial Networks
  - https://storage.googleapis.com/deepmind-media/research/abstract.wav
  - End-to-end model (Built on top of 200Hz linguistic & log pitch features)
- [Multi-lingual Tacotron2](https://arxiv.org/abs/1907.04448) - Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
  - https://google.github.io/tacotron/publications/multilingual
  - WaveRNN
- [MelNet](https://arxiv.org/abs/1906.01083) - MelNet: A Generative Model for Audio in the Frequency Domain
  - https://audio-samples.github.io
  - https://sjvasquez.github.io/blog/melnet
  - [Gradient-based spectrogram inversion](https://gist.github.com/carlthome/a4a8bf0f587da738c459d0d5a55695cd)
- [FastSpeech](https://arxiv.org/abs/1905.09263) - FastSpeech: Fast, Robust and Controllable Text to Speech
  - https://speechresearch.github.io/fastspeech
  - WaveGlow
- [ParaNet](https://arxiv.org/abs/1905.08459) - Parallel Neural Text-to-Speech
  - https://parallel-neural-tts-demo.github.io
  - WaveVAE, ClariNet, WaveNet

## 2018

- [Transformer-TTS](https://arxiv.org/abs/1809.08895) - Neural Speech Synthesis with Transformer Network
  - https://neuraltts.github.io/transformertts
  - WaveNet
- [Multi-speaker Tacotron2](https://arxiv.org/abs/1806.04558) - Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
  - https://google.github.io/tacotron/publications/speaker_adaptation
  - WaveNet
- [Tacotron2+GST](https://arxiv.org/abs/1803.09017) - Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
  - https://google.github.io/tacotron/publications/global_style_tokens
  - Griffin-Lim

## 2017

- [Tacotron2](https://arxiv.org/abs/1712.05884) - Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
  - https://google.github.io/tacotron/publications/tacotron2
  - WaveNet
- [Tacotron](https://arxiv.org/abs/1703.10135) - Tacotron: Towards End-to-End Speech Synthesis
  - https://google.github.io/tacotron/publications/tacotron
  - Griffin-Lim

# Contributing

TODO
