Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/seungwonpark/awesome-tts-samples
Awesome list of TTS papers with audio samples
List: awesome-tts-samples
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/seungwonpark/awesome-tts-samples
- Owner: seungwonpark
- License: cc0-1.0
- Archived: true
- Created: 2020-05-26T10:40:19.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2020-08-18T08:09:23.000Z (about 4 years ago)
- Last Synced: 2024-05-23T06:08:18.258Z (6 months ago)
- Topics: awesome, tts
- Homepage:
- Size: 11.7 KB
- Stars: 58
- Watchers: 9
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ai-list-guide - awesome-tts-samples
README
# awesome-tts-samples
List of TTS papers **with audio samples** provided by the authors. The last line of each entry names the spectrogram inversion method (vocoder) used.
For a more comprehensive list of important TTS papers, I recommend reading [xcmyz/speech-synthesis-paper](https://github.com/xcmyz/speech-synthesis-paper) written by Zhengxi Liu.
## 2020
- [FastPitch](https://arxiv.org/abs/2006.06873) - FastPitch: Parallel Text-to-speech with Pitch Prediction
  - https://fastpitch.github.io/
  - WaveGlow
- [EATS](https://arxiv.org/abs/2006.03575) - End-to-End Adversarial Text-to-Speech
  - https://deepmind.com/research/publications/End-to-End-Adversarial-Text-to-Speech
  - End-to-end model
- [Glow-TTS](https://arxiv.org/abs/2005.11129) - Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
  - https://jaywalnut310.github.io/glow-tts-demo
  - WaveGlow
- [Flowtron](https://arxiv.org/abs/2005.05957) - Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
  - https://nv-adlr.github.io/Flowtron
  - WaveGlow
## 2019
- [Tacotron2+DCA](https://arxiv.org/abs/1910.10288) - Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis
  - https://google.github.io/tacotron/publications/location_relative_attention
  - WaveRNN
- [GAN-TTS](https://openreview.net/forum?id=r1gfQgSFDr) - High Fidelity Speech Synthesis with Adversarial Networks
  - https://storage.googleapis.com/deepmind-media/research/abstract.wav
  - End-to-end model (Built on top of 200Hz linguistic & log pitch features)
- [Multi-lingual Tacotron2](https://arxiv.org/abs/1907.04448) - Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning
  - https://google.github.io/tacotron/publications/multilingual
  - WaveRNN
- [MelNet](https://arxiv.org/abs/1906.01083) - MelNet: A Generative Model for Audio in the Frequency Domain
  - https://audio-samples.github.io
  - https://sjvasquez.github.io/blog/melnet
  - [Gradient-based spectrogram inversion](https://gist.github.com/carlthome/a4a8bf0f587da738c459d0d5a55695cd)
- [FastSpeech](https://arxiv.org/abs/1905.09263) - FastSpeech: Fast, Robust and Controllable Text to Speech
  - https://speechresearch.github.io/fastspeech
  - WaveGlow
- [ParaNet](https://arxiv.org/abs/1905.08459) - Parallel Neural Text-to-Speech
  - https://parallel-neural-tts-demo.github.io
  - WaveVAE, ClariNet, WaveNet
## 2018
- [Transformer-TTS](https://arxiv.org/abs/1809.08895) - Neural Speech Synthesis with Transformer Network
  - https://neuraltts.github.io/transformertts
  - WaveNet
- [Multi-speaker Tacotron2](https://arxiv.org/abs/1806.04558) - Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis
  - https://google.github.io/tacotron/publications/speaker_adaptation
  - WaveNet
- [Tacotron2+GST](https://arxiv.org/abs/1803.09017) - Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
  - https://google.github.io/tacotron/publications/global_style_tokens
  - Griffin-Lim
## 2017
- [Tacotron2](https://arxiv.org/abs/1712.05884) - Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
  - https://google.github.io/tacotron/publications/tacotron2
  - WaveNet
- [Tacotron](https://arxiv.org/abs/1703.10135) - Tacotron: Towards End-to-End Speech Synthesis
  - https://google.github.io/tacotron/publications/tacotron
  - Griffin-Lim
# Contributing
TODO