https://github.com/ttop32/coqui_tts_korea

Korean TTS using coqui TTS (glowtts and multiband melgan) - 한국어 TTS

Host: GitHub
URL: https://github.com/ttop32/coqui_tts_korea
Owner: ttop32
License: mpl-2.0
Created: 2021-12-07T13:42:17.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-01-28T10:29:31.000Z (over 3 years ago)
Last Synced: 2025-04-05T12:11:29.365Z (25 days ago)
Topics: coqui, coqui-ai, deep-learning, glow-tts, half-life, korea, korean, korean-language, korean-letters, korean-text-processing, korean-tokenizer, korean-tts, multiband-melgan, pytorch, speech, speech-synthesis, text-to-speech, tts, vocoder, voice-cloning
Language: Jupyter Notebook
Homepage:
Size: 2.79 MB
Stars: 57
Watchers: 3
Forks: 17
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE


# coqui_tts_korea

Korean text-to-speech (TTS) built on Coqui TTS, using Glow-TTS and Multi-band MelGAN.

- [colab](https://colab.research.google.com/drive/1hv37sT7Pq-qKZe9Ihbbp5XZ-A9tsURli?usp=sharing)

The models are pretrained on KSS data and then fine-tuned on Half-Life scientist voice data.

# Result

- Input text

  - "신은 우리의 수학 문제에는 관심이 없다. 신은 다만 경험적으로 통합할 뿐이다." (English: "God has no interest in our math problems. God merely integrates empirically.")

- Output

https://user-images.githubusercontent.com/46513852/146552491-76df02ca-870d-4900-ab47-e956ede4cb84.mov

https://user-images.githubusercontent.com/46513852/150669352-d1c0aaf8-915b-498e-8441-32e99651fe1a.mov

# Training details

- glowtts

  - trained on KSS data for 190,000 steps

  - training notebook: coqui_train_glowtts.ipynb

  - Google Drive link: https://drive.google.com/drive/folders/1quLOabjkAmmw6mFbcCsMqmGxMC4bbbCW

  

- multiband-melgan

  - trained on concatenated Korean data (KSS, Zeroth and Pansori-TEDxKR) for 150,000 steps

  - training notebook: coqui_train_mbmelgan.ipynb

  - Google Drive link: https://drive.google.com/drive/folders/1FOlcOjx47j_ALNw28rZkr62iOWqHY6tE

    

- Half-Life fine-tuned glowtts

  - trained on KSS data for 190,000 steps, then on Half-Life data for 90,000 steps

  - training notebook: halfLife_finetune_glowtts.ipynb

  - Google Drive link: https://drive.google.com/drive/folders/1RubvJSDKZ_hNp3xj8mCocwtWG3KBmT4R?usp=sharing

- Half-Life fine-tuned multiband-melgan

  - trained on concatenated Korean data (KSS, Zeroth and Pansori-TEDxKR) for 150,000 steps, then on Half-Life data for 20,000 steps

  - training notebook: halfLife_finetune_mbmelgan.ipynb

  - Google Drive link: https://drive.google.com/drive/folders/15eAW8jTHSIOAisiPQa03VOMOH-pACguc?usp=sharing
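
As a rough sketch of how a glowtts checkpoint and a multiband-melgan checkpoint from the folders above might be combined for inference, the snippet below uses the `Synthesizer` class from Coqui TTS. Several things here are assumptions, not taken from this repository: the file names `best_model.pth.tar` and `config.json`, the local folder layout, and the helper names `checkpoint_paths` and `synthesize`. The notebooks may also apply Korean-specific text preprocessing that this sketch omits, and keyword names can differ between TTS releases.

```python
from pathlib import Path


def checkpoint_paths(model_dir, vocoder_dir):
    """Map two local folders to Synthesizer keyword arguments.

    The file names below are hypothetical; check the downloaded folders.
    """
    model_dir, vocoder_dir = Path(model_dir), Path(vocoder_dir)
    return {
        "tts_checkpoint": str(model_dir / "best_model.pth.tar"),
        "tts_config_path": str(model_dir / "config.json"),
        "vocoder_checkpoint": str(vocoder_dir / "best_model.pth.tar"),
        "vocoder_config": str(vocoder_dir / "config.json"),
    }


def synthesize(text, model_dir, vocoder_dir, out_path="output.wav", use_cuda=False):
    """Generate speech for `text` and write it to `out_path`."""
    # Imported lazily so checkpoint_paths() works without TTS installed.
    from TTS.utils.synthesizer import Synthesizer

    synthesizer = Synthesizer(
        use_cuda=use_cuda, **checkpoint_paths(model_dir, vocoder_dir)
    )
    wav = synthesizer.tts(text)          # waveform samples for the whole text
    synthesizer.save_wav(wav, out_path)  # write the samples as a WAV file
    return out_path
```

With the checkpoints downloaded into two local folders, a call could look like `synthesize("안녕하세요", "glowtts", "mbmelgan")`, where both folder names are placeholders.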

  

    

# Dataset

- [KSS Dataset](https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset)

- [Zeroth Korean](https://github.com/goodatlas/zeroth)

- [Pansori-TEDxKR](https://github.com/yc9701/pansori-tedxkr-corpus)

- [half_life_dataset](https://bbs.ruliweb.com/news/board/1003/read/1962882)

# Required environment

```python
# Run in a Jupyter or Colab notebook cell; the leading "!" runs a shell command.
!pip install TTS
!pip install jamo
!pip install torchaudio==0.9.0
!pip install gdown
!conda install -c conda-forge kaggle -y
!pip install librosa
```
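
The README does not say how `jamo` is used. Korean TTS front ends commonly split each precomposed Hangul syllable into its component jamo (initial consonant, vowel, optional final consonant) before feeding text to the model, and the sketch below illustrates that idea with plain Unicode arithmetic. It is not the `jamo` package's API, and `decompose_hangul` is a hypothetical name chosen for this example.

```python
HANGUL_BASE = 0xAC00  # first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3  # last precomposed syllable, "힣"
NUM_VOWELS = 21
NUM_TAILS = 28        # 27 final consonants plus "no final consonant"
LEAD_BASE = 0x1100    # first initial-consonant jamo
VOWEL_BASE = 0x1161   # first vowel jamo
TAIL_BASE = 0x11A7    # tail index 0 means "no final consonant"


def decompose_hangul(text):
    """Split precomposed Hangul syllables into conjoining jamo.

    Characters outside the Hangul syllable block are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            index = code - HANGUL_BASE
            lead, rest = divmod(index, NUM_VOWELS * NUM_TAILS)
            vowel, tail = divmod(rest, NUM_TAILS)
            out.append(chr(LEAD_BASE + lead))
            out.append(chr(VOWEL_BASE + vowel))
            if tail:  # skip the placeholder for syllables with no final consonant
                out.append(chr(TAIL_BASE + tail))
        else:
            out.append(ch)
    return "".join(out)
```

For example, `decompose_hangul("한")` yields three jamo (initial, vowel, final), while a syllable without a final consonant such as "가" yields two.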

# Acknowledgement and References 

- [coqui tts](https://github.com/coqui-ai/TTS)

- [TensorFlowTTS](https://github.com/TensorSpeech/TensorFlowTTS)    

- [glow tts](https://arxiv.org/abs/2005.11129) 

- [Multi-band MelGAN](https://arxiv.org/abs/2005.05106)     

- [FastSpeech 2](https://arxiv.org/abs/2006.04558)

- [speech-japanese-korean-vietnamese](http://www.hieuthi.com/blog/2018/04/22/speech-japanese-korean-vietnamese.html)

- [openslr](http://openslr.org/index.html)     

- [half_life_dataset](https://bbs.ruliweb.com/news/board/1003/read/1962882)

- [KSS Dataset](https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset)

- [Zeroth Korean](https://github.com/goodatlas/zeroth)

- [Pansori-TEDxKR](https://github.com/yc9701/pansori-tedxkr-corpus)

- [Fine-Tuning with a small dataset](https://github.com/TensorSpeech/TensorFlowTTS/issues/296)

- [Changing Siri's voice to IU's](https://blog.crux.cx/iu-siri-1/)

- [Implementing TTS (speech synthesis) with AI Deep Voice: anchor Sohn Suk-hee](http://melonicedlatte.com/machinelearning/2018/07/02/215933.html)

- [SCE-TTS: Building a TTS with my own voice](https://gist.github.com/yunho0130/a97db3296314cd7076d8436238fa113a)

- [huggingface_fastspeech2_kss](https://huggingface.co/tensorspeech/tts-fastspeech2-kss-ko)

- [huggingface_TensorFlowTTS](https://huggingface.co/spaces/akhaliq/TensorFlowTTS)

- [jamo](https://github.com/JDongian/python-jamo)

- [korean.py](https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/utils/korean.py)

- [freeconvert](https://www.freeconvert.com/wav-to-mov)
