{"id":13583913,"url":"https://github.com/kan-bayashi/ParallelWaveGAN","last_synced_at":"2025-04-06T21:33:27.235Z","repository":{"id":37502449,"uuid":"218188149","full_name":"kan-bayashi/ParallelWaveGAN","owner":"kan-bayashi","description":"Unofficial Parallel WaveGAN (+ MelGAN \u0026 Multi-band MelGAN \u0026 HiFi-GAN \u0026 StyleMelGAN) with Pytorch","archived":false,"fork":false,"pushed_at":"2024-04-22T02:36:29.000Z","size":36450,"stargazers_count":1556,"open_issues_count":43,"forks_count":341,"subscribers_count":46,"default_branch":"master","last_synced_at":"2024-10-29T14:51:41.352Z","etag":null,"topics":["hifigan","melgan","neural-vocoder","parallel-wavenet","pytorch","realtime","speech-synthesis","style-melgan","text-to-speech","tts","vocoder","wavenet"],"latest_commit_sha":null,"homepage":"https://kan-bayashi.github.io/ParallelWaveGAN/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kan-bayashi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"kan-bayashi"}},"created_at":"2019-10-29T02:32:35.000Z","updated_at":"2024-10-29T03:30:34.000Z","dependencies_parsed_at":"2024-06-18T17:06:14.980Z","dependency_job_id":"67a83ce8-70f0-4818-9387-0b29d4e9a1a8","html_url":"https://github.com/kan-bayashi/ParallelWaveGAN","commit_stats":{"total_commits":994,"total_committers":17,"mean_commits":"58.470588235294116","dds":0.05030181086519114,"last_synced_commit":"4a6082932fd09c91de64ac0433734882a5e4a47c"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kan-bayashi%2FParallelWaveGAN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kan-bayashi%2FParallelWaveGAN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kan-bayashi%2FParallelWaveGAN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kan-bayashi%2FParallelWaveGAN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kan-bayashi","download_url":"https://codeload.github.com/kan-bayashi/ParallelWaveGAN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223265006,"owners_count":17116265,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hifigan","melgan","neural-vocoder","parallel-wavenet","pytorch","realtime","speech-synthesis","style-melgan","text-to-speech","tts","vocoder","wavenet"],"created_at":"2024-08-01T15:03:53.610Z","updated_at":"2024-11-06T00:30:58.188Z","avatar_url":"https://github.com/kan-bayashi.png","language":"Jupyter Notebook","funding_links":["https://github.com/sponsors/kan-bayashi"],"categories":["Jupyter Notebook","语音合成"],"sub_categories":["网络服务_其他"],"readme":"# Parallel WaveGAN implementation with Pytorch\n\n![](https://github.com/kan-bayashi/ParallelWaveGAN/workflows/CI/badge.svg) [![](https://img.shields.io/pypi/v/parallel-wavegan)](https://pypi.org/project/parallel-wavegan/) ![](https://img.shields.io/pypi/pyversions/parallel-wavegan) ![](https://img.shields.io/pypi/l/parallel-wavegan) [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_tts_realtime_demo.ipynb)\n\nThis repository provides **UNOFFICIAL** PyTorch implementations of the following models:\n- [Parallel WaveGAN](https://arxiv.org/abs/1910.11480)\n- [MelGAN](https://arxiv.org/abs/1910.06711)\n- [Multi-band MelGAN](https://arxiv.org/abs/2005.05106)\n- [HiFi-GAN](https://arxiv.org/abs/2010.05646)\n- [StyleMelGAN](https://arxiv.org/abs/2011.01557)\n\nYou can combine these state-of-the-art non-autoregressive models to build your own great vocoder!\n\nPlease check our samples on [our demo page](https://kan-bayashi.github.io/ParallelWaveGAN).\n\n![](https://user-images.githubusercontent.com/22779813/68081503-4b8fcf00-fe52-11e9-8791-e02851220355.png)\n\n\u003e Source of the figure: https://arxiv.org/pdf/1910.11480.pdf\n\nThe goal of this repository is to provide a real-time neural vocoder that is compatible with [ESPnet-TTS](https://github.com/espnet/espnet).  
\nAlso, this repository can be combined with [NVIDIA/tacotron2](https://github.com/NVIDIA/tacotron2)-based implementation (See [this comment](https://github.com/kan-bayashi/ParallelWaveGAN/issues/169#issuecomment-649320778)).\n\nYou can try the real-time end-to-end text-to-speech and singing voice synthesis demonstration in Google Colab!\n- Real-time demonstration with ESPnet2  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_tts_realtime_demo.ipynb)\n- Real-time demonstration with ESPnet1  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/espnet/notebook/blob/master/tts_realtime_demo.ipynb)\n- Real-time demonstration with Muskits [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/SJTMusicTeam/svs_demo/blob/master/muskit_svs_realtime.ipynb)\n\n## What's new\n\n- 2023/08/17 [LibriTTS-R recipe](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts_r/voc1) is available!\n- 2022/02/27 Support singing voice vocoder [egs/{kiritan, opencpop, oniku\\_kurumi\\_utagoe\\_db, ofuton\\_p\\_utagoe\\_db, csd, kising}/voc1]\n- 2021/10/21 Single-speaker Korean recipe [egs/kss/voc1] is available.\n- 2021/08/24 Add more pretrained models of StyleMelGAN and HiFi-GAN.\n- 2021/08/07 Add initial pretrained models of StyleMelGAN and HiFi-GAN.\n- 2021/08/03 Support [StyleMelGAN](https://arxiv.org/abs/2011.01557) generator and discriminator!\n- 2021/08/02 Support [HiFi-GAN](https://arxiv.org/abs/2010.05646) generator and discriminator!\n- 2020/10/07 [JSSS](https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus) recipe is available!\n- 2020/08/19 [Real-time demo with ESPnet2](https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_tts_realtime_demo.ipynb) is available!\n- 2020/05/29 [VCTK, JSUT, 
and CSMSC multi-band MelGAN pretrained model](#Results) is available!\n- 2020/05/27 [New LJSpeech multi-band MelGAN pretrained model](#Results) is available!\n- 2020/05/24 [LJSpeech full-band MelGAN pretrained model](#Results) is available!\n- 2020/05/22 [LJSpeech multi-band MelGAN pretrained model](#Results) is available!\n- 2020/05/16 [Multi-band MelGAN](https://arxiv.org/abs/2005.05106) is available!\n- 2020/03/25 [LibriTTS pretrained models](#Results) are available!\n- 2020/03/17 [Tensorflow conversion example notebook](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/notebooks/convert_melgan_from_pytorch_to_tensorflow.ipynb) is available (Thanks, [@dathudeptrai](https://github.com/dathudeptrai))!\n- 2020/03/16 [LibriTTS recipe](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1) is available!\n- 2020/03/12 [PWG G + MelGAN D + STFT-loss samples](#Results) are available!\n- 2020/03/12 Multi-speaker English recipe [egs/vctk/voc1](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1) is available!\n- 2020/02/22 [MelGAN G + MelGAN D + STFT-loss samples](#Results) are available!\n- 2020/02/12 Support [MelGAN](https://arxiv.org/abs/1910.06711)'s discriminator!\n- 2020/02/08 Support [MelGAN](https://arxiv.org/abs/1910.06711)'s generator!\n\n## Requirements\n\nThis repository is tested on Ubuntu 20.04 with a GPU Titan V.\n\n- Python 3.8+\n- Cuda 11.0+\n- CuDNN 8+\n- NCCL 2+ (for distributed multi-gpu training)\n- libsndfile (you can install via `sudo apt install libsndfile-dev` in ubuntu)\n- jq (you can install via `sudo apt install jq` in ubuntu)\n- sox (you can install via `sudo apt install sox` in ubuntu)\n\nDifferent cuda version should be working but not explicitly tested.  \nAll of the codes are tested on Pytorch 1.8.1, 1.9, 1.10.2, 1.11.0, 1.12.1, 1.13.1, 2.0.1 and 2.1.0.\n\n## Setup\n\nYou can select the installation method from two alternatives.\n\n### A. 
Use pip\n\n```bash\n$ git clone https://github.com/kan-bayashi/ParallelWaveGAN.git\n$ cd ParallelWaveGAN\n$ pip install -e .\n# If you want to use distributed training, please install\n# apex manually by following https://github.com/NVIDIA/apex\n$ ...\n```\nNote that to install apex, your CUDA version must exactly match the version used to build the PyTorch binary.  \nTo install PyTorch compiled with a different CUDA version, see `tools/Makefile`.\n\n### B. Make virtualenv\n\n```bash\n$ git clone https://github.com/kan-bayashi/ParallelWaveGAN.git\n$ cd ParallelWaveGAN/tools\n$ make\n# If you want to use distributed training, please run the following\n# command to install apex.\n$ make apex\n```\n\nNote that we specify the CUDA version used to compile the PyTorch wheel.  \nIf you want to use a different CUDA version, please check `tools/Makefile` and change the PyTorch wheel to be installed.\n\n## Recipe\n\nThis repository provides [Kaldi](https://github.com/kaldi-asr/kaldi)-style recipes, in the same way as [ESPnet](https://github.com/espnet/espnet).  
\nCurrently, the following recipes are supported.\n\n- [LJSpeech](https://keithito.com/LJ-Speech-Dataset/): English female speaker\n- [JSUT](https://sites.google.com/site/shinnosuketakamichi/publication/jsut): Japanese female speaker\n- [JSSS](https://sites.google.com/site/shinnosuketakamichi/research-topics/jsss_corpus): Japanese female speaker\n- [CSMSC](https://www.data-baker.com/open_source.html): Mandarin female speaker\n- [CMU Arctic](http://www.festvox.org/cmu_arctic/): English speakers\n- [JNAS](http://research.nii.ac.jp/src/en/JNAS.html): Japanese multi-speaker\n- [VCTK](https://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html): English multi-speaker\n- [LibriTTS](https://arxiv.org/abs/1904.02882): English multi-speaker\n- [LibriTTS-R](https://arxiv.org/abs/2305.18802): English multi-speaker enhanced by speech restoration.\n- [YesNo](https://arxiv.org/abs/1904.02882): English speaker (For debugging)\n- [KSS](https://www.kaggle.com/bryanpark/korean-single-speaker-speech-dataset): Single Korean female speaker\n- [Oniku\\_kurumi\\_utagoe\\_db/](http://onikuru.info/db-download/): Single Japanese female singer (singing voice)\n- [Kiritan](https://zunko.jp/kiridev/login.php): Single Japanese male singer (singing voice)\n- [Ofuton\\_p\\_utagoe\\_db](https://sites.google.com/view/oftn-utagoedb/%E3%83%9B%E3%83%BC%E3%83%A0): Single Japanese female singer (singing voice)\n- [Opencpop](https://wenet.org.cn/opencpop/download/): Single Mandarin female singer (singing voice)\n- [CSD](https://zenodo.org/record/4785016/): Single Korean/English female singer (singing voice)\n- [KiSing](http://shijt.site/index.php/2021/05/16/kising-the-first-open-source-mandarin-singing-voice-synthesis-corpus/): Single Mandarin female singer (singing voice)\n\nTo run the recipe, please follow the below instruction.\n\n```bash\n# Let us move on the recipe directory\n$ cd egs/ljspeech/voc1\n\n# Run the recipe from scratch\n$ ./run.sh\n\n# You can change config via command line\n$ 
./run.sh --conf \u003cyour_customized_yaml_config\u003e\n\n# You can select the stage to start and stop\n$ ./run.sh --stage 2 --stop_stage 2\n\n# If you want to specify the GPU\n$ CUDA_VISIBLE_DEVICES=1 ./run.sh --stage 2\n\n# If you want to resume training from the 10000-step checkpoint\n$ ./run.sh --stage 2 --resume \u003cpath\u003e/\u003cto\u003e/checkpoint-10000steps.pkl\n```\n\nSee more info about the recipes in [this README](./egs/README.md).\n\n## Speed\n\nThe decoding speed is RTF = 0.016 on a TITAN V GPU, much faster than real time.\n\n```bash\n[decode]: 100%|██████████| 250/250 [00:30\u003c00:00,  8.31it/s, RTF=0.0156]\n2019-11-03 09:07:40,480 (decode:127) INFO: finished generation of 250 utterances (RTF = 0.016).\n```\n\nEven on a CPU (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz 16 threads), it can generate faster than real time.\n\n```bash\n[decode]: 100%|██████████| 250/250 [22:16\u003c00:00,  5.35s/it, RTF=0.841]\n2019-11-06 09:04:56,697 (decode:129) INFO: finished generation of 250 utterances (RTF = 0.734).\n```\n\nIf you use MelGAN's generator, decoding is even faster.\n\n```bash\n# On CPU (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz 16 threads)\n[decode]: 100%|██████████| 250/250 [04:00\u003c00:00,  1.04it/s, RTF=0.0882]\n2020-02-08 10:45:14,111 (decode:142) INFO: Finished generation of 250 utterances (RTF = 0.137).\n\n# On GPU (TITAN V)\n[decode]: 100%|██████████| 250/250 [00:06\u003c00:00, 36.38it/s, RTF=0.00189]\n2020-02-08 05:44:42,231 (decode:142) INFO: Finished generation of 250 utterances (RTF = 0.002).\n```\n\nIf you use Multi-band MelGAN's generator, decoding is faster still.\n\n```bash\n# On CPU (Intel(R) Xeon(R) Gold 6154 CPU @ 3.00GHz 16 threads)\n[decode]: 100%|██████████| 250/250 [01:47\u003c00:00,  2.95it/s, RTF=0.048]\n2020-05-22 15:37:19,771 (decode:151) INFO: Finished generation of 250 utterances (RTF = 0.059).\n\n# On GPU (TITAN V)\n[decode]: 100%|██████████| 250/250 [00:05\u003c00:00, 
43.67it/s, RTF=0.000928]\n2020-05-22 15:35:13,302 (decode:151) INFO: Finished generation of 250 utterances (RTF = 0.001).\n```\n\nIf you want to accelerate the inference more, it is worthwhile to try the conversion from pytorch to tensorflow.  \nThe example of the conversion is available in [the notebook](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/notebooks/convert_melgan_from_pytorch_to_tensorflow.ipynb) (Provided by [@dathudeptrai](https://github.com/dathudeptrai)).  \n\n## Results\n\nHere the results are summarized in the table.  \nYou can listen to the samples and download pretrained models from the link to our google drive.\n\n| Model                                                                                                        | Conf                                                                                                                        | Lang  | Fs [Hz] | Mel range [Hz] | FFT / Hop / Win [pt] | # iters |\n| :----------------------------------------------------------------------------------------------------------- | :-------------------------------------------------------------------------------------------------------------------------: | :---: | :-----: | :------------: | :------------------: | :-----: |\n| [ljspeech_parallel_wavegan.v1](https://drive.google.com/open?id=1wdHr1a51TLeo4iKrGErVKHVFyq6D17TU)           | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/parallel_wavegan.v1.yaml)          | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 400k    |\n| [ljspeech_parallel_wavegan.v1.long](https://drive.google.com/open?id=1XRn3s_wzPF2fdfGshLwuvNHrbgD0hqVS)      | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/parallel_wavegan.v1.long.yaml)     | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_parallel_wavegan.v1.no_limit](https://drive.google.com/open?id=1NoD3TCmKIDHHtf74YsScX8s59aZFOFJA)  
| [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/parallel_wavegan.v1.no_limit.yaml) | EN    | 22.05k  | None           | 1024 / 256 / None    | 400k    |\n| [ljspeech_parallel_wavegan.v3](https://drive.google.com/open?id=1a5Q2KiJfUQkVFo5Bd1IoYPVicJGnm7EL)           | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/parallel_wavegan.v3.yaml)          | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 3M      |\n| [ljspeech_melgan.v1](https://drive.google.com/open?id=1z0vO1UMFHyeCdCLAmd7Moewi4QgCb07S)                     | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan.v1.yaml)                    | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 400k    |\n| [ljspeech_melgan.v1.long](https://drive.google.com/open?id=1RqNGcFO7Geb6-4pJtMbC9-ph_WiWA14e)                | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan.v1.long.yaml)               | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_melgan_large.v1](https://drive.google.com/open?id=1KQt-gyxbG6iTZ4aVn9YjQuaGYjAleYs8)               | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan_large.v1.yaml)              | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 400k    |\n| [ljspeech_melgan_large.v1.long](https://drive.google.com/open?id=1ogEx-wiQS7HVtdU0_TmlENURIe4v2erC)          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan_large.v1.long.yaml)         | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_melgan.v3](https://drive.google.com/open?id=1eXkm_Wf1YVlk5waP4Vgqd0GzMaJtW3y5)                     | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan.v3.yaml)                    | EN    | 22.05k  | 
80-7600        | 1024 / 256 / None    | 2M      |\n| [ljspeech_melgan.v3.long](https://drive.google.com/open?id=1u1w4RPefjByX8nfsL59OzU2KgEksBhL1)                | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/melgan.v3.long.yaml)               | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 4M      |\n| [ljspeech_full_band_melgan.v1](https://drive.google.com/open?id=1RQqkbnoow0srTDYJNYA7RJ5cDRC5xB-t)           | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/full_band_melgan.v1.yaml)          | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_full_band_melgan.v2](https://drive.google.com/open?id=1d9DWOzwOyxT1K5lPnyMqr2nED62vlHaX)           | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/full_band_melgan.v2.yaml)          | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_multi_band_melgan.v1](https://drive.google.com/open?id=1ls_YxCccQD-v6ADbG6qXlZ8f30KrrhLT)          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/multi_band_melgan.v1.yaml)         | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_multi_band_melgan.v2](https://drive.google.com/open?id=1wevYP2HQ7ec2fSixTpZIX0sNBtYZJz_I)          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/multi_band_melgan.v2.yaml)         | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1M      |\n| [ljspeech_hifigan.v1](https://drive.google.com/open?id=18_R5-pGHDIbIR1QvrtBZwVRHHpBy5xiZ)                    | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/hifigan.v1.yaml)                   | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 2.5M    |\n| [ljspeech_style_melgan.v1](https://drive.google.com/open?id=1WFlVknhyeZhTT5R6HznVJCJ4fwXKtb3B)     
          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ljspeech/voc1/conf/style_melgan.v1.yaml)              | EN    | 22.05k  | 80-7600        | 1024 / 256 / None    | 1.5M    |\n| [jsut_parallel_wavegan.v1](https://drive.google.com/open?id=1UDRL0JAovZ8XZhoH0wi9jj_zeCKb-AIA)               | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/jsut/voc1/conf/parallel_wavegan.v1.yaml)              | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [jsut_multi_band_melgan.v2](https://drive.google.com/open?id=1E4fe0c5gMLtmSS0Hrzj-9nUbMwzke4PS)              | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/jsut/voc1/conf/multi_band_melgan.v2.yaml)             | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| [jsut_hifigan.v1](https://drive.google.com/open?id=1TY88141UWzQTAQXIPa8_g40QshuqVj6Y)                        | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/jsut/voc1/conf/hifigan.v1.yaml)                       | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 2.5M    |\n| [jsut_style_melgan.v1](https://drive.google.com/open?id=1-qKAC0zLya6iKMngDERbSzBYD4JHmGdh)                   | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/jsut/voc1/conf/style_melgan.v1.yaml)                  | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1.5M    |\n| [csmsc_parallel_wavegan.v1](https://drive.google.com/open?id=1C2nu9nOFdKcEd-D9xGquQ0bCia0B2v_4)              | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csmsc/voc1/conf/parallel_wavegan.v1.yaml)             | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [csmsc_multi_band_melgan.v2](https://drive.google.com/open?id=1F7FwxGbvSo1Rnb5kp0dhGwimRJstzCrz)             | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csmsc/voc1/conf/multi_band_melgan.v2.yaml)            | ZH    | 24k   
  | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| [csmsc_hifigan.v1](https://drive.google.com/open?id=1gTkVloMqteBfSRhTrZGdOBBBRsGd3qt8)                       | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csmsc/voc1/conf/hifigan.v1.yaml)                      | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200    | 2.5M    |\n| [csmsc_style_melgan.v1](https://drive.google.com/open?id=1gl4P5W_ST_nnv0vjurs7naVm5UJqkZIn)                  | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csmsc/voc1/conf/style_melgan.v1.yaml)                 | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1.5M    |\n| [arctic_slt_parallel_wavegan.v1](https://drive.google.com/open?id=1xG9CmSED2TzFdklD6fVxzf7kFV2kPQAJ)         | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/arctic/voc1/conf/parallel_wavegan.v1.yaml)            | EN    | 16k     | 80-7600        | 1024 / 256 / None    | 400k    |\n| [jnas_parallel_wavegan.v1](https://drive.google.com/open?id=1n_hkxPxryVXbp6oHM1NFm08q0TcoDXz1)               | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/jnas/voc1/conf/parallel_wavegan.v1.yaml)              | JP    | 16k     | 80-7600        | 1024 / 256 / None    | 400k    |\n| [vctk_parallel_wavegan.v1](https://drive.google.com/open?id=1dGTu-B7an2P5sEOepLPjpOaasgaSnLpi)               | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1/conf/parallel_wavegan.v1.yaml)              | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [vctk_parallel_wavegan.v1.long](https://drive.google.com/open?id=1qoocM-VQZpjbv5B-zVJpdraazGcPL0So)          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1/conf/parallel_wavegan.v1.long.yaml)         | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| 
[vctk_multi_band_melgan.v2](https://drive.google.com/open?id=17EkB4hSKUEDTYEne-dNHtJT724hdivn4)              | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1/conf/multi_band_melgan.v2.yaml)             | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| [vctk_hifigan.v1](https://drive.google.com/open?id=17fu7ukS97m-8StXPc6ltW8a3hr0fsQBP)                        | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1/conf/hifigan.v1.yaml)                       | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 2.5M    |\n| [vctk_style_melgan.v1](https://drive.google.com/open?id=1kfJgzDgrOFYxTfVTNbTHcnyq--cc6plo)                   | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/vctk/voc1/conf/style_melgan.v1.yaml)                  | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1.5M    |\n| [libritts_parallel_wavegan.v1](https://drive.google.com/open?id=1pb18Nd2FCYWnXfStszBAEEIMe_EZUJV0)           | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/parallel_wavegan.v1.yaml)          | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [libritts_parallel_wavegan.v1.long](https://drive.google.com/open?id=15ibzv-uTeprVpwT946Hl1XUYDmg5Afwz)      | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/parallel_wavegan.v1.long.yaml)     | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| [libritts_multi_band_melgan.v2](https://drive.google.com/open?id=1jfB15igea6tOQ0hZJGIvnpf3QyNhTLnq)          | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/multi_band_melgan.v2.yaml)         | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1M      |\n| [libritts_hifigan.v1](https://drive.google.com/open?id=10jBLsjQT3LvR-3GgPZpRvWIWvpGjzDnM)                    | 
[link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/hifigan.v1.yaml)                   | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 2.5M    |\n| [libritts_style_melgan.v1](https://drive.google.com/open?id=1OPpYbrqYOJ_hHNGSQHzUxz_QZWWBwV9r)               | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/style_melgan.v1.yaml)              | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 1.5M    |\n| [kss_parallel_wavegan.v1](https://drive.google.com/open?id=1n5kitXZqPHUr-veoUKCyfJvb3p1g0VlY)                | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/parallel_wavegan.v1.yaml)          | KO    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [hui_acg_hokuspokus_parallel_wavegan.v1](https://drive.google.com/open?id=1rwzpIwb65xbW5fFPsqPWdforsk4U-vDg) | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/parallel_wavegan.v1.yaml)          | DE    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [ruslan_parallel_wavegan.v1](https://drive.google.com/open?id=1QGuesaRKGful0bUTTaFZdbjqHNhy2LpE)             | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/libritts/voc1/conf/parallel_wavegan.v1.yaml)          | RU    | 24k     | 80-7600        | 2048 / 300 / 1200    | 400k    |\n| [oniku_hifigan.v1](https://drive.google.com/open?id=1K1WNqmZVJaZqTwWNVcucZNeGKHu8-LVm)                       | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/oniku_kurumi_utagoe_db/voc1/conf/hifigan.v1.yaml)     | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 250k    |\n| [kiritan_hifigan.v1](https://drive.google.com/open?id=1FHUUF5uUnlJ9-D7HmXw3_Sn_GRS48I36)                     | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/kiritan/voc1/conf/hifigan.v1.yaml)                    | JP    | 24k     | 80-7600 
       | 2048 / 300 / 1200    | 300k    |\n| [ofuton_hifigan.v1](https://drive.google.com/open?id=1fq8ITA2KpdtrzzD2hOlroParMg-qKjr7)                      | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/ofuton_p_utagoe_db/voc1/conf/hifigan.v1.yaml)         | JP    | 24k     | 80-7600        | 2048 / 300 / 1200    | 300k    |\n| [opencpop_hifigan.v1](https://drive.google.com/open?id=1hMf5yew_MrbPW0qy5qzXn0mxqbfHTadC)                    | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/opencpop/voc1/conf/hifigan.v1.yaml)                   | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200    | 250k    |\n| [csd_english_hifigan.v1](https://drive.google.com/open?id=1NACjfBqmaecwh4dZMl714RukEkV8XLAi)                 | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csd/voc1/conf/hifigan.v1.yaml)                        | EN    | 24k     | 80-7600        | 2048 / 300 / 1200    | 300k    |\n| [csd_korean_hifigan.v1](https://drive.google.com/open?id=1BGxIoRg4VgXcX0G-4Dwea030-qQ_Ynyp)                  | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/csd/voc1/conf/hifigan.v1.yaml)                        | KO    | 24k     | 80-7600        | 2048 / 300 / 1200    | 250k    |\n| [kising_hifigan.v1](https://drive.google.com/open?id=1GGu3pW89qxmJapd0Vm1aqp6lqgZARLO9)                      | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/kising/voc1/conf/hifigan.v1.yaml)                     | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200    | 300k    |\n| [m4singer_hifigan.v1](https://drive.google.com/open?id=1dvD6imY6p2L80tN8tr_kzqUa3M7QJtLY)                    | [link](https://github.com/kan-bayashi/ParallelWaveGAN/blob/master/egs/m4singer/voc1/conf/hifigan.v1.yaml)                 | ZH    | 24k     | 80-7600        | 2048 / 300 / 1200     | 1M      |\n\n\n\n\nPlease visit [our google 
drive](https://drive.google.com/open?id=1sd_QzcUNnbiaWq7L0ykMP7Xmk-zOuxTi) to check more results.\n\nPlease check the license of database (e.g., whether it is proper for commercial usage) before using the pre-trained model.   \nThe authors will not be responsible for any loss due to the use of the model and legal disputes regarding the use of the dataset.\n\n## How-to-use pretrained models\n\n### Analysis-synthesis\n\nHere the minimal code is shown to perform analysis-synthesis using the pretrained model.\n\n```bash\n# Please make sure you installed `parallel_wavegan`\n# If not, please install via pip\n$ pip install parallel_wavegan\n\n# You can download the pretrained model from terminal\n$ python \u003c\u003c EOF\nfrom parallel_wavegan.utils import download_pretrained_model\ndownload_pretrained_model(\"\u003cpretrained_model_tag\u003e\", \"pretrained_model\")\nEOF\n\n# You can get all of available pretrained models as follows:\n$ python \u003c\u003c EOF\nfrom parallel_wavegan.utils import PRETRAINED_MODEL_LIST\nprint(PRETRAINED_MODEL_LIST.keys())\nEOF\n\n# Now you can find downloaded pretrained model in `pretrained_model/\u003cpretrain_model_tag\u003e/`\n$ ls pretrain_model/\u003cpretrain_model_tag\u003e\n  checkpoint-400000steps.pkl    config.yml    stats.h5\n\n# These files can also be downloaded manually from the above results\n\n# Please put an audio file in `sample` directory to perform analysis-synthesis\n$ ls sample/\n  sample.wav\n\n# Then perform feature extraction -\u003e feature normalization -\u003e synthesis\n$ parallel-wavegan-preprocess \\\n    --config pretrain_model/\u003cpretrain_model_tag\u003e/config.yml \\\n    --rootdir sample \\\n    --dumpdir dump/sample/raw\n100%|████████████████████████████████████████| 1/1 [00:00\u003c00:00, 914.19it/s]\n$ parallel-wavegan-normalize \\\n    --config pretrain_model/\u003cpretrain_model_tag\u003e/config.yml \\\n    --rootdir dump/sample/raw \\\n    --dumpdir dump/sample/norm \\\n    --stats 
pretrain_model/\u003cpretrain_model_tag\u003e/stats.h5\n2019-11-13 13:44:29,574 (normalize:87) INFO: the number of files = 1.\n100%|████████████████████████████████████████| 1/1 [00:00\u003c00:00, 513.13it/s]\n$ parallel-wavegan-decode \\\n    --checkpoint pretrain_model/\u003cpretrain_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --dumpdir dump/sample/norm \\\n    --outdir sample\n2019-11-13 13:44:31,229 (decode:91) INFO: the number of features to be decoded = 1.\n[decode]: 100%|███████████████████| 1/1 [00:00\u003c00:00, 18.33it/s, RTF=0.0146]\n2019-11-13 13:44:37,132 (decode:129) INFO: finished generation of 1 utterances (RTF = 0.015).\n\n# You can skip normalization step (on-the-fly normalization, feature extraction -\u003e synthesis)\n$ parallel-wavegan-preprocess \\\n    --config pretrain_model/\u003cpretrain_model_tag\u003e/config.yml \\\n    --rootdir sample \\\n    --dumpdir dump/sample/raw\n100%|████████████████████████████████████████| 1/1 [00:00\u003c00:00, 914.19it/s]\n$ parallel-wavegan-decode \\\n    --checkpoint pretrain_model/\u003cpretrain_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --dumpdir dump/sample/raw \\\n    --normalize-before \\\n    --outdir sample\n2019-11-13 13:44:31,229 (decode:91) INFO: the number of features to be decoded = 1.\n[decode]: 100%|███████████████████| 1/1 [00:00\u003c00:00, 18.33it/s, RTF=0.0146]\n2019-11-13 13:44:37,132 (decode:129) INFO: finished generation of 1 utterances (RTF = 0.015).\n\n# you can find the generated speech in `sample` directory\n$ ls sample\n  sample.wav    sample_gen.wav\n```\n\n### Decoding with ESPnet-TTS model's features\n\nHere, I show the procedure to generate waveforms with features generated by [ESPnet-TTS](https://github.com/espnet/espnet) models.\n\n```bash\n# Make sure you already finished running the recipe of ESPnet-TTS.\n# You must use the same feature settings for both Text2Mel and Mel2Wav models.\n# Let us move on \"ESPnet\" recipe directory\n$ cd 
/path/to/espnet/egs/\u003crecipe_name\u003e/tts1\n$ pwd\n/path/to/espnet/egs/\u003crecipe_name\u003e/tts1\n\n# If you use ESPnet2, move to `egs2/`\n$ cd /path/to/espnet/egs2/\u003crecipe_name\u003e/tts1\n$ pwd\n/path/to/espnet/egs2/\u003crecipe_name\u003e/tts1\n\n# Please install this repository in the ESPnet conda (or virtualenv) environment\n$ . ./path.sh \u0026\u0026 pip install -U parallel_wavegan\n\n# You can download the pretrained model from the terminal\n$ python \u003c\u003c EOF\nfrom parallel_wavegan.utils import download_pretrained_model\ndownload_pretrained_model(\"\u003cpretrained_model_tag\u003e\", \"pretrained_model\")\nEOF\n\n# You can list all available pretrained models as follows:\n$ python \u003c\u003c EOF\nfrom parallel_wavegan.utils import PRETRAINED_MODEL_LIST\nprint(PRETRAINED_MODEL_LIST.keys())\nEOF\n\n# You can find the downloaded pretrained model in `pretrained_model/\u003cpretrained_model_tag\u003e/`\n$ ls pretrained_model/\u003cpretrained_model_tag\u003e\n  checkpoint-400000steps.pkl    config.yml    stats.h5\n\n# These files can also be downloaded manually from the above results\n```\n\n**Case 1**: If you use the same dataset for both Text2Mel and Mel2Wav\n\n```bash\n# In this case, you can directly use the generated features for decoding.\n# Please specify the `feats.scp` path for `--feats-scp`, which is located in\n# exp/\u003cyour_model_dir\u003e/outputs_*_decode/\u003cset_name\u003e/feats.scp.\n# Note: do not use outputs_*_decode_denorm/\u003cset_name\u003e/feats.scp since\n# it contains de-normalized features (PWG expects normalized features as input).\n$ parallel-wavegan-decode \\\n    --checkpoint pretrained_model/\u003cpretrained_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --feats-scp exp/\u003cyour_model_dir\u003e/outputs_*_decode/\u003cset_name\u003e/feats.scp \\\n    --outdir \u003cpath_to_outdir\u003e\n\n# In the case of ESPnet2, the generated features can be found in\n# 
exp/\u003cyour_model_dir\u003e/decode_*/\u003cset_name\u003e/norm/feats.scp.\n$ parallel-wavegan-decode \\\n    --checkpoint pretrained_model/\u003cpretrained_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --feats-scp exp/\u003cyour_model_dir\u003e/decode_*/\u003cset_name\u003e/norm/feats.scp \\\n    --outdir \u003cpath_to_outdir\u003e\n\n# You can find the generated waveforms in \u003cpath_to_outdir\u003e/.\n$ ls \u003cpath_to_outdir\u003e\n  utt_id_1_gen.wav    utt_id_2_gen.wav  ...    utt_id_N_gen.wav\n```\n\n**Case 2**: If you use different datasets for Text2Mel and Mel2Wav models\n\n```bash\n# In this case, you must additionally provide the `--normalize-before` option\n# and use the `feats.scp` of the de-normalized generated features.\n\n# ESPnet1 case\n$ parallel-wavegan-decode \\\n    --checkpoint pretrained_model/\u003cpretrained_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --feats-scp exp/\u003cyour_model_dir\u003e/outputs_*_decode_denorm/\u003cset_name\u003e/feats.scp \\\n    --outdir \u003cpath_to_outdir\u003e \\\n    --normalize-before\n\n# ESPnet2 case\n$ parallel-wavegan-decode \\\n    --checkpoint pretrained_model/\u003cpretrained_model_tag\u003e/checkpoint-400000steps.pkl \\\n    --feats-scp exp/\u003cyour_model_dir\u003e/decode_*/\u003cset_name\u003e/denorm/feats.scp \\\n    --outdir \u003cpath_to_outdir\u003e \\\n    --normalize-before\n\n# You can find the generated waveforms in \u003cpath_to_outdir\u003e/.\n$ ls \u003cpath_to_outdir\u003e\n  utt_id_1_gen.wav    utt_id_2_gen.wav  ...  
  utt_id_N_gen.wav\n```\n\nIf you want to combine these models in Python, you can try the real-time demonstration in Google Colab!\n- Real-time demonstration with ESPnet2  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/espnet/notebook/blob/master/espnet2_tts_realtime_demo.ipynb)\n- Real-time demonstration with ESPnet1  [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/espnet/notebook/blob/master/tts_realtime_demo.ipynb)\n\n### Decoding with dumped npy files\n\nSometimes you may want to decode from dumped npy files that contain mel-spectrograms generated by TTS models.\nPlease make sure you used the same feature extraction settings as the pretrained vocoder (`fs`, `fft_size`, `hop_size`, `win_length`, `fmin`, and `fmax`).  \nOnly a difference in `log_base` can be compensated for with some post-processing (we use log 10 instead of the natural log by default).\nSee the details in [the comment](https://github.com/kan-bayashi/ParallelWaveGAN/issues/169#issuecomment-649320778).\n\n```bash\n# Generate dummy npy files of mel-spectrograms\n$ ipython\n[ins] In [1]: import numpy as np\n[ins] In [2]: x = np.random.randn(512, 80)  # (#frames, #mels)\n[ins] In [3]: np.save(\"dummy_1.npy\", x)\n[ins] In [4]: y = np.random.randn(256, 80)  # (#frames, #mels)\n[ins] In [5]: np.save(\"dummy_2.npy\", y)\n[ins] In [6]: exit\n\n# Make scp file (key-path format); sort so that keys match file names\n$ find . -name \"*.npy\" | sort | awk '{print \"dummy_\" NR \" \" $1}' \u003e feats.scp\n\n# Check (\u003cutt_id\u003e \u003cpath\u003e)\n$ cat feats.scp\ndummy_1 ./dummy_1.npy\ndummy_2 ./dummy_2.npy\n\n# Decode without feature normalization\n# This case assumes that the input mel-spectrogram is normalized with the same statistics as the pretrained model.\n$ parallel-wavegan-decode \\\n    --checkpoint /path/to/checkpoint-400000steps.pkl \\\n    --feats-scp ./feats.scp \\\n    --outdir wav\n2021-08-10 09:13:07,624 
(decode:140) INFO: The number of features to be decoded = 2.\n[decode]: 100%|████████████████████████████████████████| 2/2 [00:00\u003c00:00, 13.84it/s, RTF=0.00264]\n2021-08-10 09:13:29,660 (decode:174) INFO: Finished generation of 2 utterances (RTF = 0.005).\n\n# Decode with feature normalization\n# This case assumes that the input mel-spectrogram is not normalized.\n$ parallel-wavegan-decode \\\n    --checkpoint /path/to/checkpoint-400000steps.pkl \\\n    --feats-scp ./feats.scp \\\n    --normalize-before \\\n    --outdir wav\n2021-08-10 09:13:07,624 (decode:140) INFO: The number of features to be decoded = 2.\n[decode]: 100%|████████████████████████████████████████| 2/2 [00:00\u003c00:00, 13.84it/s, RTF=0.00264]\n2021-08-10 09:13:29,660 (decode:174) INFO: Finished generation of 2 utterances (RTF = 0.005).\n```\n\n## Notes\n\n- The terms of use of each pretrained model follow those of the corpus used for its training. Please check them carefully yourself.  \n- Some code is derived from ESPnet or Kaldi, which are licensed under Apache-2.0.\n\n## References\n\n- [Parallel WaveGAN](https://arxiv.org/abs/1910.11480)\n- [r9y9/wavenet_vocoder](https://github.com/r9y9/wavenet_vocoder)\n- [LiyuanLucasLiu/RAdam](https://github.com/LiyuanLucasLiu/RAdam)\n- [MelGAN](https://arxiv.org/abs/1910.06711)\n- [descriptinc/melgan-neurips](https://github.com/descriptinc/melgan-neurips)\n- [Multi-band MelGAN](https://arxiv.org/abs/2005.05106)\n- [HiFi-GAN](https://arxiv.org/abs/2010.05646)\n- [jik876/hifi-gan](https://github.com/jik876/hifi-gan)\n- [StyleMelGAN](https://arxiv.org/abs/2011.01557)\n\n## Acknowledgement\n\nThe author would like to thank Ryuichi Yamamoto ([@r9y9](https://github.com/r9y9)) for his great repository, paper, and valuable discussions.\n\n## Author\n\nTomoki Hayashi ([@kan-bayashi](https://github.com/kan-bayashi))  \nE-mail: 
`hayashi.tomoki\u003cat\u003eg.sp.m.is.nagoya-u.ac.jp`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkan-bayashi%2FParallelWaveGAN","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkan-bayashi%2FParallelWaveGAN","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkan-bayashi%2FParallelWaveGAN/lists"}