Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yoyololicon/constant-memory-waveglow
PyTorch implementation of NVIDIA WaveGlow with constant memory cost.
- Host: GitHub
- URL: https://github.com/yoyololicon/constant-memory-waveglow
- Owner: yoyololicon
- Created: 2018-11-30T02:08:52.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2023-01-28T10:01:30.000Z (about 2 years ago)
- Last Synced: 2024-10-03T12:38:36.514Z (4 months ago)
- Topics: convstant-memory, flows, glow, normalizing-flows, nvidia, pytorch, waveflow, waveglow, wavenet
- Language: Python
- Homepage:
- Size: 23.9 MB
- Stars: 34
- Watchers: 7
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
README
# Constant Memory WaveGlow
[![DOI](https://zenodo.org/badge/159754913.svg)](https://zenodo.org/badge/latestdoi/159754913)

A PyTorch implementation of
[WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
using the constant memory method described in [Training Glow with Constant
Memory Cost](http://bayesiandeeplearning.org/2018/papers/37.pdf).

The model implementation details differ slightly from the
[official implementation](https://github.com/NVIDIA/waveglow) based on
personal preference, and the project structure is borrowed from
[pytorch-template](https://github.com/victoresque/pytorch-template).

We also add implementations of Baidu's [WaveFlow](https://arxiv.org/abs/1912.01219) and of [MelGlow](https://arxiv.org/abs/2012.01684),
which are easier to train and more memory friendly.

In addition to neural vocoders, we also add an implementation of the audio super-resolution model [WSRGlow](https://arxiv.org/abs/2106.08507).
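
The constant memory method relies on the fact that flow layers are invertible, so a layer's input can be recomputed from its output instead of being stored for the backward pass. The following is a minimal pure-Python sketch of that reconstruction step for a single affine coupling layer. It is an illustration under stated assumptions, not the code in this repository: `toy_net`, `coupling_forward`, and `coupling_inverse` are hypothetical names, and plain lists of floats stand in for tensors.

```python
import math


def toy_net(xa):
    """Hypothetical stand-in for the conditioning network of a coupling layer."""
    log_s = [math.tanh(v) for v in xa]  # bounded log-scale
    t = [0.5 * v for v in xa]           # shift
    return log_s, t


def coupling_forward(xa, xb):
    """Affine coupling: xa passes through, xb is scaled and shifted."""
    log_s, t = toy_net(xa)
    yb = [b * math.exp(ls) + sh for b, ls, sh in zip(xb, log_s, t)]
    log_det = sum(log_s)  # log-determinant of the Jacobian
    return xa, yb, log_det


def coupling_inverse(ya, yb):
    """Recover the input from the output; ya equals xa, so toy_net gives the same scale and shift."""
    log_s, t = toy_net(ya)
    xb = [(b - sh) * math.exp(-ls) for b, ls, sh in zip(yb, log_s, t)]
    return ya, xb


xa, xb = [0.3, -1.2], [0.7, 2.0]
ya, yb, log_det = coupling_forward(xa, xb)
xa_rec, xb_rec = coupling_inverse(ya, yb)
max_err = max(abs(u - v) for u, v in zip(xb, xb_rec))
print("max reconstruction error:", max_err)
```

Because the input is recoverable up to floating-point error, a training loop built on this idea only needs to keep the final output of the flow and can rebuild earlier activations layer by layer while backpropagating.
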
## Requirements
After installing the requirements from [pytorch-template](https://github.com/victoresque/pytorch-template#requirements), install the additional packages:
```commandline
pip install nnAudio torch_optimizer
```

## Quick Start
Set `data_dir` in the JSON config file to a directory containing wave files that all share the same sampling rate,
and you are good to go. The mel-spectrogram is computed on the fly.

```json
{
  "data_loader": {
    "type": "RandomWaveFileLoader",
    "args": {
      "data_dir": "/your/data/wave/files",
      "batch_size": 8,
      "num_workers": 2,
      "segment": 16000
    }
  }
}
```

```
python train.py -c config.json
```

## Memory consumption of model training in PyTorch
| Model | Memory (MB) |
|---------------------------------------------------|:-------------:|
| WaveGlow, channels=256, batch size=24 (naive) | N.A. |
| WaveGlow, channels=256, batch size=24 (efficient) | 4951 |

## Result
### WaveGlow
I trained the model on some cello music pieces from MusicNet using `musicnet_config.json`.
The clips in the `samples` folder are what I got. Although the audio quality is not very good, they show that
WaveGlow can be used for music generation as well.
The generation speed is around 470 kHz on a 1080 Ti.

### WaveFlow
I trained on the full LJ Speech dataset using `waveflow_LJ_speech.json`. The settings correspond to the **64 residual channels, h=64** model in the paper. After about 1.25M training steps, the audio quality is very similar to their official examples.
Samples generated from training data can be listened to [here](samples/waveflow_64chs).

### MelGlow
Coming soon.
### WSRGlow
Pre-trained models on the VCTK dataset are available [here](). We follow the settings of [NU-Wave](https://arxiv.org/abs/2104.02321) to prepare the training data.
## Citation
If you use our code in any project or research, please cite:

```bibtex
@misc{memwaveglow,
doi = {10.5281/zenodo.3874330},
author = {Chin Yun Yu},
title = {Constant Memory WaveGlow: A PyTorch implementation of WaveGlow with constant memory cost},
howpublished = {\url{https://github.com/yoyololicon/constant-memory-waveglow}},
year = {2019}
}
```