https://github.com/Alexander-H-Liu/NPC
Non-Autoregressive Predictive Coding
https://github.com/Alexander-H-Liu/NPC
Last synced: about 1 month ago
JSON representation
Non-Autoregressive Predictive Coding
- Host: GitHub
- URL: https://github.com/Alexander-H-Liu/NPC
- Owner: Alexander-H-Liu
- License: mit
- Created: 2020-10-27T20:15:40.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2020-11-03T07:54:06.000Z (over 4 years ago)
- Last Synced: 2024-10-30T01:43:25.302Z (6 months ago)
- Language: Python
- Size: 168 KB
- Stars: 50
- Watchers: 4
- Forks: 5
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Non-Autoregressive Predictive Coding
This repository contains the implementation of Non-Autoregressive Predictive Coding (NPC) as described in [the preprint paper](https://arxiv.org/abs/2011.00406) submitted to ICASSP 2021.
A quick example for training NPC
```
python main.py --config config/self_supervised/npc_example.yml \
--task self-learning
```- For more complete examples including downstream tasks, please see [the example script](eg.sh).
- For preparing data, please visit [preprocess](preprocess/).
- For detailed hyperparameters setting and description, please checkout [example config file of NPC](config/self_supervised/npc_example.yml).
- For all run-time options, use `-h` flag.
- Implementation of [Autoregressive Predictive Coding](https://arxiv.org/abs/1910.12607) (APC, 2019, Chung *et al*.) and [Vector-Quantized APC](https://arxiv.org/abs/2005.08392) (VQ-APC, 2020, Chung *et al*.) are also available using similar training/downstream execution with example config files [here](config/self_supervised/vqapc_example.yml).
## Some notes
- We found the unmasked feature produced by the last ConvBlock layer a better representation. In the phone classification tasks, switching to the unmasked feature (PER 25.6%) provided a 1.6% improvement over the masked feature (PER 27.2%). Currently, this is not included in the preprint version and will be updated to the paper in the future. Please refer to [downstream examples](config/downstream) to activate this option.
- APC/VQ-APC are implemented with the following modifications for improvement (for the unmodified version, please visit the official implementation of [APC](https://github.com/iamyuanchung/Autoregressive-Predictive-Coding) / [VQAPC](https://github.com/iamyuanchung/VQ-APC/tree/96230cc358b174b736b4c0e7664b3e72b304d9b0))
- Multi-group VQ available for VQ-APC, but with VQ on last layer only
- Using utterance-wised CMVN surface feature(just as NPC did)
- Using Gumbel Softmax from [official API](https://pytorch.org/docs/stable/nn.functional.html#gumbel-softmax) of pytorch
- See [package requirement](requirements.txt) for toolkits used, `tensorboard` can be used to access log files in `--logdir`.
## Contact
Feel free to contact me for questions or feedbacks, my email can be found in the paper or my [personal page](https://alexander-h-liu.github.io).
## Citation
If you find our work and/or this repository helpful, please do consider citing us
```
@article{liu2020nonautoregressive,
title = {Non-Autoregressive Predictive Coding for Learning Speech Representations from Local Dependencies},
author = {Liu, Alexander and Chung, Yu-An and Glass, James},
journal = {arXiv preprint arXiv:2011.00406},
year = {2020}
}
```