An open API service indexing awesome lists of open source software.

https://github.com/sony/bigvsan_eval

Evaluation tool used in the BigVSAN paper
https://github.com/sony/bigvsan_eval

pytorch

Last synced: 6 months ago
JSON representation

Evaluation tool used in the BigVSAN paper

Awesome Lists containing this project

README

          

# Vocoder Evaluation

This repository contains the evaluation tool used in **"BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network"** (*[arXiv 2309.02836](https://arxiv.org/abs/2309.02836)*).
Please cite [[1](#citation)] in your work when using this code in your experiments.

## Quick Start

First, prepare an environment
```shell
pip install -r requirements.txt
```

Then, perform an evaluation
```shell
python evaluate.py ...
```
```gt_dir n``` means a directory that contains ground-truth audio files, and ```synth_dir n``` means a directory that contains synthesized audio files. Each file in ```synth_dir n``` needs to have the corresponding file that has the same name in ```gt_dir n```. Also, a corresponding pair needs to be time-aligned in advance.

```evaluate.py``` will output calculated metrics for each ```gt_dir n```-```synth_dir n``` pair and the macro averages of them across all pairs. It will take some time to complete an evaluation.

## Supported evaluation metrics
This toolbox supports the following metrics:

- M-STFT: Multi-resolution short-term Fourier transform
- PESQ: Perceptual evaluation of speech quality
- MCD: Mel-cepstral distortion
- Periodicity: Periodicity error
- V/UV F1: F1 score of voiced/unvoiced classification

## Citation

If you find this tool useful, please consider citing

[1] Shibuya, T., Takida, Y., Mitsufuji, Y.,
"BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network,"
ICASSP 2024.
```bibtex
@inproceedings{shibuya2024bigvsan,
title={{BigVSAN}: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network},
author={Shibuya, Takashi and Takida, Yuhta and Mitsufuji, Yuki},
booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2024}
}
```

## References

> https://github.com/NVIDIA/BigVGAN

> https://github.com/csteinmetz1/auraloss

> https://github.com/ludlows/PESQ

> https://github.com/ttslr/python-MCD

> https://github.com/descriptinc/cargan