https://github.com/marianne-m/brouhaha-vad
Predicts the level of noise and reverberation on your audiofiles
- Host: GitHub
- URL: https://github.com/marianne-m/brouhaha-vad
- Owner: marianne-m
- License: mit
- Created: 2022-09-20T11:46:25.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-22T13:12:10.000Z (6 months ago)
- Last Synced: 2024-08-04T00:11:06.769Z (3 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 90.7 MB
- Stars: 122
- Watchers: 10
- Forks: 22
- Open Issues: 2
[![Installation Testing](https://github.com/marianne-m/brouhaha-vad/actions/workflows/setup.yml/badge.svg)](https://github.com/marianne-m/brouhaha-vad/actions/workflows/setup.yml)
# Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation (2023)
![](doc/brouhaha.png)
This is the companion repository for [*Brouhaha*](https://arxiv.org/abs/2210.13248).
Here you'll find instructions to install and run our pretrained model. Given an audio segment, Brouhaha extracts:
- Speech/non-speech segments
- Speech-to-Noise Ratio (SNR), which measures the speech level relative to the noise level
- C50, which measures how reverberant the environment is

You can listen to some audio samples we generated to train the model [here](https://marvinlvn.github.io/projects/1_project/).
If you want to dig further, you'll also find instructions to run the audio contamination pipeline and retrain a model from scratch.
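
To make the SNR and C50 quantities listed above more concrete, here is a minimal sketch of their usual textbook definitions: SNR as the ratio of speech power to noise power in dB, and C50 as the ratio of the energy in the first 50 ms of a room impulse response to the energy after 50 ms, also in dB. The function names (`power`, `snr_db`, `c50_db`) and the toy inputs are assumptions made for illustration. They are not part of Brouhaha, and the model itself predicts these values directly from audio rather than from a clean speech signal or an impulse response.

```python
import math


def power(samples):
    """Mean squared amplitude of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """Speech-to-noise ratio in dB: 10 * log10(P_speech / P_noise)."""
    return 10.0 * math.log10(power(speech) / power(noise))


def c50_db(impulse_response, sample_rate):
    """Clarity index C50 in dB: early (first 50 ms) over late energy."""
    split = round(0.050 * sample_rate)  # index of the 50 ms boundary
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    return 10.0 * math.log10(early / late)


if __name__ == "__main__":
    # Toy signals, made up for illustration only.
    speech = [1.0, -1.0] * 100         # power 1.0
    noise = [0.1, -0.1] * 100          # power 0.01
    print(snr_db(speech, noise))       # approximately 20 dB

    # Toy impulse response at 1 kHz: 50 ms of strong early energy,
    # followed by 500 ms of a weaker late tail.
    rir = [1.0] * 50 + [0.1] * 500
    print(c50_db(rir, 1000))           # approximately 10 dB
```

A higher SNR means speech dominates the noise, and a higher C50 means more of the energy arrives early, i.e. a less reverberant environment.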
### Installation
```
# clone brouhaha
git clone https://github.com/marianne-m/brouhaha-vad.git
cd brouhaha-vad

# creating a conda environment
conda create -n brouhaha python=3.8
conda activate brouhaha

# install brouhaha
pip install .
```

Depending on the environment you're running the model in, it may be necessary to install libsndfile with the following command:
```
conda install -c conda-forge libsndfile
```

### Extract predictions
```
python brouhaha/main.py apply \
--data_dir path/to/data \
--out_dir path/to/predictions \
--model_path models/best/checkpoints/best.ckpt \
--ext wav
```

### Going further
1) [Run the audio contamination pipeline](https://github.com/marianne-m/brouhaha-maker)
2) [Train your own model](./doc/training.md)

### Citation
```bibtex
@article{lavechin2023brouhaha,
Title = {{Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation}},
Author = {Marvin Lavechin and Marianne Métais and Hadrien Titeux and Alodie Boissonnet and Jade Copet and Morgane Rivière and Elika Bergelson and Alejandrina Cristia and Emmanuel Dupoux and Hervé Bredin},
Year = {2023},
Journal = {ASRU}
}

@inproceedings{Bredin2020,
Title = {{pyannote.audio: neural building blocks for speaker diarization}},
Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
Address = {Barcelona, Spain},
Month = {May},
Year = {2020},
}
```