https://github.com/marianne-m/brouhaha-vad

Predicts the level of noise and reverberation in your audio files

Host: GitHub
URL: https://github.com/marianne-m/brouhaha-vad
Owner: marianne-m
License: mit
Created: 2022-09-20T11:46:25.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-05-22T13:12:10.000Z (9 months ago)
Last Synced: 2024-08-04T00:11:06.769Z (6 months ago)
Language: Jupyter Notebook
Homepage:
Size: 90.7 MB
Stars: 122
Watchers: 10
Forks: 22
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE


[![Installation Testing](https://github.com/marianne-m/brouhaha-vad/actions/workflows/setup.yml/badge.svg)](https://github.com/marianne-m/brouhaha-vad/actions/workflows/setup.yml)

# Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation (2023)

![](doc/brouhaha.png)

This is the companion repository for [*Brouhaha*](https://arxiv.org/abs/2210.13248).

It contains the instructions to install and run our pretrained model. Given an audio segment, Brouhaha extracts:

- Speech/non-speech segments

- Speech-to-Noise Ratio (SNR), which measures the speech level relative to the noise level

- C50, which measures how reverberant the environment is
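
For readers unfamiliar with the last two measures, the sketch below computes them from their usual textbook definitions: SNR as the ratio of speech power to noise power in decibels, and C50 as the ratio of the room impulse response energy in the first 50 ms to the energy arriving later, also in decibels. It only illustrates what the quantities mean and is not how Brouhaha predicts them; the function names and toy signals are hypothetical.

```python
import math


def snr_db(speech, noise):
    """Speech-to-noise ratio in dB from two sample sequences (textbook definition)."""
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(speech_power / noise_power)


def c50_db(impulse_response, sample_rate):
    """Clarity index C50 in dB: early (first 50 ms) over late energy.

    Assumes the impulse response starts at the direct sound (index 0).
    """
    split = int(round(0.050 * sample_rate))
    early = sum(h * h for h in impulse_response[:split])
    late = sum(h * h for h in impulse_response[split:])
    return 10.0 * math.log10(early / late)


# Toy data: speech amplitude is 10x the noise amplitude,
# so the power ratio is 100, i.e. about 20 dB.
speech = [1.0, -1.0] * 100
noise = [0.1, -0.1] * 100
print(round(snr_db(speech, noise), 2))

# Toy impulse response sampled at 1 kHz: 50 early samples followed by
# 50 late samples that are 10x smaller in amplitude, i.e. about 20 dB.
ir = [1.0] * 50 + [0.1] * 50
print(round(c50_db(ir, 1000), 2))
```

With these definitions, a higher SNR means cleaner speech and a higher C50 means a less reverberant environment.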

You can listen to some audio samples we generated to train the model [here](https://marvinlvn.github.io/projects/1_project/).

If you want to dig further, you'll also find instructions for running the audio contamination pipeline and for retraining a model from scratch.

### Installation

```
# clone brouhaha
git clone https://github.com/marianne-m/brouhaha-vad.git
cd brouhaha-vad

# creating a conda environment
conda create -n brouhaha python=3.8
conda activate brouhaha

# install brouhaha
pip install .
```

Depending on the environment you're running the model in, it may be necessary to install libsndfile with the following command:

```
conda install -c conda-forge libsndfile
```
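
One way to tell whether that extra step is needed is to try loading libsndfile from Python. The sketch below assumes that audio I/O in your environment goes through the `soundfile` package, which raises `OSError` on import when it cannot load libsndfile; that assumption, and the helper name `check_libsndfile`, are not taken from this repository.

```python
def check_libsndfile():
    """Report whether the soundfile package can load libsndfile.

    Assumption: audio I/O in this environment relies on the `soundfile`
    package, which raises OSError on import when libsndfile is missing.
    """
    try:
        import soundfile
    except ImportError:
        return "soundfile package is not installed"
    except OSError as error:
        return f"libsndfile could not be loaded: {error}"
    version = getattr(soundfile, "__libsndfile_version__", "unknown version")
    return f"libsndfile is available ({version})"


print(check_libsndfile())
```

If the message reports that libsndfile could not be loaded, the conda command above is the fix to try first.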

### Extract predictions

```
python brouhaha/main.py apply \
      --data_dir path/to/data \
      --out_dir path/to/predictions \
      --model_path models/best/checkpoints/best.ckpt \
      --ext wav
```
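
If you have several folders of audio, the command above can be scripted. The following is a minimal sketch that uses only the flags shown above; the helper names `build_apply_command` and `apply_to_corpora`, and the one-sub-directory-per-corpus layout, are hypothetical. It defaults to a dry run that only returns the commands it would execute.

```python
import subprocess
import sys
from pathlib import Path


def build_apply_command(data_dir, out_dir,
                        model_path="models/best/checkpoints/best.ckpt",
                        ext="wav"):
    """Build the argument list for `brouhaha/main.py apply`."""
    return [
        sys.executable, "brouhaha/main.py", "apply",
        "--data_dir", str(data_dir),
        "--out_dir", str(out_dir),
        "--model_path", str(model_path),
        "--ext", ext,
    ]


def apply_to_corpora(corpora_root, predictions_root, dry_run=True):
    """Build (and optionally run) one command per sub-directory of corpora_root."""
    commands = []
    for corpus in sorted(Path(corpora_root).iterdir()):
        if not corpus.is_dir():
            continue
        command = build_apply_command(corpus, Path(predictions_root) / corpus.name)
        commands.append(command)
        if not dry_run:
            # Must be launched from the root of the cloned brouhaha-vad repository.
            subprocess.run(command, check=True)
    return commands


# Show the command for a single folder without running it.
print(" ".join(build_apply_command("path/to/data", "path/to/predictions")))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.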

### Going further

1) [Run the audio contamination pipeline](https://github.com/marianne-m/brouhaha-maker)

2) [Train your own model](./doc/training.md)

### Citation

```bibtex
@article{lavechin2023brouhaha,
  Title   = {{Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation}},
  Author  = {Marvin Lavechin and Marianne Métais and Hadrien Titeux and Alodie Boissonnet and Jade Copet and Morgane Rivière and Elika Bergelson and Alejandrina Cristia and Emmanuel Dupoux and Hervé Bredin},
  Year    = {2023},
  Journal = {ASRU}
}

@inproceedings{Bredin2020,
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
  Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
  Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
  Address = {Barcelona, Spain},
  Month = {May},
  Year = {2020},
}
```