Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
A minimum unofficial implementation of the "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch
https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
cnn cnn-rnn pytorch real-time rnn speech-enhancement speech-processing
Last synced: 3 months ago
JSON representation
A minimum unofficial implementation of the "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN) using PyTorch
- Host: GitHub
- URL: https://github.com/haoxiangsnr/A-Convolutional-Recurrent-Neural-Network-for-Real-Time-Speech-Enhancement
- Owner: haoxiangsnr
- Created: 2019-08-12T08:02:05.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-09-05T00:50:59.000Z (about 4 years ago)
- Last Synced: 2024-05-16T12:58:25.953Z (6 months ago)
- Topics: cnn, cnn-rnn, pytorch, real-time, rnn, speech-enhancement, speech-processing
- Language: Python
- Homepage:
- Size: 43.9 KB
- Stars: 293
- Watchers: 7
- Forks: 53
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement
A minimum unofficial implementation of the [A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement (CRN)](https://www.isca-speech.org/archive/Interspeech_2018/pdfs/1405.pdf) using PyTorch.
## ToDo
- [x] Real-time version
- [x] Update trainer
- [x] Visualization of the spectrogram and the metrics (PESQ, STOI, SI-SDR) in the training
- [ ] More docs## Usage
Training:
```
python train.py -C config/train/baseline_model.json5
```Inference:
```
python inference.py \
-C config/inference/basic.json5 \
-cp ~/Experiments/CRN/baseline_model/checkpoints/latest_model.tar \
-dist ./enhanced
```Check out the README of [Wave-U-Net for SE](https://github.com/haoxiangsnr/Wave-U-Net-for-Speech-Enhancement) to learn more.
## Performance
PESQ, STOI, SI-SDR on DEMAND - Voice Bank test dataset, for reference only:
| Experiment | PESQ | SI-SDR | STOI |
| --- | --- | --- | --- |
|Noisy | 1.979 | 8.511| 0.9258|
|CRN | 2.528| 17.71| 0.9325|
|CRN signal approximation |2.606 |17.84 |0.9382|## Dependencies
- Python==3.\*.\*
- torch==1.\*
- librosa==0.7.0
- tensorboard
- pesq
- pystoi
- matplotlib
- tqdm## References
- [CRNN_mapping_baseline](https://github.com/YangYang/CRNN_mapping_baseline)
- [A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement](https://web.cse.ohio-state.edu/~wang.77/papers/Tan-Wang1.interspeech18.pdf)
- [EHNet](https://github.com/ododoyo/EHNet)
- [Convolutional-Recurrent Neural Networks for Speech Enhancement](https://arxiv.org/abs/1805.00579)