https://github.com/jzi040941/PercepNet
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech
pytorch speech-enhancement
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/jzi040941/PercepNet
- Owner: jzi040941
- License: bsd-3-clause
- Created: 2020-10-19T07:51:55.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2023-01-22T09:36:39.000Z (almost 2 years ago)
- Last Synced: 2024-08-04T13:04:03.528Z (6 months ago)
- Topics: pytorch, speech-enhancement
- Language: C++
- Homepage:
- Size: 31.6 MB
- Stars: 318
- Watchers: 28
- Forks: 91
- Open Issues: 20
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# PercepNet
Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech, described in:

- https://arxiv.org/abs/2008.04259
- https://www.researchgate.net/publication/343568932_A_Perceptually-Motivated_Approach_for_Low-Complexity_Real-Time_Enhancement_of_Fullband_Speech
## Todo
- [X] pitch estimation
- [X] Comb filter
- [X] ERBBand c++ implementation
- [X] Feature(r,g,pitch,corr) Generator(c++) for pytorch
- [X] DNNModel pytorch
- [X] DNNModel c++ implementation
- [ ] Pretrained model
- [X] Postfiltering (done by [@TeaPoly](https://github.com/TeaPoly))

## Requirements
- CMake
- Sox
- Python>=3.6
- PyTorch

## Prepare sampledata
1. Download and synthesize the DNS-Challenge 2020 dataset before executing utils/run.sh for training.
```shell
git clone -b interspeech2020/master https://github.com/microsoft/DNS-Challenge.git
```
2. Follow the Usage instructions in the DNS-Challenge repo (https://github.com/microsoft/DNS-Challenge) on the interspeech2020/master branch. Modify the save directories in DNS-Challenge/noisyspeech_synthesizer.cfg to point to sampledata/speech and sampledata/noise, respectively.

## Build & Training
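Step 3 of this section passes `sampledata/speech/speech.pcm` and `sampledata/noise/noise.pcm` to the feature generator, whereas the DNS-Challenge synthesizer writes WAV files. This README does not say how the `.pcm` files are produced; Sox, listed under Requirements, may well be the intended tool. As one possible alternative, the sketch below concatenates the WAV files of a directory into a single headerless file. It assumes the feature generator reads raw 16-bit mono PCM (the convention used by RNNoise, listed under Reference); that assumption is not confirmed here. The function name `wavs_to_raw_pcm` is hypothetical, and the sketch does not resample, so the sample rate must already match what the feature generator expects.

```python
import wave
from pathlib import Path


def wavs_to_raw_pcm(wav_dir, out_path):
    """Concatenate every 16-bit mono WAV file in wav_dir into one headerless PCM file.

    Hypothetical helper, not part of this repository.
    Returns the number of samples written.
    """
    total_samples = 0
    with open(out_path, "wb") as out:
        # Sort the file names so the output is reproducible across runs.
        for wav_path in sorted(Path(wav_dir).glob("*.wav")):
            with wave.open(str(wav_path), "rb") as wav:
                # Assumption: the feature generator wants mono, 16-bit samples.
                if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
                    raise ValueError(f"{wav_path} is not 16-bit mono")
                frames = wav.readframes(wav.getnframes())
            # WAV sample data is little-endian, so the bytes are copied unchanged.
            out.write(frames)
            total_samples += len(frames) // 2
    return total_samples
```

For example, `wavs_to_raw_pcm("sampledata/speech", "sampledata/speech/speech.pcm")` would write a file at the speech path used in step 3, and the same call with `noise` in place of `speech` would cover the noise input.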
This repository is tested on Ubuntu 20.04 (WSL2).

1. Set up the CMake build environment
```
sudo apt-get install cmake
```
2. Create the binary directory and build
```
mkdir bin && cd bin
cmake ..
make -j
cd ..
```

3. Generate features for training with sampledata
```
bin/src/percepNet sampledata/speech/speech.pcm sampledata/noise/noise.pcm 4000 test.output
```

4. Convert the output binary to h5
```
python3 utils/bin2h5.py test.output training.h5
```

5. Training

Run utils/run.sh:
```shell
cd utils
./run.sh
```

6. Dump the weights from PyTorch to a C++ header
```
python3 dump_percepnet.py model.pt
```

7. Inference
```
cd bin
cmake ..
make -j1
cd ..
bin/src/percepNet_run test_input.pcm percepnet_output.pcm
```

## Acknowledgements
[@jasdasdf](https://github.com/jasdasdf), [@sTarAnna](https://github.com/sTarAnna), [@cookcodes](https://github.com/cookcodes), [@xyx361100238](https://github.com/xyx361100238), [@zhangyutf](https://github.com/zhangyutf), [@TeaPoly](https://github.com/TeaPoly), [@rameshkunasi](https://github.com/rameshkunasi), [@OscarLiau](https://github.com/OscarLiau), [@YangangCao](https://github.com/YangangCao), [Jaeyoung Yang](https://www.linkedin.com/in/jaeyoung-yang-354b21146), [IIP Lab. Sogang Univ](http://iip.sogang.ac.kr/)
## Reference
https://github.com/wil-j-wil/py_bank
https://github.com/dgaspari/pyrapt
https://github.com/xiph/rnnoise
https://github.com/mozilla/LPCNet