https://github.com/wenet-e2e/nn-singal-processing-papers

List of NN-based signal processing papers

Host: GitHub
URL: https://github.com/wenet-e2e/nn-singal-processing-papers
Owner: wenet-e2e
License: apache-2.0
Created: 2023-06-02T14:06:22.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2023-06-05T05:48:05.000Z (about 3 years ago)
Last Synced: 2025-10-23T04:56:03.147Z (8 months ago)
Size: 8.79 KB
Stars: 21
Watchers: 2
Forks: 1
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# nn-singal-processing-papers

List of NN-based signal processing papers

## Adaptive Noise Suppression (Speech Enhancement)

### Time-Frequency Domain

- DCUnet: [Phase-aware speech enhancement with Deep Complex U-Net](https://openreview.net/pdf?id=SkeRTsAcYm) (SNU, ICLR, 2019)

- DCCRN

    - DCCRN: [DCCRN: Deep Complex Convolution Recurrent Network for Phase-Aware Speech Enhancement](https://arxiv.org/pdf/2008.00264.pdf) (NWPU, 2020)

    - DCCRN+: [DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement](https://arxiv.org/pdf/2106.08672.pdf) (NWPU, 2021)

    - S-DCCRN: [S-DCCRN: Super Wide Band DCCRN with learnable complex feature for speech enhancement](https://arxiv.org/pdf/2111.08387.pdf) (NWPU, 2022)

    - Spatial-DCCRN: [Spatial-DCCRN: DCCRN Equipped with Frame-Level Angle Feature and Hybrid Filtering for Multi-Channel Speech Enhancement](https://arxiv.org/pdf/2210.08802.pdf) (NWPU, 2022)

- DesNet: [DesNet: A Multi-Channel Network for Simultaneous Speech Dereverberation, Enhancement and Separation](https://arxiv.org/pdf/2011.02131.pdf) (NWPU, 2020)

- PHASEN: [PHASEN: A Phase-and-Harmonics-Aware Speech Enhancement Network](https://arxiv.org/pdf/1911.04697.pdf) (USTC, AAAI, 2020)

- DPCRN: [DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement](https://arxiv.org/pdf/2107.05429.pdf) (NJU, Interspeech, 2021)

- BSRNN: [High Fidelity Speech Enhancement with Band-split RNN](https://arxiv.org/pdf/2212.00406.pdf) (Tencent, 2022)

- UFormer: [Uformer: A Unet based dilated complex & real dual-path conformer network for simultaneous speech enhancement and dereverberation](https://arxiv.org/pdf/2111.06015.pdf) (NWPU, 2022)

### Time Domain

- WaveUnet: [Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation](https://arxiv.org/pdf/1806.03185.pdf) (QMUL, ISMIR, 2018)

- TCNN: [TCNN: Temporal Convolutional Neural Network for Real-Time Speech Enhancement in the Time Domain](https://web.cse.ohio-state.edu/~wang.77/papers/Pandey-Wang1.icassp19.pdf) (Ohio, ICASSP, 2019)

- DP-SARNN: [Dual-path Self-Attention RNN for Real-Time Speech Enhancement](https://arxiv.org/pdf/2010.12713.pdf) (Ohio, 2021)

## Acoustic Echo Cancellation

- wRLS-DFSMN: [Weighted Recursive Least Square Filter and Neural Network based Residual Echo Suppression for the AEC-Challenge](https://arxiv.org/pdf/2102.08551.pdf) (Alibaba, ICASSP, 2021)

- GCCRN: [Acoustic Echo Cancellation using Deep Complex Neural Network with Nonlinear Magnitude Compression and Phase Information](https://www.isca-speech.org/archive/pdfs/interspeech_2021/peng21f_interspeech.pdf) (IACAS, Interspeech, 2021)

## Automatic Gain Control

## Speech Separation

TODO: add important models from ESPnet and Asteroid.

### Single Channel

### Multiple Channel

## Joint Optimization

- [Deep Learning for Joint Acoustic Echo and Noise Cancellation with Nonlinear Distortions](https://www.isca-speech.org/archive_v0/Interspeech_2019/pdfs/2651.pdf) (Ohio, Interspeech, 2019)

- [Low-Complexity, Real-Time Joint Neural Echo Control and Speech Enhancement Based On PercepNet](https://arxiv.org/pdf/2102.05245.pdf) (Amazon, ICASSP, 2021)

- NN3A: [NN3A: Neural Network supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications](https://arxiv.org/pdf/2110.08437.pdf) (Alibaba, ICASSP, 2022)

## Masking

- IBM: [On Ideal Binary Mask As the Computational Goal of Auditory Scene Analysis](https://web.cse.ohio-state.edu/~wang.77/papers/Wang05.pdf) (Ohio, 2005)

- IRM: [Ideal ratio mask estimation using deep neural networks for robust speech recognition](https://web.cse.ohio-state.edu/~wang.77/papers/Narayanan-Wang2.icassp13.pdf) (Ohio, 2013)

- PSM: [Phase-Sensitive and Recognition-Boosted Speech Separation using Deep Recurrent Neural Networks](https://www.researchgate.net/profile/Jonathan-Le-Roux/publication/308836384_Phase-Sensitive_and_Recognition-Boosted_Speech_Separation_using_Deep_Recurrent_Neural_Networks/links/58bde44545851591c5e9badb/Phase-Sensitive-and-Recognition-Boosted-Speech-Separation-using-Deep-Recurrent-Neural-Networks.pdf) (MERL, ICASSP, 2015)

- CRM: [Complex Ratio Masking for Monaural Speech Separation](http://web.cse.ohio-state.edu/~wang.77/papers/WWW.taslp16.pdf) (Ohio, 2015)
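
As a quick reference for the four masks listed above, the following is a minimal sketch of how each target is commonly computed for a single time-frequency bin. It assumes `s`, `n`, and `y = s + n` are the complex STFT values of clean speech, noise, and the mixture. The function names, the `lc_db` threshold, the `beta` exponent, and the `EPS` guard are illustrative assumptions and do not come from the papers' own code. The CRM paper also applies a compression to the mask before training, which is omitted here.

```python
import math

EPS = 1e-12  # assumed guard against division by zero in silent bins


def ideal_binary_mask(s: complex, n: complex, lc_db: float = 0.0) -> float:
    """IBM: 1.0 if the local SNR of the bin exceeds lc_db, else 0.0."""
    snr_db = 10.0 * math.log10((abs(s) ** 2 + EPS) / (abs(n) ** 2 + EPS))
    return 1.0 if snr_db > lc_db else 0.0


def ideal_ratio_mask(s: complex, n: complex, beta: float = 0.5) -> float:
    """IRM: speech-to-mixture energy ratio raised to beta (beta is an assumption)."""
    ps, pn = abs(s) ** 2, abs(n) ** 2
    return (ps / (ps + pn + EPS)) ** beta


def phase_sensitive_mask(s: complex, y: complex) -> float:
    """PSM: |S|/|Y| * cos(theta_S - theta_Y), which equals Re(S / Y)."""
    return (s * y.conjugate()).real / (abs(y) ** 2 + EPS)


def complex_ratio_mask(s: complex, y: complex) -> complex:
    """CRM (uncompressed): the complex mask M such that M * Y = S."""
    return s * y.conjugate() / (abs(y) ** 2 + EPS)


if __name__ == "__main__":
    s, n = 3 + 4j, 1 - 1j   # toy clean-speech and noise bins
    y = s + n               # mixture bin
    print(ideal_binary_mask(s, n))
    print(ideal_ratio_mask(s, n))
    print(phase_sensitive_mask(s, y))
    print(complex_ratio_mask(s, y) * y)  # recovers s up to rounding
```

Only the complex mask recovers both magnitude and phase when multiplied with the mixture. The IBM and IRM are real-valued and leave the mixture phase untouched, while the PSM folds the phase mismatch into a real-valued gain.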
