https://github.com/alireza-akhavan/audio-recognition
Deep learning tutorials about audio data
- Host: GitHub
- URL: https://github.com/alireza-akhavan/audio-recognition
- Owner: Alireza-Akhavan
- Created: 2022-06-16T14:34:04.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-06-23T15:53:48.000Z (over 3 years ago)
- Last Synced: 2025-03-26T21:33:30.137Z (7 months ago)
- Size: 1.95 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Important papers and implementation
- A. Graves, A. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” in Proc. ICASSP, Vancouver, 2013.
- J. Chorowski, D. Bahdanau, D. Serdyuk, K. Cho, and Y. Bengio, “Attention-based models for speech recognition,” in Proc. NIPS, 2015.

### Deep Speech 2 (ICML 2016)
- [Paper](https://arxiv.org/abs/1512.02595)
- [Official code in PaddlePaddle (Baidu's framework)](https://github.com/PaddlePaddle/PaddleSpeech)
- [TensorFlow Deep Speech 2](https://github.com/tensorflow/models/blob/238922e98dd0e8254b5c0921b241a1f5a151782f/research/deep_speech)
- [Simplified tutorial in keras.io](https://keras.io/examples/audio/ctc_asr/)
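#### Notes
- "SortaGrad": present utterances in order of increasing length during the first epoch, then return to random minibatch order.
- Batch normalization ("BatchNorm") is used to stabilize and speed up training of the deep recurrent network.
- The model is trained with the CTC (Connectionist Temporal Classification) loss.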
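
The SortaGrad note above can be illustrated with a small, framework-free sketch. This is not the code from the paper or from any of the linked repositories; the function names (`make_batches`, `sortagrad_batches`), the `(id, length)` tuple layout, and the seeding scheme are all assumptions made for illustration. It orders utterances by length in the first epoch and visits the same length-bucketed minibatches in random order afterwards.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical representation: (utterance id, length in frames).
Utterance = Tuple[str, int]


def make_batches(utterances: Sequence[Utterance], batch_size: int) -> List[List[Utterance]]:
    """Split utterances into consecutive minibatches of at most batch_size items."""
    return [list(utterances[i:i + batch_size]) for i in range(0, len(utterances), batch_size)]


def sortagrad_batches(
    utterances: Sequence[Utterance], batch_size: int, epoch: int, seed: int = 0
) -> List[List[Utterance]]:
    """Return the minibatches for one epoch (epochs are 0-indexed).

    Epoch 0: batches are built from length-sorted utterances and visited
    shortest-first. Later epochs: the same batches, visited in random order.
    """
    by_length = sorted(utterances, key=lambda u: u[1])
    batches = make_batches(by_length, batch_size)
    if epoch > 0:
        # Shuffle whole batches so each batch still holds similar-length utterances.
        random.Random(seed + epoch).shuffle(batches)
    return batches


# Toy data, made up for this example.
data = [("a", 50), ("b", 10), ("c", 30), ("d", 20), ("e", 40)]
first_epoch = sortagrad_batches(data, batch_size=2, epoch=0)
print([[uid for uid, _ in batch] for batch in first_epoch])
# → [['b', 'd'], ['c', 'e'], ['a']]
```

Keeping the batches bucketed by length after the first epoch is a design choice of this sketch: it limits padding inside each batch while still randomizing the order in which batches are seen.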