https://github.com/markovka17/dla
Deep learning for audio processing
- Host: GitHub
- URL: https://github.com/markovka17/dla
- Owner: markovka17
- License: mit
- Created: 2020-08-23T13:07:56.000Z (about 4 years ago)
- Default Branch: 2024
- Last Pushed: 2024-10-20T18:01:23.000Z (23 days ago)
- Last Synced: 2024-10-20T21:55:33.675Z (22 days ago)
- Topics: deep-learning, keyword-spotting, signal-processing, speaker-verification, speech-recognition, tts, voice-conversion
- Language: Jupyter Notebook
- Homepage:
- Size: 60.4 MB
- Stars: 579
- Watchers: 24
- Forks: 103
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
![logo5v1](https://user-images.githubusercontent.com/20357655/104316876-2be04600-54ee-11eb-93ed-f9835fde1527.jpg)
# Deep Learning for Audio (DLA)
- Lecture and seminar materials for each week are in `./week*` folders, see `README.md` for materials and instructions
- For technical issues, bugs in course materials, or contribution ideas, open an issue
- The current version of the course is conducted in **autumn 2024** at the [CS Faculty](https://cs.hse.ru/en/) of [HSE](https://www.hse.ru/en/). For previous years' versions, see the [Past Versions](#past-versions) section.
# Syllabus
- [**week01**](./week01) Introduction to Course
  - Lecture: Introduction to Course
  - Seminar: Experiment tracking, `Hydra`, `Git`, `VS code`
  - Self-Study: Introduction to `PyTorch`
- [**week02**](./week02) Introduction to Digital Signal Processing
  - Lecture: Signals, Fourier Transform, spectrograms, MelScale, MFCC
  - Seminar: DSP in practice, spectrogram creation, IRF, frequency filtering
- [**week03**](./week03) Speech Recognition I
  - Lecture: Metrics, Datasets, Connectionist Temporal Classification (CTC), Classic Models, Beam Search, Language models
  - Seminar: Audio Augmentations, Beam Search
  - Q&A Session: Homework discussion, R&D coding tips
- [**week04**](./week04) Speech Recognition II
  - Lecture: LAS, RNN-T, Language models for RNN-T and LAS
  - Seminar: Hybrid RNN-T and CTC model training and inference
- [**week05**](./week05) Guest Lecture. Speech Recognition III and Audio SSL
  - Lecture: Self-Supervised Models for Audio, Audio LLMs
- [**week06**](./week06) Source Separation I
  - Lecture: A review of general Source Separation and Denoising, Encoder-Decoder-Separator architectures, Demucs family, DCCRN, FullSubNet+, BandSplitRNN
  - Seminar: Metrics
- [**week07**](./week07) Source Separation II
  - Lecture: Speech separation, Blind and Target Separation, Recurrent (TasNet, DPRNN, VoiceFilter) and CNN (ConvTasNet, SpEx+)
  - Seminar: WienerFilter, SincFilter and DEMUCS; streaming processing and performance metrics
# Homeworks and Projects
- [**HW_ASR**](./hw1_asr) Training a speech recognition model
- [**Project_AVSS**](./project_avss) Training an audio-visual speech separation model

See our [project template](https://github.com/Blinorot/pytorch_project_template).
# Resources
- [Lecture recordings on YouTube (in Russian)](https://youtube.com/playlist?list=PLYG3WHDP5CWVRxLjXZbllqIQTWY_QjKmz)
Some of the weeks have English recordings. See the corresponding sub-directories.
# Contributors & course staff
Course materials and teaching (in different years) were delivered by:
- [Maxim Kaledin](https://t.me/XuMuK_MK)
- [Petr Grinberg](https://t.me/Blinorot)
- [Grigory Fedorov](https://t.me/fedorovgv)
- [Aibek Alanov](https://t.me/aibrain)
- [Alexander Markovich (previously)](https://t.me/markovka17)
- [Daniil Ivanov (previously)](https://t.me/the_longest_id_in_the_world)
- [Ilya Lewin (previously)](https://t.me/levensons)
- [Timofey Smirnov (previously)](https://t.me/timothyxp)
- [Alexander Mamaev (previously)](https://t.me/alxmamaev)
# Past Versions
- [2023](https://github.com/markovka17/dla/tree/2023)
- [2022](https://github.com/markovka17/dla/tree/2022)
- [2021](https://github.com/markovka17/dla/tree/2021)
- [2020](https://github.com/markovka17/dla/tree/2020)