https://github.com/zvikinoza/masr

Mini Automatic Speech Recognition

fourier-transform keras-tensorflow librosa sound-processing speech-recognition speech-to-text

Host: GitHub
URL: https://github.com/zvikinoza/masr
Owner: zvikinoza
Created: 2020-01-04T12:45:50.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2022-09-26T19:29:50.000Z (almost 4 years ago)
Last Synced: 2025-01-26T15:16:56.561Z (over 1 year ago)
Topics: fourier-transform, keras-tensorflow, librosa, sound-processing, speech-recognition, speech-to-text
Language: Jupyter Notebook
Size: 84.1 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Mini Automatic Speech Recognition
## Using Fourier transforms and a VGG-like CNN architecture

![ ](https://github.com/zvikiNozadze/MASR/blob/master/imgs/number_spectogram.png)
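A spectrogram like the one pictured above can be computed with a short-time Fourier transform. The following is a minimal sketch and not the notebook's actual code: it uses plain NumPy rather than librosa, and the function name `log_spectrogram`, the frame length, the hop size, and the 16 kHz sample rate are all assumptions chosen for illustration.

```python
import numpy as np


def log_spectrogram(signal, frame_len=512, hop=128):
    """Return a log-magnitude spectrogram of shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative defaults, not values taken from the repo.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) < frame_len:
        # Zero-pad short clips so that at least one full frame exists.
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of each frame gives the magnitude per frequency bin.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range and is safe at zero magnitude.
    return np.log1p(magnitude)


# Hypothetical input: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
```

The resulting 2-D array (time frames by frequency bins) can be treated as a single-channel image, which is what makes an image-style CNN applicable to audio.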

EDA and modeling on ≈150 audio samples of the spoken numbers 1 to 5, recorded by 10 people.

Accuracy on large (>10k) dataset: 94%.

Accuracy on the given small dataset: 93%.

The task was fairly challenging because of the small dataset.

Augmentation techniques:
* increase/decrease pitch
* increase/decrease speed
* stretching
* frequency and time masking
* white noise injection
* time shifting
* overlay 2 samples (quiet and louder)
* pre/post noise padding
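
Several of the augmentations listed above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the repository's implementation: all function names, default parameters, and the 16 kHz sample rate are hypothetical. Pitch shifting and duration-preserving stretching are left out because they need a phase vocoder; librosa provides `librosa.effects.pitch_shift` and `librosa.effects.time_stretch` for those.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def add_white_noise(signal, noise_level=0.005):
    """Inject Gaussian white noise scaled by noise_level."""
    return signal + noise_level * rng.standard_normal(len(signal))


def time_shift(signal, max_shift=1600):
    """Shift the clip by a random offset, filling the gap with zeros."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.zeros_like(signal)
    if shift > 0:
        shifted[shift:] = signal[:-shift]
    elif shift < 0:
        shifted[:shift] = signal[-shift:]
    else:
        shifted[:] = signal
    return shifted


def change_speed(signal, rate=1.1):
    """Resample by linear interpolation; rate > 1 shortens the clip.

    Simple resampling changes pitch together with speed.
    """
    n_out = int(round(len(signal) / rate))
    positions = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(positions, np.arange(len(signal)), signal)


def overlay(loud, quiet, quiet_gain=0.3):
    """Mix a quieter clip under a louder one, truncated to the shorter length."""
    n = min(len(loud), len(quiet))
    return loud[:n] + quiet_gain * quiet[:n]


def pad_with_noise(signal, pre=800, post=800, noise_level=0.005):
    """Prepend and append short stretches of low-level noise."""
    pre_noise = noise_level * rng.standard_normal(pre)
    post_noise = noise_level * rng.standard_normal(post)
    return np.concatenate([pre_noise, signal, post_noise])


def mask_spectrogram(spec, max_time=10, max_freq=8):
    """Zero one random band of time frames and one band of frequency bins."""
    masked = spec.copy()
    n_frames, n_bins = masked.shape
    t_width = int(rng.integers(0, max_time + 1))
    f_width = int(rng.integers(0, max_freq + 1))
    t_start = int(rng.integers(0, max(1, n_frames - t_width)))
    f_start = int(rng.integers(0, max(1, n_bins - f_width)))
    masked[t_start : t_start + t_width, :] = 0.0
    masked[:, f_start : f_start + f_width] = 0.0
    return masked


# Hypothetical clip: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
augmented = pad_with_noise(time_shift(add_white_noise(clip)))
```

The waveform-level functions compose freely, as in the last line, while `mask_spectrogram` operates on the 2-D spectrogram after the Fourier transform step.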

The data is split by speaker, 5-5: five of the ten speakers go in each part, so no speaker appears on both sides of the split.
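
A speaker-level split can be sketched as follows. This is a hedged illustration that assumes the two parts are a training set and a test set; the function name `split_by_speaker`, the `(speaker_id, item)` layout, and the synthetic data are hypothetical and not taken from the notebook.

```python
import random


def split_by_speaker(samples, n_train_speakers=5, seed=0):
    """Split (speaker_id, item) pairs so that speakers never cross the split."""
    speakers = sorted({speaker for speaker, _ in samples})
    # Shuffle speakers, not samples, so every clip of a speaker stays together.
    random.Random(seed).shuffle(speakers)
    train_speakers = set(speakers[:n_train_speakers])
    train = [s for s in samples if s[0] in train_speakers]
    test = [s for s in samples if s[0] not in train_speakers]
    return train, test


# Hypothetical data: 10 speakers, each saying the digits 1 to 5 three times.
samples = [
    (f"speaker_{s}", digit)
    for s in range(10)
    for digit in range(1, 6)
    for _ in range(3)
]
train, test = split_by_speaker(samples)
```

Splitting by speaker rather than by clip keeps the evaluation honest: the model is scored on voices it has never heard, which matters more than usual with only ten speakers.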

Training graph:

![ ](https://github.com/zvikiNozadze/MASR/blob/master/imgs/acc_graph.png)
