https://github.com/zvikinoza/masr

fourier-transform keras-tensorflow librosa sound-processing speech-recognition speech-to-text


# Mini Automatic Speech Recognition
## Using Fourier transforms and a VGG-like CNN architecture

![Spectrogram of a spoken number](https://github.com/zvikiNozadze/MASR/blob/master/imgs/number_spectogram.png)

EDA and modeling on ≈150 audio samples of the spoken numbers from 1 to 5, recorded by 10 people.
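As a rough illustration of the Fourier-transform step, the sketch below turns a waveform into a log-magnitude spectrogram with a short-time Fourier transform, using only NumPy. It is not the repository's code: the function name `spectrogram`, the frame length, the hop size and the 16 kHz sample rate are all assumptions chosen for the example, and a synthetic 440 Hz tone stands in for a recording.

```python
import numpy as np


def spectrogram(signal, frame_len=512, hop=128):
    """Log-magnitude spectrogram via a short-time Fourier transform.

    frame_len and hop are illustrative defaults, not values from the project.
    Returns an array of shape (frequency_bins, frames).
    """
    window = np.hanning(frame_len)
    # Pad very short signals so at least one full frame exists.
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Real FFT of each frame gives frame_len // 2 + 1 frequency bins.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids log(0).
    return 20 * np.log10(magnitude + 1e-10).T


# Synthetic one-second 440 Hz tone at an assumed 16 kHz sample rate.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 512  # bin width is sr / frame_len
```

The resulting 2-D array can be treated as a single-channel image, which is what makes an image-style CNN applicable to audio.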



Accuracy on a large (>10k) dataset: 94%.

Accuracy on the given small dataset: 93%.


The task was somewhat challenging because of the small dataset.


Augmentation techniques:
* increase/decrease pitch
* increase/decrease speed
* stretching
* frequency and time masking
* white noise injection
* time shifting
* overlay 2 samples (quiet and louder)
* pre/post noise padding
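Several of the augmentations listed above can be sketched in a few lines of NumPy. The functions below are hypothetical helpers written for illustration, not the project's implementation, and every parameter value (noise level, gain, mask widths) is an assumption. Speed change is done here by plain resampling, which also shifts pitch. Pitch shifting and stretching that keep the other property fixed are left out because they need a phase vocoder; librosa provides `librosa.effects.pitch_shift` and `librosa.effects.time_stretch` for that.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the example is repeatable


def add_white_noise(wave, noise_level=0.005):
    """White noise injection: add Gaussian noise to every sample."""
    return wave + noise_level * rng.standard_normal(len(wave))


def time_shift(wave, shift):
    """Shift by `shift` samples (positive = later), filling the gap with silence."""
    out = np.zeros_like(wave)
    if shift > 0:
        out[shift:] = wave[:-shift]
    elif shift < 0:
        out[:shift] = wave[-shift:]
    else:
        out[:] = wave
    return out


def change_speed(wave, rate):
    """Resample by linear interpolation; rate > 1 plays faster (and higher)."""
    n_out = int(round(len(wave) / rate))
    new_idx = np.linspace(0, len(wave) - 1, n_out)
    return np.interp(new_idx, np.arange(len(wave)), wave)


def overlay(loud, quiet, quiet_gain=0.3):
    """Mix a quieter sample under a louder one, trimmed to the shorter length."""
    n = min(len(loud), len(quiet))
    return loud[:n] + quiet_gain * quiet[:n]


def pad_with_noise(wave, pre, post, noise_level=0.005):
    """Pre/post noise padding: surround the sample with low-level noise."""
    before = noise_level * rng.standard_normal(pre)
    after = noise_level * rng.standard_normal(post)
    return np.concatenate([before, wave, after])


def mask_spectrogram(spec, freq_width, time_width):
    """Frequency and time masking on a (freq_bins, frames) array.

    One band of rows and one band of columns are overwritten with the
    minimum value of the spectrogram (the quietest level present).
    """
    out = spec.copy()
    f0 = rng.integers(0, spec.shape[0] - freq_width + 1)
    t0 = rng.integers(0, spec.shape[1] - time_width + 1)
    fill = spec.min()
    out[f0 : f0 + freq_width, :] = fill
    out[:, t0 : t0 + time_width] = fill
    return out


# Usage on a synthetic one-second tone (assumed 16 kHz sample rate).
sr = 16000
wave = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
noisy = add_white_noise(wave)
shifted = time_shift(wave, 1600)    # 0.1 s later
faster = change_speed(wave, 1.25)   # 25% faster
padded = pad_with_noise(wave, 800, 800)
mixed = overlay(wave, noisy)
```

The waveform-level functions apply before the Fourier transform, while masking applies to the spectrogram afterwards.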


The data is split by speaker, with 5 speakers on each side of the split.
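A speaker-level split can be sketched as follows. It keeps every recording of a given speaker in the same subset, so evaluation is done on voices the model has never heard. The function `split_by_speaker`, the tuple layout and the file names are hypothetical, and treating the two halves as a training set and a held-out set is an assumption, since the text only states a 5-5 split.

```python
import random


def split_by_speaker(samples, n_train_speakers=5, seed=0):
    """Split (speaker_id, label, path) tuples so that no speaker is shared.

    Returns (train, held_out); which half plays which role is an assumption.
    """
    speakers = sorted({speaker for speaker, _, _ in samples})
    random.Random(seed).shuffle(speakers)
    train_speakers = set(speakers[:n_train_speakers])
    train = [s for s in samples if s[0] in train_speakers]
    held_out = [s for s in samples if s[0] not in train_speakers]
    return train, held_out


# Hypothetical dataset: 10 speakers x 5 digits x 3 takes = 150 samples.
samples = [
    (f"speaker_{p}", digit, f"speaker_{p}_{digit}_{take}.wav")
    for p in range(10)
    for digit in range(1, 6)
    for take in range(3)
]

train, held_out = split_by_speaker(samples)
```

Splitting at random over individual recordings instead would let the same voice appear on both sides and tend to inflate the measured accuracy.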




Training graph:

![Training accuracy graph](https://github.com/zvikiNozadze/MASR/blob/master/imgs/acc_graph.png)