https://github.com/ragibson/mfcc-speech-recognition
Real-time speech recognition via "Mel-Frequency Cepstral Coefficients" neural networks.
- Host: GitHub
- URL: https://github.com/ragibson/mfcc-speech-recognition
- Owner: ragibson
- License: mit
- Created: 2019-05-22T22:36:09.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-23T21:48:30.000Z (about 7 years ago)
- Last Synced: 2025-07-05T07:07:06.321Z (11 months ago)
- Topics: machine-learning, mfcc, neural-network, pytorch, real-time, speech-recognition
- Language: Jupyter Notebook
- Size: 1.05 MB
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# MFCC-speech-recognition
This repository contains an easy-to-train machine learning architecture that
can recognize speech commands in real time on low-end commodity hardware.
Specifically, the architecture feeds "Mel-frequency cepstral coefficients"
(MFCCs) as input features to a small neural network, achieving
"near state-of-the-art" classification accuracy.
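For readers unfamiliar with MFCCs, the following is a minimal, dependency-free sketch of the textbook recipe for a single audio frame: window, power spectrum, triangular mel filterbank, log, then a DCT-II. It is not taken from this repository's notebook. The function names, the 16 kHz sample rate, the 256-sample frame, the 20 filters, and the 13 coefficients are all illustrative assumptions, and a real pipeline would normally use an FFT-based library routine instead of the naive DFT shown here.

```python
import cmath
import math


def hz_to_mel(freq_hz):
    """Convert Hz to mels using the common 2595 * log10(1 + f / 700) variant."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Hamming-window the frame and return its one-sided power spectrum.

    Uses a naive O(n^2) DFT to stay dependency-free (illustration only).
    """
    n = len(frame)
    windowed = [
        x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(windowed)
        )
        spectrum.append(abs(total) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, frame_len, sample_rate):
    """Build triangular filters whose centers are evenly spaced in mels."""
    top = hz_to_mel(sample_rate / 2.0)
    mels = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int((frame_len + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        weights = [0.0] * (frame_len // 2 + 1)
        for k in range(left, center):  # rising edge
            weights[k] = (k - left) / (center - left)
        for k in range(center, right):  # falling edge
            weights[k] = (right - k) / (right - center)
        bank.append(weights)
    return bank


def mfcc(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return the first num_coeffs MFCCs of one frame (assumed parameters)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # A small epsilon keeps the log finite if a filter captures no energy.
    log_energies = [
        math.log(sum(w * p for w, p in zip(weights, spectrum)) + 1e-10)
        for weights in bank
    ]
    # DCT-II decorrelates the log filterbank energies.
    return [
        sum(
            e * math.cos(math.pi * c * (j + 0.5) / num_filters)
            for j, e in enumerate(log_energies)
        )
        for c in range(num_coeffs)
    ]


# Hypothetical input: one 256-sample frame of a 440 Hz tone at 16 kHz.
sample_rate = 16000
frame = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(256)]
features = mfcc(frame, sample_rate)
print(len(features))
```

A feature vector like `features` is the kind of compact input a small classifier can consume, which is one plausible reason an MFCC front end keeps the network small.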
Importantly, this implementation has an inference time of ~10 microseconds on
a desktop CPU for 0.1 s of input sound. Since 0.1 s / 10 microseconds = 10,000,
it could in principle run in real time on systems up to 10,000x slower than our
desktop CPU.
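The headroom figure follows directly from the two durations quoted above. The short sketch below reproduces that arithmetic; `slowdown_headroom` is a hypothetical helper name, and the inputs are the README's approximate figures, not new measurements.

```python
def slowdown_headroom(inference_time_s, audio_window_s):
    """How many times slower a system could be and still keep up in real time.

    Real-time operation requires processing each audio window in no more
    time than the window itself lasts.
    """
    if inference_time_s <= 0 or audio_window_s <= 0:
        raise ValueError("durations must be positive")
    return audio_window_s / inference_time_s


# Figures quoted in the README: ~10 microseconds per 0.1 s of audio.
headroom = slowdown_headroom(inference_time_s=10e-6, audio_window_s=0.1)
real_time_factor = 1.0 / headroom  # fraction of real time spent on inference

print(f"headroom: about {headroom:,.0f}x")
print(f"real-time factor: about {real_time_factor:.0e}")
```

Note that this counts only the quoted inference time; any cost for audio capture or feature extraction on a given system would reduce the headroom.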
A more comprehensive description of the architecture and its performance is
available in the [project report](report/DeepHark.pdf).
This project was originally hosted at
[deephark/DeepHark](https://github.com/deephark/DeepHark).