https://github.com/ragibson/mfcc-speech-recognition

Real-time speech recognition via "Mel-Frequency Cepstral Coefficients" neural networks.
machine-learning mfcc neural-network pytorch real-time speech-recognition

Host: GitHub
URL: https://github.com/ragibson/mfcc-speech-recognition
Owner: ragibson
License: MIT
Created: 2019-05-22T22:36:09.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2019-05-23T21:48:30.000Z (about 7 years ago)
Last Synced: 2025-07-05T07:07:06.321Z (11 months ago)
Topics: machine-learning, mfcc, neural-network, pytorch, real-time, speech-recognition
Language: Jupyter Notebook
Homepage:
Size: 1.05 MB
Stars: 7
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# MFCC-speech-recognition

This repository contains an easy-to-train machine learning architecture that
can recognize speech commands in real time on low-end commodity hardware.

Specifically, the architecture uses "Mel-frequency cepstral coefficients" as
input features to a small neural network, achieving "near state-of-the-art"
classification accuracy.
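
For readers unfamiliar with the feature type, here is a minimal, dependency-free
sketch of how MFCCs can be computed for a single audio frame. It is not the
repository's implementation: the 16 kHz sample rate, 256-sample frame, 26 mel
filters, 13 coefficients, and all function names are assumptions chosen for
illustration, and the naive DFT is used only to keep the example self-contained.

```python
import cmath
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT; a real system would use an FFT.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge of the triangle
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank

def mfcc(frame, sample_rate=16000, num_filters=26, num_coeffs=13):
    """Power spectrum -> mel filterbank -> log -> DCT-II."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Small constant avoids log(0) for filters with no energy.
    log_energies = [math.log(sum(w * p for w, p in zip(filt, spectrum)) + 1e-10)
                    for filt in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]

# Synthetic 440 Hz tone, 256 samples at an assumed 16 kHz sample rate.
frame = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(256)]
features = mfcc(frame)
print(len(features))  # prints 13
```

A feature vector like `features` is the kind of compact input a small neural
network can classify cheaply, which is what keeps inference fast.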

Importantly, this implementation has an inference time of ~10 microseconds on
a desktop CPU for 0.1 s of input sound. In other words, it could run in
real time on systems up to 10,000x slower than our desktop CPU.
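
The 10,000x figure follows directly from the two numbers quoted above. A quick
check, using a hypothetical helper name and integer microseconds to avoid
floating-point rounding:

```python
def realtime_headroom(audio_us, inference_us):
    """How many times longer the audio lasts than the model takes to process it."""
    return audio_us / inference_us

# Figures quoted above: 0.1 s of audio (100,000 us) processed in ~10 us.
headroom = realtime_headroom(audio_us=100_000, inference_us=10)
print(f"{headroom:,.0f}x")  # prints 10,000x
```

Any value above 1 means the model keeps up with incoming audio. Note that this
covers inference only, as stated in the text, so time spent on feature
extraction would reduce the margin.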

A more comprehensive description of the architecture and its performance can
be read [here](report/DeepHark.pdf).

This project was originally hosted
[here](https://github.com/deephark/DeepHark).
