Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/datascience-py/speechrecognition
This repo creates STT (ASR) models: you can use pre-trained models or train your own model.
asr machine-learning matplotlib microphone models neural-network numpy python3 pytorch sklearn speech-recognition speech-to-text stt tensorflow-lite tensorflow2 web
Last synced: 29 days ago
- Host: GitHub
- URL: https://github.com/datascience-py/speechrecognition
- Owner: DataScience-py
- License: mit
- Created: 2024-10-24T09:37:41.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-12-01T09:35:45.000Z (about 2 months ago)
- Last Synced: 2024-12-19T02:23:58.894Z (29 days ago)
- Topics: asr, machine-learning, matplotlib, microphone, models, neural-network, numpy, python3, pytorch, sklearn, speech-recognition, speech-to-text, stt, tensorflow-lite, tensorflow2, web
- Language: Python
- Homepage:
- Size: 42 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Speech Recognition System
You can download the model from the [Hugging Face Model Hub](https://huggingface.co/DataScience-py/SpeechRecognitionCNNGRUModel).
The first model, saved with TorchScript, is available as [model_scripted.pt](https://huggingface.co/DataScience-py/SpeechRecognitionCNNGRUModel/blob/main/model_scripted.pt).
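
Until the usage section is filled in, the following is a minimal sketch of how the TorchScript file could be fetched and loaded. It assumes the `huggingface_hub` and `torch` packages are installed; the helper name `load_scripted_model` is hypothetical and not part of this repo. The input format the model expects is not documented here, so the sketch stops at loading.

```python
# Repository and file name are taken from the links above.
REPO_ID = "DataScience-py/SpeechRecognitionCNNGRUModel"
FILENAME = "model_scripted.pt"


def load_scripted_model(device: str = "cpu"):
    """Download the TorchScript file and load it (hypothetical helper)."""
    # Imported here so that defining the helper does not require the packages.
    from huggingface_hub import hf_hub_download
    import torch

    # Downloads the file into the local Hugging Face cache and returns its path.
    path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME)

    # A TorchScript archive is loaded without the original Python class.
    model = torch.jit.load(path, map_location=device)
    model.eval()  # switch to inference mode
    return model
```

Calling `load_scripted_model()` needs network access on first use; later calls reuse the cached file.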
## Description
This project is focused on speech recognition using modern machine learning and natural language processing techniques. The primary goal is to provide an accurate and efficient system that converts speech data into text.
It is useful for applications requiring voice input, such as voice assistants, automated response systems, and other related use cases.
## Technologies
- Python 3.x
- PyTorch

## Features
- Speech recognition from various sources (microphone, audio files) (coming soon)
- Multi-language support (coming soon)
- Noise reduction for improved recognition quality (coming soon)
- Train your own model (coming soon)

## Usage
### Pre-trained model
Coming soon.
### Model
Coming soon.
## Requirements
- Python 3.8+
- Dependencies listed in `requirements.txt`

## Completed Model
|**Model**      |**Complete**      |**CTC loss**|**CER**  |**WER** |**Additional training needed**|**Dataset**  |
|---------------|------------------|------------|---------|--------|------------------------------|-------------|
|**RNN model**  |:heavy_check_mark:|0.7428      |0.235044 |0.7466  |Yes                           |Common Voice |
|**Transformer**| ✗                |None        |None     |None    |None                          |None         |
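
For readers unfamiliar with the CER and WER columns: both are usually defined as the edit distance between the reference and the recognized text, divided by the reference length (in characters for CER, in words for WER). The sketch below illustrates that standard definition in plain Python. It is not the evaluation code used to produce the numbers in the table, and the function names are chosen for illustration only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    ref, hyp = "the cat sat", "the cat sit"
    print(f"WER = {wer(ref, hyp):.4f}")  # one wrong word out of three
    print(f"CER = {cer(ref, hyp):.4f}")  # one wrong character out of eleven
```

Because insertions count as errors, both rates can exceed 1.0 when the hypothesis is much longer than the reference.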