https://github.com/noxs1d/speech-to-text
🤖 ML project that records audio and converts it to text
ai fastapi ml torch whisper
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/noxs1d/speech-to-text
- Owner: noxs1d
- Created: 2024-11-30T21:04:21.000Z (over 1 year ago)
- Default Branch: Release1.0
- Last Pushed: 2025-02-05T18:56:18.000Z (over 1 year ago)
- Last Synced: 2025-04-01T04:38:50.494Z (about 1 year ago)
- Topics: ai, fastapi, ml, torch, whisper
- Language: Jupyter Notebook
- Homepage:
- Size: 5.91 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Speech-to-Text System
___
This project is a Speech-to-Text application built with Python. It lets users record audio or upload an existing audio file, play the audio back, and transcribe it to text.

## Features
1. Record Audio: Users can record audio directly through the interface.
2. Upload Audio File: Users can upload an existing audio file for processing.
3. Play Uploaded Audio: The system allows users to listen to uploaded audio files.
4. Convert Audio to Text: Audio files are transcribed into text using OpenAI's Whisper speech recognition model.
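
The upload-and-transcribe flow described above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: the `/transcribe` route, the helper names `save_upload`, `load_model`, `transcribe_file`, and `create_app`, and the `"base"` model size are all assumptions made for illustration. It presumes the `openai-whisper`, `fastapi`, and `python-multipart` packages are installed and that `ffmpeg` is available on the system.

```python
import tempfile
from functools import lru_cache
from pathlib import Path


def save_upload(data: bytes, suffix: str = ".wav") -> Path:
    """Write uploaded bytes to a temporary file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
        tmp.write(data)
    return Path(tmp.name)


@lru_cache(maxsize=1)
def load_model(model_name: str = "base"):
    """Load a Whisper model once and reuse it across requests."""
    import whisper  # imported lazily; requires the openai-whisper package

    return whisper.load_model(model_name)


def transcribe_file(path: Path, model_name: str = "base") -> str:
    """Transcribe an audio file and return the recognized text."""
    result = load_model(model_name).transcribe(str(path))
    return result["text"].strip()


def create_app():
    """Build a FastAPI app with a single (hypothetical) /transcribe route."""
    from fastapi import FastAPI, File, UploadFile  # needs python-multipart

    app = FastAPI()

    @app.post("/transcribe")
    async def transcribe(file: UploadFile = File(...)):
        # Keep the original extension so the decoder can detect the format.
        suffix = Path(file.filename or "audio.wav").suffix or ".wav"
        path = save_upload(await file.read(), suffix=suffix)
        try:
            return {"text": transcribe_file(path)}
        finally:
            path.unlink(missing_ok=True)  # remove the temporary file

    return app
```

If this were saved as a hypothetical `app.py`, it could be served with `uvicorn app:create_app --factory`. Caching the model in `load_model` avoids reloading the weights on every request, which is usually the slowest step.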
## Technologies Used
This project leverages the following frameworks and libraries:
- FastAPI: For building a fast, modern, and asynchronous web API.
- Whisper: OpenAI's speech recognition model for transcription.
- Torch (PyTorch): For loading and running the Whisper model efficiently.
- Wave: Python's standard-library `wave` module, for reading and writing WAV audio files.
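
As an illustration of the kind of WAV handling the `wave` module provides, the sketch below writes a short test tone and reads its properties back. The function names `write_tone` and `wav_info`, the file name, and the 16 kHz mono 16-bit format are assumptions chosen for the example, not details taken from this project.

```python
import math
import struct
import wave


def write_tone(path, seconds=1.0, freq=440.0, rate=16000):
    """Write a mono, 16-bit PCM sine tone to a WAV file."""
    n_frames = int(seconds * rate)
    frames = b"".join(
        # Scale to 30% of full range and pack as little-endian int16.
        struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / rate)))
        for i in range(n_frames)
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)      # mono
        wf.setsampwidth(2)      # 2 bytes per sample = 16-bit
        wf.setframerate(rate)
        wf.writeframes(frames)


def wav_info(path):
    """Return basic properties of a WAV file."""
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_width": wf.getsampwidth(),
            "rate": rate,
            "seconds": wf.getnframes() / rate,
        }


if __name__ == "__main__":
    write_tone("tone.wav", seconds=0.5)
    print(wav_info("tone.wav"))
```

Note that `wave` only reads and writes WAV data; actually playing the audio through speakers requires a separate library or the browser.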