https://github.com/arkapravo-ghosh/speech-to-text
Speech to Text Transcription using OpenAI Whisper v3 and FastAPI
ai fastapi huggingface machine-learning openai python3 speech-to-text transformers whisper
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/arkapravo-ghosh/speech-to-text
- Owner: Arkapravo-Ghosh
- License: agpl-3.0
- Created: 2024-01-15T21:27:59.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-07T06:45:13.000Z (12 months ago)
- Last Synced: 2024-11-03T16:25:48.780Z (3 months ago)
- Topics: ai, fastapi, huggingface, machine-learning, openai, python3, speech-to-text, transformers, whisper
- Language: Python
- Homepage:
- Size: 398 KB
- Stars: 0
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Speech-To-Text API
## Description
This is a simple API, built with Python and FastAPI, that converts speech to text using OpenAI's Whisper v3 model through Hugging Face Transformers.
## Installation
### Clone the repository and navigate to the directory
```bash
git clone https://github.com/Arkapravo-Ghosh/speech-to-text.git
```

```bash
cd speech-to-text
```

### Create a virtual environment
```bash
python -m venv .venv
```

### Activate the virtual environment
Windows
```pwsh
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

```pwsh
.\.venv\Scripts\activate.ps1
```

GNU/Linux or macOS
```bash
source .venv/bin/activate
```

### Upgrade pip and install the dependencies
```bash
python -m pip install -U pip
```

```bash
pip install -r requirements.txt
```

### Run the server
```bash
python main.py
```

## Usage
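
The server may not accept requests the instant `python main.py` is launched, for example while the model is still loading (an assumption; the README does not state startup times). A minimal sketch of a readiness check, using only the Python standard library, is shown below. The helper name `wait_for_port` and its defaults are hypothetical; the port `5000` is taken from the sample request in this README.

```python
import socket
import time


def wait_for_port(host="localhost", port=5000, timeout=60.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: it only checks that something is listening on the
    port, not that the /transcribe endpoint itself is ready.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # brief pause before the next attempt
    return False
```

Calling `wait_for_port()` before sending the first request gives a simple yes/no answer on whether the port became reachable within the timeout.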
### Test with the given sample audio file
```bash
curl -X POST -F "file=@./sample.webm" "http://localhost:5000/transcribe"
```

- `/transcribe` (POST) - Transcribes the audio file sent in the request body and returns the transcript as a JSON response.
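
The same request can be sent from Python. The sketch below mirrors the `curl` command using only the standard library: it encodes the file as `multipart/form-data` under the field name `file` and posts it to `/transcribe`. The helper names `build_multipart` and `transcribe` are hypothetical, and the layout of the returned JSON is not documented here, so the decoded response is returned as-is.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def build_multipart(field_name, filename, payload):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header).
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path, url="http://localhost:5000/transcribe"):
    """POST the audio file at `path` and return the decoded JSON response."""
    audio = Path(path)
    # "file" matches the form field used in the curl example
    body, content_type = build_multipart("file", audio.name, audio.read_bytes())
    req = request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `transcribe("./sample.webm")` should return whatever JSON the endpoint produces for the sample file.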