https://github.com/arkapravo-ghosh/speech-to-text

Speech to Text Transcription using OpenAI Whisper v3 and FastAPI

ai fastapi huggingface machine-learning openai python3 speech-to-text transformers whisper

Host: GitHub
URL: https://github.com/arkapravo-ghosh/speech-to-text
Owner: Arkapravo-Ghosh
License: agpl-3.0
Created: 2024-01-15T21:27:59.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-02-07T06:45:13.000Z (12 months ago)
Last Synced: 2024-11-03T16:25:48.780Z (3 months ago)
Topics: ai, fastapi, huggingface, machine-learning, openai, python3, speech-to-text, transformers, whisper
Language: Python
Homepage:
Size: 398 KB
Stars: 0
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Speech-To-Text API

## Description

A simple API, built with Python and FastAPI, that converts speech to text using OpenAI's Whisper v3 model through Hugging Face Transformers.

## Installation

### Clone the repository and navigate to the directory

```bash
git clone https://github.com/Arkapravo-Ghosh/speech-to-text.git
```

```bash
cd speech-to-text
```

### Create a virtual environment

```bash
python -m venv .venv
```

### Activate the virtual environment

Windows

```pwsh
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

```pwsh
.\.venv\Scripts\activate.ps1
```

GNU/Linux or macOS

```bash
source .venv/bin/activate
```
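Before installing dependencies, it can help to confirm that the virtual environment is actually active. The check below is a minimal sketch that relies on the standard `sys.prefix` / `sys.base_prefix` comparison. The helper name `in_virtualenv` is hypothetical and not part of this repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


# Report which interpreter is in use and whether a venv is active.
print(f"interpreter: {sys.executable}")
print(f"virtual environment active: {in_virtualenv()}")
```

If this prints `False`, the `pip install` step would install into the base Python installation rather than `.venv`.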

### Upgrade pip and install the dependencies

```bash
python -m pip install -U pip
```

```bash
pip install -r requirements.txt
```

### Run the server

```bash
python main.py
```
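To check from a script that the server has started accepting connections, something like the following could be used. It is only a sketch: port 5000 is taken from the `curl` example under Usage, while the helper name `wait_for_port`, the 60-second default timeout, and the polling interval are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 5000, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(0.5)
    return False
```

Calling `wait_for_port()` after launching `python main.py` in another terminal only tells you that the port is open. It does not confirm that the model has finished loading.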

## Usage

### Test with the given sample audio file

```bash
curl -X POST -F "file=@./sample.webm" "http://localhost:5000/transcribe"
```

- `/transcribe` (POST) - Transcribes the audio file uploaded in the `file` form field of the request body and returns the transcript as a JSON response.
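The same request can be made from Python. The sketch below mirrors the `curl -F "file=@./sample.webm"` call using only the standard library. The function names `build_multipart` and `transcribe` are hypothetical. Because the README does not document the shape of the JSON response, the decoded JSON is returned unchanged rather than assuming any particular key.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def build_multipart(field_name: str, filename: str, payload: bytes, boundary: str) -> bytes:
    """Encode a single file as a multipart/form-data body, as `curl -F` does."""
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail


def transcribe(path: str, url: str = "http://localhost:5000/transcribe"):
    """POST the audio file to /transcribe and return the decoded JSON response."""
    audio = Path(path)
    boundary = uuid.uuid4().hex  # random boundary, unlikely to occur in the audio bytes
    body = build_multipart("file", audio.name, audio.read_bytes(), boundary)
    req = request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `transcribe("./sample.webm")` sends the bundled sample file and returns whatever JSON the endpoint produces.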