https://github.com/eroydev/speech2text-api
- Host: GitHub
- URL: https://github.com/eroydev/speech2text-api
- Owner: ERoydev
- Created: 2025-02-13T19:58:06.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-13T20:06:48.000Z (over 1 year ago)
- Last Synced: 2025-02-13T20:37:36.225Z (over 1 year ago)
- Language: Python
- Size: 0 Bytes
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Speech2Text-API
WhisperTranscriber is a lightweight microservice that leverages [faster_whisper](https://github.com/guillaumekln/faster-whisper) to convert speech to text. Built with FastAPI and pydub, it provides a simple REST API endpoint for uploading audio files and receiving transcriptions.
## Features
- **Speech-to-Text Transcription:** Uses faster_whisper to transcribe `.wav` audio files.
- **REST API:** A POST endpoint to upload audio files and get transcriptions.
- **Temporary File Handling:** Uses Python's `tempfile` to manage audio files during processing.
- **Asynchronous Processing:** Built on FastAPI to handle concurrent requests.
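The temporary-file handling listed above can be illustrated with a minimal sketch. This is not the service's actual code: the helper names `save_upload_to_temp`, `wav_duration_seconds`, and `process_upload` are hypothetical, and the standard-library `wave` module stands in for pydub so the example runs without extra dependencies.

```python
import os
import tempfile
import wave


def save_upload_to_temp(data: bytes, suffix: str = ".wav") -> str:
    """Write uploaded bytes to a temporary file and return its path (hypothetical helper)."""
    # delete=False keeps the file after the `with` block so it can be reopened by path.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
        return tmp.name


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds (stdlib stand-in for pydub)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def process_upload(data: bytes) -> float:
    """Store the upload, measure its duration, and always remove the temporary file."""
    path = save_upload_to_temp(data)
    try:
        return wav_duration_seconds(path)
    finally:
        os.remove(path)
```

The `try`/`finally` ensures the temporary file is deleted even if processing raises, which matters for a long-running service that would otherwise accumulate files on disk.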
## Requirements
- Python 3.8+
- [FastAPI](https://fastapi.tiangolo.com/)
- [Uvicorn](https://www.uvicorn.org/)
- [faster_whisper](https://github.com/guillaumekln/faster-whisper)
- [pydub](https://github.com/jiaaro/pydub)
- [ffmpeg](https://ffmpeg.org/) (required by pydub)
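Because pydub relies on an external ffmpeg binary, it can help to confirm that ffmpeg is on the `PATH` before starting the service. The following is a small sketch using only the standard library; the `find_ffmpeg` helper is a hypothetical name and is not part of this project.

```python
import shutil
from typing import Optional


def find_ffmpeg(binary: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found; install it before starting the service")
    else:
        print(f"ffmpeg found at {path}")
```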
## Installation
1. **Clone the repository:**
   ```bash
   git clone https://github.com/eroydev/speech2text-api.git
   cd speech2text-api
   ```
2. **Examples:**
   - Send a POST request to `/audio_transcription` using any HTTP client.
   - Attach your `.wav` file (for example `audio.wav`) as `multipart/form-data` under the `file` field.
   ```py
   import requests

   # Request to the microservice to get the transcription.
   # TODO: implement asynchronous behaviour.
   with requests.Session() as session, open('audio.wav', 'rb') as audio_file:
       files = {'file': ('audio.wav', audio_file, 'audio/wav')}  # multipart/form-data payload
       response = session.post('http://127.0.0.1:9000/audio_transcription', files=files)
   ```
   Replace `http://127.0.0.1:9000` with your own server URL; the request path stays `/audio_transcription`.
The microservice responds with JSON such as:
```json
{
"transcription_text": " Hello, I want to test the functionality of my backend. I don't know why I am getting these errors.",
"transcribed_audio": {
"text": " Hello, I want to test the functionality of my backend. I don't know why I am getting these errors.",
"segments": [],
"language": "en"
},
"audio_duration": 7.32
}
```
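A client might unpack this response as sketched below. The `Transcription` dataclass and `parse_response` helper are hypothetical names introduced for illustration; the field names are taken from the sample response above, and the shape of the items in `segments` is not specified there, so they are left untyped.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Transcription:
    text: str
    language: str
    segments: List[Any]  # segment structure is not documented in the sample
    duration: float


def parse_response(payload: str) -> Transcription:
    """Parse the JSON body returned by /audio_transcription (fields as in the sample)."""
    data: Dict[str, Any] = json.loads(payload)
    detail = data["transcribed_audio"]
    return Transcription(
        text=data["transcription_text"].strip(),  # the sample text has a leading space
        language=detail["language"],
        segments=detail["segments"],
        duration=data["audio_duration"],
    )


sample = (
    '{"transcription_text": " Hello", '
    '"transcribed_audio": {"text": " Hello", "segments": [], "language": "en"}, '
    '"audio_duration": 7.32}'
)
result = parse_response(sample)
print(result.text, result.language, result.duration)
```

With the `requests` example, the same helper could be called as `parse_response(response.text)`.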