https://github.com/ozturkvedat/speechprocessing_mobile
FastAPI service for speech recognition, transcription and TTS using Faster-Whisper and gTTS. A Dockerfile and a mobile app implementation are included.
- Host: GitHub
- URL: https://github.com/ozturkvedat/speechprocessing_mobile
- Owner: OzturkVedat
- Created: 2024-09-26T09:41:49.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-04T06:08:12.000Z (over 1 year ago)
- Last Synced: 2025-03-04T07:22:24.561Z (over 1 year ago)
- Topics: docker, expo, fastapi, gtts, react-native, speech-recognition, text-to-speech, whisper-ai
- Language: JavaScript
- Homepage:
- Size: 740 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Speech Processing Mobile App
This project provides audio transcription, speech recognition, translation and text-to-speech (TTS) through a Dockerized API built with FastAPI, the Faster-Whisper model and gTTS. It also includes a mobile app built with React Native as an example client.
## Features
- Transcription, Speech Recognition and Translation using Faster-Whisper
- Text-to-Speech with gTTS and langid
- Dockerized API
- React Native Expo App
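As a rough illustration of how gTTS and langid can be combined for text-to-speech, the sketch below detects the language of the input text and passes it to gTTS. It is not taken from this repository: the function names, the `DEFAULT_LANG` fallback and the output path are assumptions made for the example.

```python
from typing import Callable, Set, Tuple

# Hypothetical fallback language, used when the detected one is unsupported.
DEFAULT_LANG = "en"


def pick_language(text: str,
                  classify: Callable[[str], Tuple[str, float]],
                  supported: Set[str]) -> str:
    """Return the detected language code if supported, else DEFAULT_LANG."""
    lang, _score = classify(text)
    return lang if lang in supported else DEFAULT_LANG


def synthesize(text: str, out_path: str = "speech.mp3") -> str:
    """Detect the language of `text` with langid and save gTTS audio to `out_path`."""
    # Imported lazily so pick_language() works without these packages installed.
    import langid
    from gtts import gTTS
    from gtts.lang import tts_langs

    lang = pick_language(text, langid.classify, set(tts_langs()))
    gTTS(text=text, lang=lang).save(out_path)  # gTTS needs internet access
    return lang
```

Calling `synthesize("Bonjour tout le monde")` would write an MP3 file and return the language code that was used.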
## Requirements
- Docker Engine
- CUDA-capable hardware (NVIDIA GPU)
- npm and Expo (for the mobile app)
## Setup
### Backend (Docker)
To run the FastAPI service in a Docker container:
1. Navigate to the docker/ folder and build the Docker image (this can take some time):
```bash
cd docker
docker build -t speech-proccesing-api .
```
2. Run the Docker container:
```bash
docker run --gpus all -p 8000:8000 speech-proccesing-api
```
Wait for the microservice to initialize; this can take a few minutes. You can follow the container's logs in Docker Desktop to monitor progress.
3. Access the API:
Open your browser and go to http://localhost:8000/ to view the API documentation (Swagger UI).
### Mobile App (Expo)
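Because initialization can take a few minutes, it may be handy to poll the API until it responds instead of refreshing the browser. The following is a minimal sketch using only the Python standard library; the function name and the default timeout and interval are assumptions, not part of this project.

```python
import time
import urllib.error
import urllib.request


def wait_for_api(url: str, timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused or reset: the server is not ready yet.
            pass
        time.sleep(interval)
    return False
```

For example, `wait_for_api("http://localhost:8000/")` returns `True` once the documentation page is reachable, or `False` if the container has not come up within five minutes.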
1. Navigate to the mobile-app/ folder and install the dependencies:
```bash
cd mobile-app
npx expo install
```
2. Connect the Dockerized API:
Open the app.json file and set API_URL to the IP address of the machine running the container on your local network. (For example, in development the API is typically reachable at the host's Wi-Fi IP address, so the mobile device should be connected to the same Wi-Fi network.)
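If you are unsure which IP address to use, the sketch below shows one common way to find the host's address on the local network from Python. The helper names are hypothetical, and the URL shape (`http://<ip>:8000`) is an assumption based on the port published by the `docker run` command above; check app.json for the format it actually expects.

```python
import socket


def detect_lan_ip(fallback: str = "127.0.0.1") -> str:
    """Return the local IP address used for outbound traffic, or `fallback`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route,
        # which reveals the address of the active network interface.
        sock.connect(("192.0.2.1", 80))  # documentation-only address, never contacted
        return sock.getsockname()[0]
    except OSError:
        return fallback
    finally:
        sock.close()


def build_api_url(ip: str, port: int = 8000) -> str:
    """Format a base URL for the API (assumed shape, see the note above)."""
    return f"http://{ip}:{port}"
```

Running `print(build_api_url(detect_lan_ip()))` on the host prints a candidate value for API_URL. On machines with several network interfaces (for example an active VPN), the detected address may not be the Wi-Fi one, so verify it against your network settings.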
3. Run the app:
```bash
npx expo start
```
You can open the app with the Expo Go app on Android, or with the Camera app on iOS. The home screen is shown below:
## Usage
Once both the backend and mobile app are running, you can interact with the mobile app to:
- Transcribe audio files with timestamps
- Record speech for transcription and translation
- Convert text into speech
Response time varies with the host system, particularly the GPU.
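To give an idea of what timestamped transcription and translation look like with Faster-Whisper, here is a minimal sketch. It is not this project's API code: the function names, the `small` model size and the `float16` compute type are assumptions chosen for illustration.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS.mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def transcribe(path: str, translate: bool = False, model_size: str = "small"):
    """Yield (start, end, text) for each segment, with formatted timestamps."""
    # Imported lazily: requires the faster-whisper package and a CUDA GPU.
    from faster_whisper import WhisperModel

    model = WhisperModel(model_size, device="cuda", compute_type="float16")
    # Whisper's "translate" task produces English text from other languages.
    task = "translate" if translate else "transcribe"
    segments, _info = model.transcribe(path, task=task)
    for segment in segments:
        yield (format_timestamp(segment.start),
               format_timestamp(segment.end),
               segment.text.strip())
```

Iterating over `transcribe("recording.mp3")` yields one tuple per segment; passing `translate=True` requests an English translation instead of a transcript in the source language.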

## Contributing
Feel free to open issues or submit pull requests for improvements.