Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/savka777/photo-translator
Real-Time Photo Describer and Translator with Speech Output
https://github.com/savka777/photo-translator
Last synced: about 1 month ago
JSON representation
Real-Time Photo Describer and Translator with Speech Output
- Host: GitHub
- URL: https://github.com/savka777/photo-translator
- Owner: savka777
- Created: 2024-02-28T14:35:59.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-02-28T14:44:56.000Z (10 months ago)
- Last Synced: 2024-02-28T15:58:00.259Z (10 months ago)
- Language: Python
- Size: 3.91 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Photo Translator
Real-Time Photo Describer and Translator with Speech OutputThis project is designed to capture an image from a camera, translate the main object in the image into a specified language using GPT-4V, and then generate an audio output of the translated text using ElevenLabs' API.
## Features
- **Image Capture**: Uses OpenCV to capture an image from the webcam.
- **Image Processing**: Converts the captured image to a base64 encoded string for processing.
- **Translation**: Sends the encoded image to OpenAI's GPT-4V, requesting a description of the central object in the specified language.
- **Audio Output**: Utilizes ElevenLabs API to convert the translated text into spoken audio.You need to have API keys for OpenAI and ElevenLabs
You need to install the required packages (pip install opencv-python-headless pillow requests elevenlabs)