Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/savka777/photo-translator

Real-Time Photo Describer and Translator with Speech Output
https://github.com/savka777/photo-translator

Last synced: about 1 month ago
JSON representation

Real-Time Photo Describer and Translator with Speech Output

Host: GitHub
URL: https://github.com/savka777/photo-translator
Owner: savka777
Created: 2024-02-28T14:35:59.000Z (10 months ago)
Default Branch: main
Last Pushed: 2024-02-28T14:44:56.000Z (10 months ago)
Last Synced: 2024-02-28T15:58:00.259Z (10 months ago)
Language: Python
Size: 3.91 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Photo Translator
Real-Time Photo Describer and Translator with Speech Output

This project is designed to capture an image from a camera, translate the main object in the image into a specified language using GPT-4V, and then generate an audio output of the translated text using ElevenLabs' API.

## Features

- **Image Capture**: Uses OpenCV to capture an image from the webcam.
- **Image Processing**: Converts the captured image to a base64 encoded string for processing.
- **Translation**: Sends the encoded image to OpenAI's GPT-4V, requesting a description of the central object in the specified language.
- **Audio Output**: Utilizes ElevenLabs API to convert the translated text into spoken audio.

You need to have API keys for OpenAI and ElevenLabs
You need to install the required packages (pip install opencv-python-headless pillow requests elevenlabs)