Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ashot72/answering-questions-about-images
You can upload images, ask questions about images using voice prompts, then listen to the responses in voice
- Host: GitHub
- URL: https://github.com/ashot72/answering-questions-about-images
- Owner: Ashot72
- Created: 2023-05-31T12:59:56.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-05-31T13:26:41.000Z (over 1 year ago)
- Last Synced: 2024-11-08T03:23:41.354Z (about 2 months ago)
- Topics: answering-questions, blip-2-ai-model, gtts, large-language-models, llm, replicate, speech-to-text, text-to-speech, whisper
- Language: JavaScript
- Homepage:
- Size: 1.31 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Answering Questions About Images
This is a Node.js app where you can upload images, ask questions about images using voice prompts, then listen to the responses in voice.
Voice to Text: I turn audio into text using [Whisper](https://openai.com/research/whisper), an OpenAI speech recognition model that transcribes audio with up to 99% accuracy. Whisper is a speech transcription system from the creators of ChatGPT. Anyone can use it, and it is completely free. The system is trained on 680,000 hours of speech data collected from the web and recognizes 99 languages.

Generating Answers: I use the [blip-2](https://replicate.com/andreasjansson/blip-2) model, which answers questions about images.
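
As a rough sketch of the question-answering step, the snippet below sends an image URL and a transcribed question to blip-2 through a Replicate-style client. It assumes a client object with a `run(modelRef, { input })` method, as exposed by the `replicate` npm package. The helper names `buildBlip2Input` and `askAboutImage` and the `image`/`question` input fields are illustrative assumptions, not taken from this repository, so check the model page for the exact input schema.

```javascript
// Build the input payload for the image question-answering model.
// Field names are an assumption; verify them against the model's schema.
function buildBlip2Input(imageUrl, question) {
  if (!imageUrl) throw new Error("imageUrl is required");
  if (!question || !question.trim()) throw new Error("question is required");
  return { image: imageUrl, question: question.trim() };
}

// `client` is any object exposing run(modelRef, { input }), such as an
// instance of the `replicate` npm client. It is passed in as an argument
// so the function can be exercised without network access.
async function askAboutImage(client, modelRef, imageUrl, question) {
  const input = buildBlip2Input(imageUrl, question);
  const output = await client.run(modelRef, { input });
  // Some models return an array of string chunks; normalise to one string.
  return Array.isArray(output) ? output.join("") : String(output);
}
```

With the real client, `client` would be created from the `replicate` package using your API token, and `modelRef` would be the `owner/model:version` string shown on the model page.
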

Text to Voice: I use [gTTS.js](https://www.npmjs.com/package/gtts), a JavaScript port of the Google Text-to-Speech library originally written in Python.
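
The text-to-voice step could look roughly like the sketch below, which wraps the callback-style `save(path, callback)` method of the `gtts` package in a Promise. The function name `textToSpeechFile`, the default language `"en"`, and the injectable `GttsClass` parameter are assumptions made for illustration. The app itself may stream the audio instead of writing a file.

```javascript
// Convert text to an MP3 file using the `gtts` npm package.
// `GttsClass` defaults to the real package and is only loaded when no
// replacement is supplied, which makes the function easy to test offline.
function textToSpeechFile(text, outPath, lang = "en", GttsClass = require("gtts")) {
  return new Promise((resolve, reject) => {
    if (!text || !text.trim()) {
      reject(new Error("text is required"));
      return;
    }
    const speech = new GttsClass(text, lang);
    // save() writes the synthesized speech to outPath, then calls back.
    speech.save(outPath, (err) => (err ? reject(err) : resolve(outPath)));
  });
}
```

A call such as `textToSpeechFile(answer, "answer.mp3")` resolves with the output path once the file has been written, so the server can then send that file back to the browser for playback.
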

To get started:
```
# Clone the repository
git clone https://github.com/Ashot72/Answering-Questions-About-Images
cd Answering-Questions-About-Images

# Add your key to .env file

# installs dependencies
npm install

# to run locally
npm start
```

Go to [Answering Questions About Images Video](https://youtu.be/6w_F1GARGDQ) page

Go to [Answering Questions About Images Description](https://ashot72.github.io/Answering-Questions-About-Images/docs.html) page