Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/uday160386/image-audio-hf-openai
Generate meaningful audio from an uploaded photo using Hugging Face + LangChain + OpenAI
huggingface-transformers langchain openai text-audio
- Host: GitHub
- URL: https://github.com/uday160386/image-audio-hf-openai
- Owner: uday160386
- Created: 2024-05-20T15:01:18.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-05-20T15:13:02.000Z (8 months ago)
- Last Synced: 2024-05-20T17:32:04.286Z (8 months ago)
- Topics: huggingface-transformers, langchain, openai, text-audio
- Language: Python
- Homepage:
- Size: 3.87 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Gen AI: Generate meaningful audio from an uploaded photo using Hugging Face + LangChain + OpenAI
### Pre-requisites:
Install the required libraries from the requirements.txt file:
```sh
pip install -r requirements.txt
```
## Design info:
- Used Hugging Face to consume ready-made AI models:
  - Image-to-text: "Salesforce/blip-image-captioning-base"
  - Text-to-audio: "kan-bayashi_ljspeech_vits"
- Used LangChain + ChatGPT to generate text.
- Published the image-to-audio app using Streamlit.
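
The three stages listed above (caption the image, generate text, synthesize audio) can be sketched as a chain of plain callables. This is a minimal illustration, not the repository's actual code. The function names, the prompt wording, the `gpt-3.5-turbo` model name, the `espnet/` prefix on the TTS model ID, and the use of the Hugging Face hosted Inference API endpoint are all assumptions.

```python
from typing import Callable

CAPTION_MODEL = "Salesforce/blip-image-captioning-base"
# Assumption: the README's TTS model lives on the Hub under the "espnet" organisation.
TTS_MODEL = "espnet/kan-bayashi_ljspeech_vits"


def make_captioner(model_id: str = CAPTION_MODEL) -> Callable[[str], str]:
    """Build an image-captioning function backed by a transformers pipeline."""
    from transformers import pipeline  # lazy import: heavy dependency

    image_to_text = pipeline("image-to-text", model=model_id)

    def caption(image_path: str) -> str:
        # The pipeline returns a list of dicts with a "generated_text" key.
        return image_to_text(image_path)[0]["generated_text"]

    return caption


def make_narrator(model_name: str = "gpt-3.5-turbo") -> Callable[[str], str]:
    """Build a text generator with LangChain (assumed prompt and model name)."""
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_openai import ChatOpenAI  # reads OPENAI_API_KEY from the environment

    prompt = ChatPromptTemplate.from_template(
        "Write a short story (under 50 words) based on this scene: {caption}"
    )
    chain = prompt | ChatOpenAI(model=model_name)

    def narrate(caption: str) -> str:
        return chain.invoke({"caption": caption}).content

    return narrate


def make_synthesizer(api_token: str, model_id: str = TTS_MODEL) -> Callable[[str], bytes]:
    """Build a text-to-speech function (assumed: hosted Inference API over HTTP)."""
    import json
    import urllib.request

    url = f"https://api-inference.huggingface.co/models/{model_id}"

    def synthesize(text: str) -> bytes:
        request = urllib.request.Request(
            url,
            data=json.dumps({"inputs": text}).encode("utf-8"),
            headers={
                "Authorization": f"Bearer {api_token}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(request) as response:
            return response.read()  # raw audio bytes

    return synthesize


def image_to_audio(
    image_path: str,
    caption: Callable[[str], str],
    narrate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run the three stages in order: image -> caption -> text -> audio bytes."""
    scene = caption(image_path)
    text = narrate(scene)
    return synthesize(text)
```

Passing the stages in as callables keeps the orchestration independent of any one model or provider, so each stage can be swapped or stubbed in isolation. A real run would look like `image_to_audio("photo.png", make_captioner(), make_narrator(), make_synthesizer(token))`, where `token` is a hypothetical Hugging Face API token.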
## Build and run:
`streamlit run app.py`
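
The contents of `app.py` are not shown in this README. As a rough idea of what a Streamlit front end for this kind of app could look like, here is a hedged sketch. The names `render`, `main`, and `generate_audio` are hypothetical, and the audio format is assumed to be FLAC to match the sample file in `./images`.

```python
from typing import Callable, Optional


def render(st, generate_audio: Callable[[bytes], bytes]) -> Optional[bytes]:
    """Draw the page: upload a photo, show it, then play the generated audio.

    `st` is the streamlit module (passed in so the function can be exercised
    with a stand-in object); `generate_audio` is a hypothetical callable that
    maps image bytes to audio bytes.
    """
    st.title("Image to Audio")
    uploaded = st.file_uploader("Upload a photo", type=["png", "jpg", "jpeg"])
    if uploaded is None:
        return None  # nothing uploaded yet

    st.image(uploaded, caption="Uploaded photo")
    audio_bytes = generate_audio(uploaded.getvalue())
    st.audio(audio_bytes, format="audio/flac")  # assumed output format
    return audio_bytes


def main(generate_audio: Callable[[bytes], bytes]) -> None:
    """Entry point: wire the real streamlit module into render()."""
    import streamlit as st  # lazy import so render() has no hard dependency

    render(st, generate_audio)
```

In a script started with `streamlit run app.py`, `main(...)` would be called at module level with whatever function implements the image-to-audio pipeline.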
## Image to Audio:
![screenshot](./images/output.png)
Listen to a sample of the generated audio [here](./images/gen_audio_from_photo.flac).