https://github.com/uday160386/image-audio-hf-openai

Generate meaningful audio from an uploaded photo using Hugging Face + LangChain + OpenAI
ai-ml huggingface-transformers langchain natural-language-processing openai python streamlit text-audio vuk-genai

Last synced: 4 months ago

Host: GitHub
URL: https://github.com/uday160386/image-audio-hf-openai
Owner: uday160386
Created: 2024-05-20T15:01:18.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-05-20T15:13:02.000Z (about 1 year ago)
Last Synced: 2025-01-13T14:28:03.359Z (6 months ago)
Topics: ai-ml, huggingface-transformers, langchain, natural-language-processing, openai, python, streamlit, text-audio, vuk-genai
Language: Python
Homepage:
Size: 3.87 MB
Stars: 0
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Gen AI: Generate meaningful audio from an uploaded photo using Hugging Face + LangChain + OpenAI

## Pre-requisites:
Install the required libraries listed in the requirements.txt file:
```sh
pip install -r requirements.txt
```
## Design info:
- Uses Hugging Face to consume ready-made AI models:
  - image-to-text with the model "Salesforce/blip-image-captioning-base"
  - text-to-audio with the model "kan-bayashi_ljspeech_vits"
- Uses LangChain + ChatGPT to generate the text
- Publishes the image-to-audio app using Streamlit
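
The design above can be read as a three-stage pipeline: caption the photo, turn the caption into text, then synthesize audio. The following is a minimal sketch of that flow, not the repository's actual code. The function names `caption_image` and `photo_to_audio` are hypothetical, and feeding the caption into the text-generation step is an assumption about how the stages connect. The story and speech stages are passed in as plain callables so that any LangChain/ChatGPT or text-to-audio client can be plugged in.

```python
from typing import Callable


def caption_image(image_path: str) -> str:
    """Image-to-text step using the BLIP captioning model from the design notes."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    captioner = pipeline(
        "image-to-text", model="Salesforce/blip-image-captioning-base"
    )
    # The pipeline returns a list of dicts with a "generated_text" key.
    return captioner(image_path)[0]["generated_text"]


def photo_to_audio(
    image_path: str,
    caption: Callable[[str], str],
    write_story: Callable[[str], str],
    speak: Callable[[str], bytes],
) -> bytes:
    """Chain the three stages (assumed order): photo -> caption -> text -> audio."""
    scenario = caption(image_path)   # e.g. caption_image
    story = write_story(scenario)    # e.g. a LangChain + ChatGPT call
    return speak(story)              # e.g. a text-to-audio model call
```

With the real models installed, `photo_to_audio("photo.jpg", caption_image, my_story_fn, my_tts_fn)` would return the audio bytes, where `my_story_fn` and `my_tts_fn` are placeholders for the LangChain and text-to-audio calls.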

## Build and run:
Start the app with `streamlit run app.py`.
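
For orientation, here is a hedged sketch of what a Streamlit entry point for this kind of app could look like. The contents of the real `app.py` are not shown in this README, so `save_upload`, `main`, and the `generate_audio` parameter are hypothetical names, and the FLAC audio format is an assumption based on the sample file linked below.

```python
import tempfile
from pathlib import Path
from typing import Callable


def save_upload(filename: str, data: bytes, directory: str) -> Path:
    """Write uploaded bytes to disk so a path-based model call can read them."""
    # Keep only the final path component so the file stays inside `directory`.
    target = Path(directory) / Path(filename).name
    target.write_bytes(data)
    return target


def main(generate_audio: Callable[[str], bytes]) -> None:
    """Render the upload form and play the audio produced for the photo."""
    # Imported lazily so the helper above can be used without Streamlit.
    import streamlit as st

    st.title("Image to audio")
    uploaded = st.file_uploader("Choose a photo", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        path = save_upload(
            uploaded.name, uploaded.getvalue(), tempfile.gettempdir()
        )
        st.image(str(path))
        # generate_audio is a placeholder for the image-to-audio pipeline.
        st.audio(generate_audio(str(path)), format="audio/flac")
```

Calling `main(...)` with a real pipeline function at the bottom of the script would let `streamlit run` render the page.
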
## Image to Audio:

![screenshot](./images/output.png)

Listen to a sample of the generated audio [here](./images/gen_audio_from_photo.flac)
