Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/maheshj01/image-captioning-using-llava-and-llama3
Image Caption Generator using llava and llama3 through the ollama library
- Host: GitHub
- URL: https://github.com/maheshj01/image-captioning-using-llava-and-llama3
- Owner: maheshj01
- Created: 2024-04-23T01:20:54.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2024-06-01T21:55:51.000Z (7 months ago)
- Last Synced: 2024-11-27T08:45:59.759Z (25 days ago)
- Topics: llama3, llava, ollama, vision
- Language: Python
- Homepage: https://ai-caption.streamlit.app/
- Size: 8.79 KB
- Stars: 2
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
README
### Image Caption Generator
This project uses LLaVA (Large Language-and-Vision Assistant), an end-to-end trained large multimodal model that connects a vision encoder and an LLM for general-purpose visual and language understanding.
LLaVA generates a description of the image, and that description is then fed to llama3 to generate the final caption.
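Conceptually, this is two sequential calls through the ollama Python library. The sketch below is a minimal illustration of that flow, not the repository's actual code: the prompts, the `caption_image` helper, and `example.jpg` are illustrative assumptions.

```python
# Minimal sketch of the two-stage pipeline, assuming the `ollama` Python
# package is installed and the local ollama server is running.
import ollama

def caption_image(image_path: str) -> str:
    # Stage 1: LLaVA produces a detailed description of the image.
    description = ollama.generate(
        model="llava",
        prompt="Describe this image in detail.",
        images=[image_path],
    )["response"]

    # Stage 2: llama3 condenses the description into a short caption.
    reply = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Write a short, engaging caption for an image "
                       f"described as: {description}",
        }],
    )
    return reply["message"]["content"]

if __name__ == "__main__":
    print(caption_image("example.jpg"))
```

Splitting the work this way lets the vision model focus on a faithful description while the text model handles tone and brevity.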
### Installation
1. Clone the repo
```sh
git clone https://github.com/maheshj01/image-captioning-using-llava-and-llama3.git
```
2. Create and activate a virtual environment
```sh
python3 -m venv cenv
source cenv/bin/activate
```
3. Install requirements
```sh
pip install -r requirements.txt
```
4. Download the LLMs using the following command
```sh
ollama pull llama3
ollama pull llava
```
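Optionally, confirm both models were downloaded (`ollama list` is part of the standard ollama CLI):
```sh
# Should show llama3 and llava among the locally available models.
ollama list
```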
5. Start the local ollama server
```sh
ollama serve
```
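To sanity-check that the server is up, you can hit Ollama's REST API, which listens on port 11434 by default (adjust if you have set `OLLAMA_HOST`):
```sh
# Should return a JSON list of locally available models.
curl http://localhost:11434/api/tags
```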
6. Run the backend server
```sh
uvicorn main:app --reload
```
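By default, uvicorn binds to http://127.0.0.1:8000. Assuming `main.py` is a FastAPI app (uvicorn's usual pairing), its auto-generated interactive docs should be reachable once the server starts; the exact routes depend on the repository's code:
```sh
# Hypothetical reachability check against uvicorn's default address.
curl http://127.0.0.1:8000/docs
```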
7. Run the Streamlit app
```sh
streamlit run app.py
```
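Streamlit prints a local URL when it starts (http://localhost:8501 by default, assuming the standard port); open it in a browser, upload an image, and the app should return a generated caption. The homepage listed above (https://ai-caption.streamlit.app/) points to a hosted instance of the same app.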