https://github.com/maheshj01/image-captioning-using-llava-and-llama3

lmage Caption Generator using llava and llama3 through the ollama library
https://github.com/maheshj01/image-captioning-using-llava-and-llama3

llama3 llava ollama vision

Last synced: 7 months ago
JSON representation

lmage Caption Generator using llava and llama3 through the ollama library

Host: GitHub
URL: https://github.com/maheshj01/image-captioning-using-llava-and-llama3
Owner: maheshj01
Created: 2024-04-23T01:20:54.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-06-01T21:55:51.000Z (about 1 year ago)
Last Synced: 2024-11-27T08:45:59.759Z (7 months ago)
Topics: llama3, llava, ollama, vision
Language: Python
Homepage: https://ai-caption.streamlit.app/
Size: 8.79 KB
Stars: 2
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: readme.md

Awesome Lists containing this project

README

### Image Caption Generator

This project uses LLaVA (Large Language-and-Vision Assistant) , an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding.

llava generates the description of the image and the description is the fed to llama3 to generate the caption of the image.

### Installation

1. Clone the repo

```sh
git clone
```

2. Activate a virtual env
```
python3 -m venv cenv
source cenv/bin/activate
```

3. Install requirements

```sh
pip install -r requirements.txt
```

4. Download the llms using the following command

```sh
ollama pull llama3
ollama pull llava
```

5. Start the local ollama server

```sh
ollama serve
```

6. Run the backend server

```sh
uvicorn main:app --reload
```

7. Run the code

```sh
streamlit run app.py
```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/maheshj01/image-captioning-using-llava-and-llama3

Awesome Lists containing this project

README