Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/maheshj01/image-captioning-using-llava-and-llama3

lmage Caption Generator using llava and llama3 through the ollama library
https://github.com/maheshj01/image-captioning-using-llava-and-llama3

llama3 llava ollama vision

Last synced: 24 days ago
JSON representation

lmage Caption Generator using llava and llama3 through the ollama library

Awesome Lists containing this project

README

        

### Image Caption Generator

This project uses LLaVA (Large Language-and-Vision Assistant) , an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding.

llava generates the description of the image and the description is the fed to llama3 to generate the caption of the image.

### Installation

1. Clone the repo

```sh
git clone
```

2. Activate a virtual env
```
python3 -m venv cenv
source cenv/bin/activate
```

3. Install requirements

```sh
pip install -r requirements.txt
```

4. Download the llms using the following command

```sh
ollama pull llama3
ollama pull llava
```

5. Start the local ollama server

```sh
ollama serve
```

6. Run the backend server

```sh
uvicorn main:app --reload
```

7. Run the code

```sh
streamlit run app.py
```