https://github.com/aditya-ranjan1234/imagecaption-api

An API for generating captions, music genres, and privacy analysis based on images using FastAPI and LlamaIndex.

api api-rest generative-ai llama-index


Host: GitHub
URL: https://github.com/aditya-ranjan1234/imagecaption-api
Owner: Aditya-Ranjan1234
License: gpl-3.0
Created: 2024-11-21T12:22:03.000Z (11 months ago)
Default Branch: main
Last Pushed: 2024-11-21T19:46:48.000Z (11 months ago)
Last Synced: 2025-01-22T20:14:27.425Z (9 months ago)
Topics: api, api-rest, generative-ai, llama-index
Language: Python
Homepage:
Size: 47.9 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# ImageCaption-API

## Image Caption and Music Genre Generation API

This API analyzes images to generate captions, identify potential privacy risks, and recommend music genres based on the visual content of the image. It leverages FastAPI, LlamaIndex, and multimodal AI models to perform these tasks.

## Features:
- **Image Caption Generation**: Generates a high-level description of the image.
- **Music Genre Recommendation**: Suggests a fitting music genre based on the image content.
- **Privacy Analysis**: Identifies potential privacy or security risks in the image.

## Technologies:
- **FastAPI** for the API backend
- **LlamaIndex** for multimodal AI image processing
- **Ollama** models for privacy and music genre generation
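
To illustrate how a multimodal Ollama model can be asked about an image, here is a minimal sketch that calls Ollama's REST endpoint `/api/generate` directly. It is not the repository's code, which goes through LlamaIndex. The prompts and the helper names `build_payload` and `ask_ollama` are assumptions made for illustration.

```python
import base64
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

# Hypothetical prompts; the repository's actual prompts may differ.
PROMPTS = {
    "caption": "Describe this image in one or two sentences.",
    "music_genre": "Name one music genre that fits the mood of this image.",
    "privacy": "List any privacy or security risks visible in this image.",
}


def build_payload(image_bytes: bytes, task: str, model: str = "llava-llama3") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": PROMPTS[task],
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def ask_ollama(image_bytes: bytes, task: str) -> str:
    """Send one prompt about the image and return the model's text answer."""
    body = json.dumps(build_payload(image_bytes, task)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the Ollama server running, `ask_ollama(open("test.jpg", "rb").read(), "caption")` would return the model's description of the image as plain text.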

## Steps to Run Locally

### 1. Clone the Repository:
```bash
git clone https://github.com/Aditya-Ranjan1234/ImageCaption-API.git
cd ImageCaption-API
```

### 2. Install Dependencies:
Install the required dependencies using pip:
```bash
pip install -r requirements.txt
```

### 3. Run the Ollama Server:
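To use the `llava-llama3` model, start the Ollama server, which listens on port 11434 by default, and then load the model from a second terminal:
```bash
# Terminal 1: start the Ollama server (skip this if Ollama already runs as a background service)
ollama serve
# Terminal 2: download the model on first use and load it
ollama run llava-llama3
```
The server is then available on port `11434` with the `llava-llama3` model loaded.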
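
Before starting the API, it can help to confirm that the Ollama server is reachable and that the model is present. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The function names `model_is_listed` and `check_ollama` are hypothetical helpers, not part of this repository.

```python
import json
import urllib.error
import urllib.request


def model_is_listed(tags: dict, model: str) -> bool:
    """Return True if `model` appears in an /api/tags payload, ignoring the tag suffix."""
    names = [entry.get("name", "") for entry in tags.get("models", [])]
    return any(name.split(":")[0] == model for name in names)


def check_ollama(model: str = "llava-llama3", base_url: str = "http://localhost:11434") -> bool:
    """Return True if the server answers and has the model available locally."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
            tags = json.loads(response.read())
    except (urllib.error.URLError, OSError):
        return False  # server not running or not reachable
    return model_is_listed(tags, model)
```

Calling `check_ollama()` returns `False` both when the server is down and when the model has not been downloaded yet, so a failed check is a prompt to revisit this step.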

### 4. Run the FastAPI Server:
Start the FastAPI server with Uvicorn, using one of the following commands:
```bash
# Default:
uvicorn app:app --host 0.0.0.0 --port 8000
# Alternative when running on CPU instead of GPU, where processing times are long:
uvicorn app:app --host 0.0.0.0 --port 8000 --timeout-keep-alive 0
```
Either command starts the API locally on `http://localhost:8000`.

### 5. Expose the API Publicly Using ngrok:
To make the API accessible from the internet, use ngrok:
```bash
ngrok http 8000
```
This will generate a public URL that you can use to interact with the API, such as `https://.ngrok-free.app`.

### 6. Test the API:
You can test the API using tools like **curl**, **Postman**, or any frontend by sending a POST request to the `/analyze-image` endpoint with an image.

### Example curl Command:
```bash
curl -X POST -F "file=@path/to/images/test.jpg" https://.ngrok-free.app/analyze-image
```
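
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl command: it uploads the image as the multipart form field `file` to `/analyze-image`. The helper names `encode_multipart` and `analyze_image` are hypothetical, and the code assumes the endpoint replies with JSON, which this README does not state explicitly.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(field: str, filename: str, content: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; return (body, content_type_header)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"


def analyze_image(path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST the image to /analyze-image and return the decoded JSON response."""
    image = Path(path)
    body, content_type = encode_multipart("file", image.name, image.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/analyze-image",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    # Assumption: the API answers with a JSON document.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())
```

To go through ngrok instead of the local server, pass the public URL as `base_url`.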

### 7. View the Output:
The response will include:
- **Image Description**: A high-level summary of the image.
- **Music Genre**: The most fitting music genre based on the image.
- **Privacy Information**: Any potential privacy risks or concerns detected in the image.
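
As a rough illustration of consuming these three fields, the sketch below formats a decoded response for display. The key names `description`, `music_genre`, and `privacy_info` are assumptions, since this README does not document the exact response schema; inspect a real response from `/analyze-image` and adjust them.

```python
def summarize(result: dict) -> str:
    """Format an analysis result as three labeled lines, tolerating missing keys."""
    # Key names are assumptions; check the actual response from /analyze-image.
    fields = [
        ("Image Description", "description"),
        ("Music Genre", "music_genre"),
        ("Privacy Information", "privacy_info"),
    ]
    return "\n".join(f"{label}: {result.get(key, 'n/a')}" for label, key in fields)
```

Missing keys are shown as `n/a` rather than raising an error, which makes a schema mismatch easy to spot.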

## Contributing:
Feel free to fork the repo, make changes, and submit pull requests for improvements!

## License:
This project is licensed under the GNU General Public License v3.0 (GPL-3.0). See the `LICENSE` file for details.
