https://github.com/gursv/langlens

LangLens is an application built on OpenAI's GPT-3.5 and Salesforce's base VQA model. For now, it can caption images, detect objects in them, and answer some basic questions about them.
Host: GitHub
URL: https://github.com/gursv/langlens
Owner: GURSV
Created: 2024-11-05T11:27:49.000Z (6 months ago)
Default Branch: main
Last Pushed: 2024-11-23T17:21:12.000Z (6 months ago)
Last Synced: 2025-01-25T15:39:19.359Z (4 months ago)
Topics: ai, captioning-images, gpt-3, huggingface-models, langchain-python, llms, object-detection, openai, openai-api, python, question-answering
Language: Python
Homepage:
Size: 5.72 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# LangLens ֎

LangLens is an AI-powered application that combines OpenAI's GPT, Salesforce's base Visual Question Answering (VQA) model, a fine-tuned version of that VQA model, and Facebook's detr-resnet-50. It can:
- Generate image captions.
- Detect objects in images with high accuracy.
- Answer basic image-related questions.

## Features
- **Image Captioning:** Provides detailed captions for uploaded images.
- **Object Detection:** Identifies objects within images effectively.
- **Question Answering:** Responds to queries about the content of an image.
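
As a rough illustration of the object-detection feature, the sketch below runs `facebook/detr-resnet-50` through the Hugging Face `transformers` object-detection pipeline and condenses the result into label counts. It is not taken from the LangLens source: the function names `detect_objects` and `summarize_detections` and the 0.9 score threshold are assumptions made for this example.

```python
from collections import Counter


def detect_objects(image_path, threshold=0.9):
    """Run DETR on one image and return the raw pipeline detections.

    Assumption: transformers, torch, timm and Pillow are installed.
    The model weights are downloaded on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    detector = pipeline("object-detection", model="facebook/detr-resnet-50")
    # Each detection is a dict with "score", "label" and "box" keys.
    return detector(image_path, threshold=threshold)


def summarize_detections(detections, min_score=0.9):
    """Count detected labels whose confidence is at least min_score."""
    counts = Counter(d["label"] for d in detections if d["score"] >= min_score)
    # most_common() orders labels from most to least frequent.
    return dict(counts.most_common())
```

Calling `summarize_detections(detect_objects(path))` on an image from `images-for-test/` would give a compact view of what was found; the summarizer also works on any list of dicts with `label` and `score` keys.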

## Installation

1. Clone the repository:
   - `git clone https://github.com/GURSV/LangLens.git`
   - `cd LangLens`

2. Install dependencies:
   - `pip install -r requirements.txt`

## Usage
Run the fine-tuning script:
- `python fine_tune_colab.py`

Run the main script:
- `python main.py`

This will enable image processing and interactive question answering.
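
For readers curious what the question-answering step might look like, here is a minimal sketch using the BLIP VQA classes from `transformers`. It is an assumption that the "Salesforce VQA base model" corresponds to the `Salesforce/blip-vqa-base` checkpoint, and the helpers `normalize_question` and `answer_question` are hypothetical names, not functions from `main.py` or `tools.py`.

```python
def normalize_question(question):
    """Collapse whitespace and make sure the question ends with '?'."""
    question = " ".join(question.split())
    if not question:
        raise ValueError("question must not be empty")
    return question if question.endswith("?") else question + "?"


def answer_question(image_path, question, model_id="Salesforce/blip-vqa-base"):
    """Answer one question about one image with a BLIP VQA checkpoint.

    Assumption: model_id names the base checkpoint; a fine-tuned copy
    saved locally could be passed instead.
    """
    # Heavy imports stay local so normalize_question works without them.
    from PIL import Image
    from transformers import BlipForQuestionAnswering, BlipProcessor

    processor = BlipProcessor.from_pretrained(model_id)
    model = BlipForQuestionAnswering.from_pretrained(model_id)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(image, normalize_question(question), return_tensors="pt")
    output_ids = model.generate(**inputs)
    return processor.decode(output_ids[0], skip_special_tokens=True)
```

Loading the processor and model on every call keeps the sketch short; an interactive app would more likely load them once and reuse them across questions.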

Additional utilities are provided in:
- tools.py - Supplementary tools for model interaction.

Dataset:
- The model uses a CSV dataset for fine-tuning. Ensure the dataset is formatted appropriately for training.
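
The README does not specify the CSV layout, so the following is only a sketch of how a dataset could be checked before fine-tuning. The column names `image`, `question` and `answer` are hypothetical; adjust `REQUIRED_COLUMNS` to whatever the fine-tuning script actually expects.

```python
import csv

REQUIRED_COLUMNS = ("image", "question", "answer")  # hypothetical column names


def validate_rows(csv_file, required=REQUIRED_COLUMNS):
    """Return a list of problems found in a CSV source.

    csv_file may be an open text file or any iterable of CSV lines.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [c for c in required if c not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    for row in reader:
        for column in required:
            # Short rows yield None for absent fields, hence the `or ""`.
            if not (row[column] or "").strip():
                problems.append(f"line {reader.line_num}: empty '{column}'")
    return problems
```

With a real file this could be called as `validate_rows(open(path, newline="", encoding="utf-8"))`, where `path` points at the dataset; an empty list means no problems were detected.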

Images folder:
- images/ - Images used to fine-tune the model
- images-for-test/ - For testing the project
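
A small helper such as the one below can confirm that either folder actually contains usable images before training or testing. It is a sketch only: the name `list_images` and the set of accepted file extensions are assumptions, not part of the repository.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def list_images(folder):
    """Return image files directly inside folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

For example, `len(list_images("images"))` gives the number of training images found, and the same call with `"images-for-test"` covers the test set.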

To run the project locally, use `streamlit run main.py`; Streamlit serves the app at http://localhost:8501 by default.

Screenshots of the application in use:

![image](https://github.com/user-attachments/assets/4867d31b-9852-4495-b70f-9588b82675cd)

![image](https://github.com/user-attachments/assets/eb4a22d1-466c-4edd-9eaf-b5b2d6f0a540)

![image](https://github.com/user-attachments/assets/3812bb5c-aea3-4fca-b4e7-313fbaa8c20a)

![image](https://github.com/user-attachments/assets/40a1c150-2ab3-4f47-9515-b78129848050)

![image](https://github.com/user-attachments/assets/36d1ca7a-5956-46bb-8d3c-588deeed0c49)

etc...

Thank you.
