https://github.com/rooshikeshbhatt/item-inspector-ai

AI-based product condition detection using BLIP-2 + FastAPI + Phi-4 (Ollama)

# 🕵️‍♂️ Item-Inspector AI

![Python](https://img.shields.io/badge/Python-3.10+-blue?logo=python)
![FastAPI](https://img.shields.io/badge/FastAPI-🚀-brightgreen?logo=fastapi)
![PyTorch](https://img.shields.io/badge/PyTorch-Used-red?logo=pytorch)
![Transformers](https://img.shields.io/badge/HuggingFace-Transformers-yellow?logo=huggingface)
![BLIP2](https://img.shields.io/badge/BLIP--2-Salesforce-purple)
![Ollama](https://img.shields.io/badge/Ollama-LLM%20Runtime-lightgrey?logo=ollama)
![License](https://img.shields.io/badge/License-MIT-green)
![AI Game Included](https://img.shields.io/badge/Easter_Egg%3A-AI_Tic_Tac_Toe-ff69b4)

**Visual AI for Product Condition Assessment & Human-like Reporting**

> Upload product images, let BLIP-2 understand the item, generate human-like condition reports with Phi-4, and enjoy the magic of zero-shot image-to-text reasoning. Also… there's a secret mini-game.

## 📚 Table of Contents
- [Features](#-features)
- [Banner](#-banner)
- [Project Structure](#-project-structure)
- [Installation Guide](#-installation-guide)
- [Hardware & GPU Setup](#-hardware--gpu-setup)
- [Bonus: Tic-Tac-Toe AI](#-bonus-tic-tac-toe-ai)
- [Technology Stack](#-technology-stack)
- [GitHub Topics](#-github-topics)
- [License](#-license)
- [Contact](#-contact)

---

## ✨ Features

- **AI Product Recognition** – Detects object type: Watch, Shoe, Phone, etc.
- **Material Identification** – Metal, Leather, Glass, Suede? We got it.
- **Visual Condition Tags** – Custom per-item labels (like "scratched glass" or "torn strap").
- **Score Calculation** – Evaluates product damage level and assigns a score from 4 to 10.
- **Natural Language Report** – Uses Phi-4 LLM to describe condition in ~50 human-like words.
- **Frontend Upload UI** – Drag, drop, analyze.
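
The README does not say how the score is derived from the detected tags. As a rough illustration only, the sketch below assumes each condition tag carries a fixed penalty that is subtracted from 10 and clamped to the 4 to 10 range stated above. `TAG_PENALTIES`, its values, and `condition_score` are hypothetical names, not taken from the project.

```python
# Hypothetical penalties per condition tag; the real project may weight tags differently.
TAG_PENALTIES = {
    "scratched glass": 2.0,
    "torn strap": 3.0,
    "minor scuffs": 1.0,
}

MIN_SCORE, MAX_SCORE = 4, 10  # range stated in the feature list


def condition_score(tags: list[str]) -> int:
    """Subtract a penalty for each recognised tag, then clamp to 4..10."""
    penalty = sum(TAG_PENALTIES.get(tag.lower(), 0.0) for tag in tags)
    raw = MAX_SCORE - penalty
    return int(max(MIN_SCORE, min(MAX_SCORE, round(raw))))
```

Unknown tags contribute no penalty in this sketch, so an item with no recognised damage keeps the maximum score.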

---

## 📸 Banner

![Banner](docs/banner.PNG)

---

## 🗂 Project Structure

```text
Item-Inspector AI/
├── backend/
│   ├── app.py               # FastAPI application
│   ├── requirements.txt
│   └── python_gpu_test.py   # Checks whether TensorFlow, PyTorch & NumPy run on the GPU
├── frontend/
│   └── index.html           # Web UI for uploading images
├── sample_images/
│   └── example_watch.jpg    # Example test image
├── just_for_fun/
│   └── tic_tac_toe.py       # Tic-Tac-Toe AI game
└── README.md
```
---

## 🛠 Installation Guide

### 🔗 Prerequisites

- Python 3.10+ (Python 3.10.11 recommended for GPU use on Windows)
- GitHub Desktop or Git CLI
- Ollama installed, with the Phi-4 model (`phi4:14b-q4_K_M`) downloaded

---

### 📥 1. Clone the Repo

```text
git clone https://github.com/Rooshikesh/Item-Inspector-AI.git
cd Item-Inspector-AI/backend
```
---

### 📦 2. Create Virtual Environment
```text
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
```
---

### 📦 3. Install Dependencies

`pip install -r requirements.txt`

---

### 🧠 4. Start Ollama with Phi-4
```text
ollama run phi4:14b-q4_K_M
```
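
Once the model is available, the backend can ask it for a report over Ollama's local HTTP API. The sketch below is one possible way to do that, assuming Ollama is serving on its default port 11434; the prompt wording and the helper names `build_report_request` and `generate_report` are illustrative and not taken from the project's `app.py`.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "phi4:14b-q4_K_M"


def build_report_request(caption: str, tags: list[str], score: int) -> dict:
    """Build the JSON payload for a single non-streaming generation call."""
    prompt = (
        "Write a condition report of about 50 words for this product.\n"
        f"Description: {caption}\n"
        f"Condition tags: {', '.join(tags) or 'none'}\n"
        f"Score: {score}/10"
    )
    return {"model": MODEL, "prompt": prompt, "stream": False}


def generate_report(caption: str, tags: list[str], score: int, timeout: int = 120) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    body = json.dumps(build_report_request(caption, tags, score)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"].strip()
```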
---

### 🚀 5. Launch FastAPI
```text
uvicorn app:app --reload
```
Go to: `http://127.0.0.1:8000/docs`
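
The interactive docs page is generated from the schema that FastAPI serves at `/openapi.json` by default. If you prefer to discover the available endpoints from a script, a minimal sketch could look like this; `list_routes` and `fetch_schema` are illustrative helpers, and no particular endpoint name is assumed.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


def list_routes(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    routes = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            if method.lower() in HTTP_METHODS:  # skip non-operation keys
                routes.append((method.upper(), path))
    return sorted(routes)


def fetch_schema(base_url: str = BASE_URL) -> dict:
    """Download the OpenAPI schema from a running FastAPI server."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=10) as resp:
        return json.load(resp)
```

With the server running, `list_routes(fetch_schema())` returns the method and path of every route the backend exposes.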

---

### 🌐 6. Use Web Interface (Optional)

Open `frontend/index.html` in your browser, then drag and drop product images onto the page.

---

## ⚡ Hardware & GPU Setup

If you plan to run BLIP-2 on a GPU for best performance, keep the following in mind:

### ✅ Hardware Requirements
- **NVIDIA GPU** with at least **8–12 GB VRAM**
- Recommended: **RTX 3060 or higher**
- **CUDA-compatible drivers** installed
- Check GPU visibility with: `nvidia-smi`
- **Python**: Version **3.10+**

### ✅ Python Environment for GPU
- Install PyTorch with CUDA support:
```text
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
```
- Our code already includes:
```text
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch_dtype=torch.float16
```
This ensures your models run on GPU if available.
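
The snippet above requests `float16` whatever device is selected. Half precision on a CPU can be slow or, in some PyTorch versions, unsupported for certain operations, so one possible refinement is to choose the dtype together with the device. This is a sketch under that assumption, not the project's actual code; `pick_device_and_dtype` and `torch_settings` are hypothetical helpers.

```python
def pick_device_and_dtype(cuda_available: bool) -> tuple[str, str]:
    """Choose float16 on a GPU and fall back to float32 on the CPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def torch_settings():
    """Resolve the choice to real torch objects (requires PyTorch)."""
    import torch  # imported lazily so the helper above works without PyTorch

    device_name, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    return torch.device(device_name), getattr(torch, dtype_name)
```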

### ✅ BLIP-2 Optimization Settings
- Make sure BLIP-2 loads with:
```text
device_map="auto", torch_dtype=torch.float16
```
- Images are correctly converted to RGB before inference:
```text
img = Image.open(file.file).convert("RGB")
```
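
Put together, the two settings above might be used roughly as follows with Hugging Face Transformers. This is a sketch, not the project's `app.py`: the checkpoint name is an assumption (the README does not say which BLIP-2 variant is used), `load_blip2` and `caption_image` are illustrative names, and `device_map="auto"` requires the `accelerate` package.

```python
CHECKPOINT = "Salesforce/blip2-opt-2.7b"  # assumption: the README does not name a checkpoint


def load_blip2(checkpoint: str = CHECKPOINT):
    """Load the processor and model with the settings quoted above."""
    import torch
    from transformers import Blip2ForConditionalGeneration, Blip2Processor

    processor = Blip2Processor.from_pretrained(checkpoint)
    model = Blip2ForConditionalGeneration.from_pretrained(
        checkpoint, device_map="auto", torch_dtype=torch.float16
    )
    return processor, model


def caption_image(image_bytes: bytes, processor, model, max_new_tokens: int = 30) -> str:
    """Convert the upload to RGB, run BLIP-2, and return the decoded caption."""
    import io

    import torch
    from PIL import Image

    # RGBA, palette, or grayscale uploads are normalised to three channels.
    img = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    inputs = processor(images=img, return_tensors="pt").to(model.device, torch.float16)
    with torch.no_grad():
        ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return processor.batch_decode(ids, skip_special_tokens=True)[0].strip()
```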
### 🧪 Verify GPU with Our Utility Script
Run the included [`python_gpu_test.py`](backend/python_gpu_test.py) file to confirm whether TensorFlow, PyTorch, and NumPy are GPU-ready:
```text
cd backend
python python_gpu_test.py
```
This script will print the detected GPUs, framework versions, and whether each is using the GPU or CPU.
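
For readers who want to see what such a check involves, here is a minimal sketch of a similar report. It is not the contents of `python_gpu_test.py`; `gpu_report` is a hypothetical helper that skips frameworks that are not installed and treats NumPy as CPU-only.

```python
import importlib


def gpu_report() -> dict:
    """Report version and GPU visibility for each framework that is installed."""
    report = {}
    for name in ("numpy", "torch", "tensorflow"):
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = {"installed": False}
            continue
        entry = {"installed": True, "version": module.__version__}
        if name == "torch":
            entry["gpu"] = module.cuda.is_available()
        elif name == "tensorflow":
            entry["gpu"] = bool(module.config.list_physical_devices("GPU"))
        else:
            entry["gpu"] = False  # NumPy itself computes on the CPU
        report[name] = entry
    return report
```

Printing `gpu_report()` gives one entry per framework, which makes it easy to spot a CPU-only PyTorch build before loading BLIP-2.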

---

## 🤖 Bonus: Tic-Tac-Toe AI

When you need a break from debugging and BLIP-2 hallucinations:
```text
cd just_for_fun
python tic_tac_toe.py
```
* Supports easy, medium, and hard modes
* Uses Minimax algorithm in Hard mode to destroy your confidence 🔥
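
For the curious, the Minimax idea behind Hard mode can be sketched in a few lines. This is a generic textbook version, not the code in `tic_tac_toe.py`; the board is assumed to be a list of nine cells holding `"X"`, `"O"`, or `" "`.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, otherwise None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def minimax(board, player, ai):
    """Score the position for `ai` (+1 win, 0 draw, -1 loss) with `player` to move."""
    won = winner(board)
    if won is not None:
        return 1 if won == ai else -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    scores = []
    for i in range(9):
        if board[i] == " ":
            board[i] = player          # try the move
            scores.append(minimax(board, other, ai))
            board[i] = " "             # undo it
    # The AI maximises its score; the opponent is assumed to minimise it.
    return max(scores) if player == ai else min(scores)


def best_move(board, ai):
    """Return the index of the empty cell with the highest minimax score."""
    other = "O" if ai == "X" else "X"
    best_score, move = -2, None
    for i in range(9):
        if board[i] == " ":
            board[i] = ai
            score = minimax(board, other, ai)
            board[i] = " "
            if score > best_score:
                best_score, move = score, i
    return move
```

Because the search explores every continuation, the AI never loses: it takes a win when one exists and otherwise blocks any immediate threat.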

---

## 💡 Technology Stack

* BLIP-2 (Salesforce) - Vision Language
* Phi-4 (Ollama) - Language Generation
* FastAPI - Backend Framework
* HTML/JS - Minimal Frontend
* Hugging Face Transformers
* PyTorch

---

## 🏷️ GitHub Topics

ai, blip2, phi4, fastapi, transformers, computer-vision, image-classification,
product-inspection, natural-language-generation, multimodal-ai, semantic-analysis,
ecommerce-ai, repairtech, humanlike-ai, condition-scoring, pytorch, webapi,
backend, frontend, python

---

## 📄 License

MIT: use it, share it, modify it. Just don't forget to smile when it works.

---

## ✉️ Contact

**Rooshikesh Bhatt**
rooshikeshbhatt@gmail.com

---