---
title: "Qdrant Webinar: Cloud Inference"
emoji: 🖼️
colorFrom: blue
colorTo: yellow
sdk: docker
pinned: false
license: apache-2.0
short_description: "Personal Image Catalog"
---

# webinar-cloud-inference

This repository contains materials from the hands-on webinar "[How to Build a Multimodal Search Stack with One
API](https://www.youtube.com/watch?v=A8BBdGC2xKs)". It implements a personal image catalog that can be searched by image
or text, with additional object detection capabilities.

## Software Stack

The project is built using Python and JavaScript, integrating with external services through their APIs. **The
multimodal search capabilities are implemented using [Qdrant Cloud Inference](https://cloud.qdrant.io/), with
[FastAPI](https://fastapi.tiangolo.com/) serving as the backend API and a modern frontend built with
[Vite](https://vitejs.dev/) and [DaisyUI](https://daisyui.com/).**

## Prerequisites

**This project requires [Qdrant Cloud](https://cloud.qdrant.io/) as it uses Cloud Inference features that are not available in local Qdrant
instances.**

You'll need to sign up for a free account on [Qdrant Cloud](https://cloud.qdrant.io/) to get:
- A cluster URL to connect to your instance
- An API key for authentication
- Access to Cloud Inference capabilities
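
Once you have these, you can sanity-check connectivity with a few lines of Python (a minimal sketch using the official `qdrant-client` package; the environment variable names match the configuration below):

```python
import os

from qdrant_client import QdrantClient

# Assumes QDRANT_URL and QDRANT_API_KEY are set in your environment.
client = QdrantClient(
    url=os.environ["QDRANT_URL"],
    api_key=os.environ["QDRANT_API_KEY"],
)

# A cheap round-trip that confirms the cluster is reachable and the key is valid.
print(client.get_collections())
```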

You'll need Python 3.10 or higher installed, and we recommend using [uv](https://docs.astral.sh/uv/) for dependency management. For the
frontend, you'll need Node.js 18 or higher.

Since we'll be working with cloud inference, no GPU access is required. Install the backend and frontend dependencies:

```shell
# Backend dependencies
cd backend-app
uv sync

# Frontend dependencies
cd ../frontend-app
npm install
```

The project uses different models and services for different tasks, specifically:
- **YOLO** for object detection
- **CLIP** for vision-language embeddings
- **Qdrant Cloud Inference** for embedding generation and vector search

**While you can swap these models with alternatives in the code, we cannot guarantee the system will maintain the same
performance level.**
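
As an illustration of the detection step, here is a minimal sketch of running YOLO through the `ultralytics` package on a single image; the image path is a placeholder, and the actual wiring in this repository may differ:

```python
from ultralytics import YOLO

# Load the same checkpoint as the YOLO_MODEL default shown in the configuration below.
model = YOLO("yolo11s.pt")

# Run detection on a single image; "photo.jpg" is a placeholder path.
results = model("photo.jpg")

# Each detected box carries a class id and a confidence score.
for box in results[0].boxes:
    print(results[0].names[int(box.cls)], float(box.conf))
```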

### Configuration

Create a `.env` file in the `backend-app` directory with the following entries:

```dotenv
# Qdrant configuration
QDRANT_URL=your_qdrant_cluster_url
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_COLLECTION_NAME=your_collection_name

# YOLO Model Configuration (optional)
YOLO_MODEL=yolo11s.pt

# Feature Flags (optional)
ENABLE_INGEST=true
```

You can rename `.env.example` to `.env` and fill in the values, or set these as environment variables.

### Feature Flags

The application supports feature flags to control functionality:

- **ENABLE_INGEST**: Controls whether the ingest functionality is available. When set to `false`, both the frontend and backend will hide the ingest features. Defaults to `true`.
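
Reading such a flag on the backend is straightforward; a minimal sketch using only the standard library (not necessarily the exact code used in this project):

```python
import os

# Treat anything other than an explicit "false" as enabled, defaulting to true.
ENABLE_INGEST = os.getenv("ENABLE_INGEST", "true").lower() != "false"
```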

#### Qdrant Cloud Setup

Since this project uses Cloud Inference features, you must use Qdrant Cloud. After signing up:

1. Create a new cluster in your Qdrant Cloud dashboard
2. Note your cluster URL and API key
3. Use these credentials in your `.env` file

The Cloud Inference feature allows creating vectors from raw data by sending it directly to the Qdrant cloud instance,
eliminating the need for local model inference.
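
In practice, this means you pass raw text or image URLs where the client API would otherwise expect precomputed vectors. A minimal sketch with `qdrant-client`; the `cloud_inference` flag (available in recent client versions) and the CLIP model identifiers are illustrative assumptions, not necessarily what this repository uses:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(
    url="your_qdrant_cluster_url",
    api_key="your_qdrant_api_key",
    # Send raw data to the cluster for embedding instead of embedding locally.
    cloud_inference=True,
)

# Upsert an image point; the vector is computed server-side by Cloud Inference.
client.upsert(
    collection_name="your_collection_name",
    points=[
        models.PointStruct(
            id=1,
            vector=models.Image(
                image="https://example.com/image.jpg",
                model="qdrant/clip-vit-b-32-vision",  # illustrative model name
            ),
        )
    ],
)

# Query with raw text; the matching CLIP text tower embeds it in the cloud.
hits = client.query_points(
    collection_name="your_collection_name",
    query=models.Document(
        text="cat sitting on a chair",
        model="qdrant/clip-vit-b-32-text",  # illustrative model name
    ),
    limit=5,
)
print(hits.points)
```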

## Usage

### Development Setup

For development, you can run the services separately:

#### Backend Setup

1. **Navigate to the backend directory**
```bash
cd backend-app
```

2. **Install dependencies**
```bash
uv sync
```

3. **Configure environment variables**
Create a `.env` file in the `backend-app` directory with your Qdrant credentials.

4. **Set up the Qdrant collection** (see the sketch after this list)
```bash
uv run python setup.py
```

5. **Start the backend server**
```bash
# For development with uvicorn
uv run uvicorn main:app --reload --host 0.0.0.0 --port 7860

# For production with gunicorn
uv run gunicorn main:app -c gunicorn.conf.py
```
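
As a rough idea of what a collection-setup script like `setup.py` does, here is a minimal sketch that creates a collection sized for CLIP ViT-B/32 embeddings (512 dimensions); the parameters are assumptions, not a copy of the actual script:

```python
import os

from qdrant_client import QdrantClient, models

client = QdrantClient(
    url=os.environ["QDRANT_URL"],
    api_key=os.environ["QDRANT_API_KEY"],
)

collection = os.environ["QDRANT_COLLECTION_NAME"]

# Create the collection only if it does not exist yet, sized for CLIP ViT-B/32.
if not client.collection_exists(collection):
    client.create_collection(
        collection_name=collection,
        vectors_config=models.VectorParams(size=512, distance=models.Distance.COSINE),
    )
```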

#### Frontend Setup

1. **Navigate to the frontend directory**
```bash
cd frontend-app
```

2. **Install dependencies**
```bash
npm install
```

3. **Start the development server**
```bash
npm run dev
```

### Production Deployment

For production, use the unified Docker container (see the Docker Deployment section below).

## API Documentation

Once the backend is running, visit:
- **Interactive API docs**: http://localhost:7860/docs
- **ReDoc documentation**: http://localhost:7860/redoc

## Usage Examples

### Ingesting Images

**Via API:**
```bash
curl -X POST "http://localhost:7860/api/v1/ingest" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/image.jpg"}'
```
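
The same call from Python, as a minimal sketch with the `requests` package (the endpoint and payload mirror the curl command above):

```python
import requests

# POST a publicly reachable image URL to the ingest endpoint.
response = requests.post(
    "http://localhost:7860/api/v1/ingest",
    json={"url": "https://example.com/image.jpg"},
)
response.raise_for_status()
print(response.json())
```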

### Searching Images

**Via API:**

```bash
curl "http://localhost:7860/api/v1/search?query=cat%20sitting%20on%20a%20chair&limit=5"
```
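
And the equivalent search call from Python (again a sketch with `requests`, mirroring the curl command above):

```python
import requests

# The text query goes in the query string; limit caps the number of results.
response = requests.get(
    "http://localhost:7860/api/v1/search",
    params={"query": "cat sitting on a chair", "limit": 5},
)
response.raise_for_status()
print(response.json())
```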

**Via Frontend:**

Open http://localhost:7860 and use the web interface.

## Docker Deployment

The project uses a unified Docker container that combines both the backend API and frontend application. This simplifies
deployment and reduces resource usage.

### Quick Start with Docker

1. **Build the unified container:**
```bash
# Using the build script
./build.sh

# Or manually
docker build -t webinar-app .
```

2. **Run the application:**
```bash
# With environment variables
docker run -p 7860:7860 --env-file .env webinar-app

# Or with inline environment variables
docker run -p 7860:7860 \
-e QDRANT_URL=your_qdrant_cluster_url \
-e QDRANT_API_KEY=your_qdrant_api_key \
-e QDRANT_COLLECTION_NAME=your_collection_name \
webinar-app
```

3. **Access the application:**
- **Frontend**: http://localhost:7860
- **API Documentation**: http://localhost:7860/docs
- **Health Check**: http://localhost:7860/health

### Environment Variables

Create a `.env` file in the root directory with your Qdrant credentials:

```dotenv
QDRANT_URL=your_qdrant_cluster_url
QDRANT_API_KEY=your_qdrant_api_key
QDRANT_COLLECTION_NAME=your_collection_name
```

For a detailed explanation of the system and its components, refer to the webinar recording and the source code in this
repository.