https://github.com/paganini2008/fastrag

A full-stack, multi-tenant RAG platform that turns documents, URLs, and FAQs into a searchable knowledge base with AI-powered question answering. Built with Django + React, powered by OpenAI embeddings and Claude / GPT-4o LLMs.
Host: GitHub
Owner: paganini2008
Created: 2026-03-20T06:40:01.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-03-20T09:17:46.000Z (4 months ago)
Last Synced: 2026-03-21T02:02:44.424Z (3 months ago)
Topics: llm, minio, postgresql, qdrant, rag
Language: Python
Homepage:
Size: 292 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🧠 RAG Platform

**Production-grade Multi-tenant Retrieval-Augmented Generation Platform**

[![Python](https://img.shields.io/badge/Python-3.13-3776AB?style=flat-square&logo=python&logoColor=white)](https://python.org)

[![Django](https://img.shields.io/badge/Django-6.x-092E20?style=flat-square&logo=django&logoColor=white)](https://djangoproject.com)

[![React](https://img.shields.io/badge/React-19-61DAFB?style=flat-square&logo=react&logoColor=black)](https://react.dev)

[![TypeScript](https://img.shields.io/badge/TypeScript-5.9-3178C6?style=flat-square&logo=typescript&logoColor=white)](https://typescriptlang.org)

[![Tailwind CSS](https://img.shields.io/badge/Tailwind-4.x-06B6D4?style=flat-square&logo=tailwindcss&logoColor=white)](https://tailwindcss.com)

[![Qdrant](https://img.shields.io/badge/Qdrant-1.17-FF4785?style=flat-square)](https://qdrant.tech)

[![Tests](https://img.shields.io/badge/Tests-27%20passed-22C55E?style=flat-square&logo=pytest&logoColor=white)]()

[![License](https://img.shields.io/badge/License-MIT-8B5CF6?style=flat-square)](LICENSE)

A full-stack, multi-tenant RAG platform that turns documents, URLs, and FAQs into a searchable knowledge base with AI-powered question answering. Built with Django + React, powered by OpenAI embeddings and Claude / GPT-4o LLMs.

[Features](#-features) · [Architecture](#-architecture) · [Tech Stack](#-tech-stack) · [Quick Start](#-quick-start) · [API](#-api-overview) · [Testing](#-testing)



---

## ✨ Features

**Knowledge Management**

- 📁 Upload PDF, DOCX, XLSX, PPTX, TXT, HTML, Markdown

- 🌐 Import web pages (static + Playwright dynamic rendering)

- ❓ FAQ management with bulk import

- 🔄 Automatic async ingestion pipeline

- ♻️ Per-KB reindex with embedding model switching + live progress

**AI-Powered Retrieval**

- 🔍 Vector similarity search (cosine, top-K)

- 🤖 Full RAG answers via Claude or GPT-4o

- 📊 Score-based relevance ranking

- 📝 Prompt builder with token estimation

**Multi-tenant & Secure**

- 🏢 Full tenant isolation at DB + Qdrant level

- 🔑 JWT authentication + API Key for agents

- 👥 Role-based access (owner / admin / member)

- 📋 Audit logs for every search & answer

**Developer Experience**

- 📖 Auto-generated Swagger / OpenAPI 3.0 docs

- ⚡ Celery async processing with Redis

- 🧪 27 pytest tests with all external services mocked

- 🐳 Docker-ready, `uv` package manager

---

## 🏗 Architecture

### System Overview

```mermaid
graph TB
    subgraph Client["🖥️ Browser (React 19 + Vite)"]
        UI["React UI<br/>Tailwind CSS + Ant Design"]
        RTK["RTK Query<br/>State & Cache"]
        AUTH["JWT Auth<br/>Redux Slice"]
    end

    subgraph Gateway["🌐 API Gateway"]
        VITE_PROXY["Vite Dev Proxy<br/>:5173 → :8000"]
        NGINX["Nginx<br/>Production Reverse Proxy"]
    end

    subgraph Backend["⚙️ Django 6 + DRF"]
        direction TB
        URLS["URL Router<br/>/api/v1/"]
        MW["TenantMiddleware<br/>JWT → tenant_id"]

        subgraph Apps["Django Apps"]
            AUTH_APP["accounts<br/>JWT + API Key Auth"]
            KB["knowledge_bases<br/>CRUD"]
            DOCS["documents<br/>Upload · URL · Chunks"]
            FAQ_APP["faq<br/>Q&A Management"]
            RETRIEVAL["retrieval<br/>Search · Prompt · Answer"]
            AUDIT["audit<br/>Retrieval & Query Logs"]
        end

        subgraph Pipeline["Ingestion Pipeline"]
            direction LR
            PARSE["parsers<br/>PDF·DOCX·XLSX·HTML"]
            CHUNK["chunking<br/>LlamaIndex Splitter"]
            EMBED_SVC["embeddings<br/>OpenAI API"]
            VSTORE["vector_store<br/>Qdrant Client"]
        end
    end

    subgraph Workers["🔄 Celery Workers"]
        INGEST["ingest_document<br/>parse→chunk→embed→index"]
        INGEST_URL["ingest_url<br/>fetch→parse→chunk→embed"]
        EMBED_FAQ["embed_faq_item<br/>question+answer→embed"]
    end

    subgraph Storage["💾 Data Layer"]
        PG[("PostgreSQL<br/>schema: rag")]
        QDRANT[("Qdrant<br/>document_chunks")]
        MINIO[("MinIO<br/>raw files")]
        REDIS[("Redis<br/>Celery broker")]
    end

    subgraph LLM["🤖 AI Services"]
        OPENAI["OpenAI<br/>text-embedding-3-small<br/>GPT-4o / GPT-4o-mini"]
        ANTHROPIC["Anthropic<br/>Claude Sonnet 4.6<br/>Claude Haiku 4.5"]
    end

    UI --> RTK --> VITE_PROXY --> URLS
    URLS --> MW --> Apps
    AUTH_APP --> PG
    KB --> PG
    DOCS --> PG
    DOCS --> MINIO
    DOCS --> REDIS
    FAQ_APP --> PG
    FAQ_APP --> REDIS
    RETRIEVAL --> QDRANT
    RETRIEVAL --> LLM
    AUDIT --> PG
    REDIS --> Workers
    Workers --> PARSE --> CHUNK --> EMBED_SVC --> VSTORE
    EMBED_SVC --> OPENAI
    VSTORE --> QDRANT
    Workers --> MINIO
```
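
The `TenantMiddleware` node above maps a JWT to a `tenant_id`. The sketch below illustrates that idea with the standard library only: it verifies an HS256 signature and then reads a claim. The claim name `tenant_id`, the helper names, and the shared secret are assumptions made for illustration; the project uses simplejwt, whose token layout is not documented in this README. Expiry and algorithm checks are left out to keep the sketch short.

```python
import base64
import hashlib
import hmac
import json


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    # Restore the padding that compact JWT encoding strips.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_hs256(claims: dict, secret: str) -> str:
    """Build a compact HS256 JWT (used here only to produce a demo token)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    signature = _b64url_encode(_sign(f"{header}.{payload}", secret))
    return f"{header}.{payload}.{signature}"


def tenant_id_from_token(token: str, secret: str) -> str:
    """Verify the signature, then return the hypothetical tenant_id claim."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header, payload, signature = parts
    expected = _sign(f"{header}.{payload}", secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(_b64url_decode(payload))["tenant_id"]
```

A middleware built on this idea would attach the returned value to the request so that every later query can be scoped to that tenant.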

---

### RAG Ingestion Pipeline

```mermaid
flowchart LR
    A([📄 File Upload\nor URL]) --> B

    subgraph B["① Parse"]
        B1[PDF → pypdf\nDOCX → python-docx\nXLSX → openpyxl\nHTML → BeautifulSoup4]
    end

    B --> C

    subgraph C["② Chunk"]
        C1[LlamaIndex\nSentenceSplitter\nchunk_size=512 tokens\noverlap=64 tokens]
    end

    C --> D

    subgraph D["③ Embed"]
        D1[OpenAI\ntext-embedding-3-small\nor 3-large per KB]
    end

    D --> E

    subgraph E["④ Index"]
        E1[Qdrant upsert\nper-KB collection\nor shared collection]
    end

    E --> F([✅ indexed\nDocument.status])

    style A fill:#4f46e5,color:#fff
    style F fill:#10b981,color:#fff
```
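
To make the chunking step concrete, here is a minimal sliding-window sketch using the diagram's `chunk_size=512` and `overlap=64`. It is a simplification, not LlamaIndex's `SentenceSplitter`: it splits on whitespace instead of using a tokenizer and ignores sentence boundaries. The function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of whitespace-separated tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    tokens = text.split()
    step = chunk_size - overlap  # each new window starts this many tokens later
    chunks: list[str] = []

    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks
```

With these defaults, consecutive chunks share 64 tokens, so a sentence that straddles a boundary still appears whole in at least one chunk in most cases.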

---

### RAG Query Flow

```mermaid
sequenceDiagram
    actor User
    participant FE as React Frontend
    participant API as Django API
    participant QD as Qdrant
    participant LLM as Claude / GPT-4o
    participant DB as PostgreSQL

    User->>FE: Ask a question
    FE->>API: POST /api/v1/rag/answer/\n{query, knowledge_base_id, llm_model}
    API->>API: Embed query\n(OpenAI text-embedding-3-small)
    API->>QD: query_points(vector, filter={tenant_id, kb_id}, top_k=5)
    QD-->>API: Top-K chunks with scores
    API->>API: Build prompt\n(system + context + question)
    API->>LLM: Chat completion request
    LLM-->>API: Generated answer
    API->>DB: Save QueryLog\n(query, answer, tokens, latency_ms)
    API-->>FE: {answer, sources, usage, latency_ms}
    FE-->>User: Display answer + cited sources
```
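
The "Build prompt" step (system + context + question) can be sketched as follows. This is an illustration under stated assumptions, not the project's prompt builder: the system prompt text, the chunk dictionary keys (`text`, `source`, `score`), the token budget, and the 4-characters-per-token heuristic are all hypothetical.

```python
SYSTEM_PROMPT = (
    "Answer the question using only the provided context. "
    "If the context is insufficient, say so."
)


def estimate_tokens(text: str) -> int:
    # Rough heuristic (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


def build_prompt(question: str, chunks: list[dict], max_context_tokens: int = 2000) -> dict:
    """Assemble system + context + question, keeping the best-scoring chunks."""
    ranked = sorted(chunks, key=lambda c: c["score"], reverse=True)

    context_parts: list[str] = []
    used = 0
    for index, chunk in enumerate(ranked, start=1):
        cost = estimate_tokens(chunk["text"])
        if used + cost > max_context_tokens:
            break  # budget exhausted; lower-ranked chunks are dropped
        context_parts.append(f"[{index}] ({chunk['source']}) {chunk['text']}")
        used += cost

    context = "\n\n".join(context_parts)
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return {
        "system": SYSTEM_PROMPT,
        "user": user,
        "estimated_tokens": estimate_tokens(SYSTEM_PROMPT) + estimate_tokens(user),
    }
```

Numbering the context entries lets the model refer to sources by index, which makes it easier to map the answer back to the `sources` list returned to the frontend.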

---

### Multi-tenant Data Isolation

```mermaid
erDiagram
    TENANT {
        uuid id PK
        string name
        string slug
        string plan
    }

    USER {
        uuid id PK
        uuid tenant_id FK
        string email
        string role
    }

    KNOWLEDGE_BASE {
        uuid id PK
        uuid tenant_id FK
        string name
        int chunk_size
        int retrieval_top_k
    }

    DOCUMENT {
        uuid id PK
        uuid tenant_id FK
        uuid knowledge_base_id FK
        string status
        string file_path
    }

    DOCUMENT_CHUNK {
        uuid id PK
        uuid tenant_id FK
        uuid document_id FK
        text text
        int chunk_index
        bool is_embedded
    }

    FAQ_ITEM {
        uuid id PK
        uuid tenant_id FK
        uuid knowledge_base_id FK
        text question
        text answer
        bool is_embedded
    }

    TENANT ||--o{ USER : has
    TENANT ||--o{ KNOWLEDGE_BASE : owns
    KNOWLEDGE_BASE ||--o{ DOCUMENT : contains
    KNOWLEDGE_BASE ||--o{ FAQ_ITEM : contains
    DOCUMENT ||--o{ DOCUMENT_CHUNK : split_into
```
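
On the Qdrant side, isolation comes down to attaching a payload filter to every search. The sketch below builds such a filter in Qdrant's REST filter format (`must` conditions with `key` / `match`). The payload field names `tenant_id` and `kb_id` follow the query-flow diagram; whether the project stores them under exactly those keys is an assumption, and `tenant_filter` is a hypothetical helper.

```python
import uuid


def tenant_filter(tenant_id, kb_id=None) -> dict:
    """Return a Qdrant-style filter that restricts a search to one tenant (and KB)."""

    def _match(key: str, value) -> dict:
        # uuid.UUID() rejects malformed IDs before they reach the vector store.
        return {"key": key, "match": {"value": str(uuid.UUID(str(value)))}}

    conditions = [_match("tenant_id", tenant_id)]
    if kb_id is not None:
        conditions.append(_match("kb_id", kb_id))
    return {"must": conditions}
```

Building the filter in one place, rather than in each view, makes it harder to issue a search that forgets the tenant condition.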

---

## 🛠 Tech Stack

### Backend

| Layer | Technology | Purpose |
|-------|-----------|---------|
| Runtime | Python 3.13 + **uv** | Fast dependency management |
| Framework | **Django 6** + **DRF 3.16** | Web framework + REST APIs |
| Auth | **simplejwt** + API Key | JWT tokens + agent access |
| Task Queue | **Celery 5** + **Redis** | Async ingestion pipeline |
| Vector DB | **Qdrant 1.17** | Cosine similarity search |
| Object Storage | **MinIO** | Raw file storage (S3-compatible) |
| Database | **PostgreSQL** (schema: `rag`) | Structured data |
| Chunking | **LlamaIndex** `SentenceSplitter` / `SemanticSplitter` | Token-aware text splitting |
| Embedding | **OpenAI** `text-embedding-3-small` / `3-large` | Per-KB configurable, 1536 / 3072-dim |
| LLM | **Claude Sonnet 4.6** / **GPT-4o** | Answer generation |
| API Docs | **drf-spectacular** | OpenAPI 3.0 / Swagger |
| Document Parsing | pypdf · python-docx · openpyxl · python-pptx · BeautifulSoup4 | Multi-format support |

### Frontend

| Technology | Purpose |
|-----------|---------|
| **React 19** + **TypeScript 5.9** | UI framework |
| **Vite 7** | Build tool + dev proxy |
| **Redux Toolkit** + **RTK Query** | State management + data fetching |
| **React Router 7** | Client-side routing |
| **Ant Design 6** | UI components (tables, modals, forms) |
| **Tailwind CSS 4** | Dark theme + layout + utilities |

---

## 🚀 Quick Start

### Prerequisites

```bash
# Start infrastructure services
docker run -d -p 6333:6333 qdrant/qdrant

# MinIO API on host port 19000; console published on 19001 so it is reachable from the host
docker run -d -p 19000:9000 -p 19001:9001 \
  -e MINIO_ROOT_USER=admin -e MINIO_ROOT_PASSWORD=admin123 \
  minio/minio server /data --console-address ":9001"

redis-server --requirepass yourpassword
```

### Backend

```bash
cd backend

# Install dependencies (uv auto-creates virtualenv)
uv sync

# Configure environment
cp .env.example .env
# Edit .env: set DB credentials, OPENAI_KEY, ANTHROPIC_API_KEY, REDIS_URL

# Database setup
uv run python src/manage.py migrate
uv run python src/manage.py init_qdrant   # Create Qdrant collection
uv run python src/manage.py createsuperuser

# Run server
uv run python src/manage.py runserver     # http://localhost:8000

# Run Celery worker (separate terminal)
cd src && uv run celery -A config.celery worker --loglevel=info
```

### Frontend

```bash
cd frontend
npm install
npm run dev     # http://localhost:5173
```

> **Dev proxy**: All `/api/*` requests from `:5173` are automatically forwarded to `:8000` by Vite — no CORS config needed.

---

## 📁 Project Structure

```
rag/
├── backend/
│   ├── .env                      # All configuration
│   ├── pyproject.toml            # uv dependencies
│   └── src/
│       ├── config/
│       │   ├── settings/         # base / development / production
│       │   ├── api_router.py     # /api/v1/ route registration
│       │   └── celery.py         # Celery app
│       ├── apps/
│       │   ├── common/           # Base models, MinIO client, pagination
│       │   ├── tenants/          # Tenant model + TenantMiddleware
│       │   ├── accounts/         # User auth (JWT + API Key)
│       │   ├── knowledge_bases/  # KB CRUD
│       │   ├── documents/        # Upload, URL import, chunk preview
│       │   ├── faq/              # FAQ management + bulk import
│       │   ├── ingestion/        # Celery task orchestration
│       │   ├── parsers/          # PDF/DOCX/XLSX/PPTX/HTML parsers
│       │   ├── chunking/         # LlamaIndex SentenceSplitter / SemanticSplitter
│       │   ├── embeddings/       # OpenAI embedding service
│       │   ├── vector_store/     # Qdrant client wrapper
│       │   ├── retrieval/        # Search + prompt builder + RAG answer
│       │   └── audit/            # Search & query logs
│       └── tests/                # 27 pytest tests
│
├── frontend/
│   ├── .env                      # VITE_API_URL, VITE_APP_NAME
│   └── src/
│       ├── components/Layout/    # Collapsible sidebar + sticky header
│       ├── pages/                # Login, Dashboard, KB, Docs, FAQ,
│       │                         # Retrieval, Jobs, Logs
│       ├── store/
│       │   ├── api/              # RTK Query endpoints (4 APIs)
│       │   └── slices/authSlice  # JWT token management
│       └── index.css             # Tailwind v4 + Ant Design dark theme
│
└── docs/
    └── README.md                 # Full technical documentation (CN)
```

---

## 🔌 API Overview

All endpoints listed below are prefixed with `/api/v1/`. Interactive docs are available at **`/api/schema/swagger-ui/`**.

### Authentication

```http
POST /auth/login/          # Returns access + refresh JWT
POST /auth/token/refresh/  # Refresh access token
GET  /auth/me/             # Current user info
```
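
As a hedged illustration of how a client might call these endpoints, the sketch below builds the login request and an authenticated follow-up request with the standard library. The base URL, the JSON field names `email` and `password`, and the `Bearer` header prefix are assumptions (the last one is simplejwt's default); check the Swagger docs for the actual schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # assumed local dev address


def login_request(email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /auth/login/ request."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/auth/login/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authed_request(path: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that carries the access token."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Against a running server, `urllib.request.urlopen(login_request(...))` would send the request; the access token from the response can then be passed to `authed_request("/auth/me/", token)`.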

### Tenant Settings

```http
GET   /tenants/settings/  # Get tenant-level defaults (embedding_model, llm_model)
PATCH /tenants/settings/  # Update tenant-level defaults
```

### Knowledge Bases

```http
GET    /knowledge-bases/               # List (paginated)
POST   /knowledge-bases/               # Create
PATCH  /knowledge-bases/{id}/          # Update settings
DELETE /knowledge-bases/{id}/          # Delete
POST   /knowledge-bases/{id}/rebuild/  # Async reindex with new embedding model
```

### Documents

```http
POST /knowledge-bases/{kbId}/documents/upload/       # Upload file (multipart)
POST /knowledge-bases/{kbId}/documents/import-url/   # Import URL
GET  /knowledge-bases/{kbId}/documents/{id}/chunks/  # Preview chunks
POST /knowledge-bases/{kbId}/documents/{id}/reindex/ # Re-trigger pipeline
```

### FAQ

```http
GET    /knowledge-bases/{kbId}/faq/               # List FAQ items
POST   /knowledge-bases/{kbId}/faq/               # Create (auto-embeds)
POST   /knowledge-bases/{kbId}/faq/bulk-import/   # Bulk import
PATCH  /knowledge-bases/{kbId}/faq/{id}/          # Update
DELETE /knowledge-bases/{kbId}/faq/{id}/          # Delete
```

### Retrieval & RAG

```http
POST /retrieval/search/  # Vector search: returns top-K chunks with scores
POST /rag/prompt/        # Build RAG prompt (no LLM call)
POST /rag/answer/        # Full RAG: retrieve + LLM → answer + sources
```

**Search request:**

```json
{
  "query": "What is RAG?",
  "knowledge_base_id": "uuid",
  "top_k": 5,
  "score_threshold": 0.0
}
```
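
To show how `top_k` and `score_threshold` interact, here is a small in-memory sketch of cosine-similarity search. It only illustrates the semantics; in the platform the search is delegated to Qdrant. The function names and the `(id, vector)` input shape are hypothetical, and treating the threshold as inclusive is a choice made for this sketch.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, points, top_k: int = 5, score_threshold: float = 0.0) -> list[dict]:
    """points: iterable of (id, vector). Returns the best matches, highest score first."""
    scored = [(cosine(query_vec, vec), pid) for pid, vec in points]
    # Drop weak matches first, then keep the top_k strongest.
    kept = [(score, pid) for score, pid in scored if score >= score_threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return [{"id": pid, "score": round(score, 3)} for score, pid in kept[:top_k]]
```

Raising `score_threshold` can return fewer than `top_k` results, which is usually preferable to padding the prompt with irrelevant chunks.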

**Answer request:**

```json
{
  "query": "What is RAG?",
  "knowledge_base_id": "uuid",
  "top_k": 5,
  "llm_model": "claude-sonnet-4-6"
}
```

**Answer response:**

```json
{
  "answer": "RAG (Retrieval-Augmented Generation) is...",
  "sources": [
    { "source": "intro.pdf · page 3", "score": 0.921 }
  ],
  "usage": { "prompt_tokens": 1240, "completion_tokens": 187, "total_tokens": 1427 },
  "latency_ms": 1267
}
```
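
A client can turn this response into a readable summary with a few lines of code. The sketch below relies only on the field names shown in the sample response; the function name `summarize_answer` and the output layout are hypothetical.

```python
import json


def summarize_answer(raw: str) -> str:
    """Format an answer response as plain text with cited sources."""
    data = json.loads(raw)
    lines = [data["answer"], "", "Sources:"]
    for src in data.get("sources", []):
        lines.append(f"- {src['source']} (score {src['score']:.2f})")
    usage = data.get("usage", {})
    lines.append(f"{usage.get('total_tokens', 0)} tokens, {data.get('latency_ms', 0)} ms")
    return "\n".join(lines)
```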

---

## ⚙️ Configuration

### Backend `.env`

```ini
# Django
DJANGO_SETTINGS_MODULE=config.settings.development
SECRET_KEY=your-secret-key

# PostgreSQL
DB_HOST=localhost
DB_PORT=5432
DB_DATABASE=demo
DB_USER=postgres
DB_PASSWORD=yourpassword
DB_SCHEMA=rag

# MinIO
MINIO_URL=localhost:19000
MINIO_USER=admin
MINIO_PASSWORD=admin123
MINIO_BUCKET=rag-documents

# Qdrant
VDB_HOST=localhost
VDB_PORT=6333
QDRANT_COLLECTION=document_chunks

# Redis / Celery
REDIS_URL=redis://:yourpassword@localhost:6379/0

# AI Keys
OPENAI_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...
DEFAULT_LLM_MODEL=claude-sonnet-4-6
```
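
Before starting the server it can help to check that the `.env` file is complete. The sketch below parses `KEY=value` lines and reports missing entries. Which keys are truly required is an assumption (`REQUIRED_KEYS` is a guess based on the sample above), and both helper names are hypothetical.

```python
# Assumed minimum set; adjust to what the settings module actually reads.
REQUIRED_KEYS = (
    "SECRET_KEY", "DB_HOST", "DB_DATABASE", "DB_USER", "DB_PASSWORD",
    "MINIO_URL", "VDB_HOST", "REDIS_URL", "OPENAI_KEY",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, `missing_keys(parse_env(open(".env").read()))` returns an empty list when every assumed key is set.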

### Frontend `.env`

```ini
VITE_API_URL=http://localhost:8000
VITE_APP_NAME=RAG Platform
```

---

## 🧪 Testing

```bash
cd backend

# Run all 27 tests
uv run pytest

# With reuse-db (faster on re-runs)
uv run pytest --reuse-db

# Specific file
uv run pytest src/tests/test_retrieval.py -v

# With coverage
uv run pytest --cov=apps --cov-report=html
```

**Test strategy:**

- External services (MinIO, Qdrant, LLMs, background tasks) are all **mocked**, so PostgreSQL is the only infrastructure the tests need

- Real PostgreSQL is used with a `test_` prefixed database

- `conftest.py` auto-creates the `rag` schema if missing

```

✓ test_auth.py            — login, token refresh, protected endpoints

✓ test_knowledge_bases.py — CRUD, tenant isolation

✓ test_documents.py       — upload, URL import, chunk preview

✓ test_faq.py             — create, list, delete, bulk import

✓ test_retrieval.py       — vector search, RAG answer, error handling

27 passed in 4.5s

```
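
The mocking strategy can be illustrated with `unittest.mock`. The sketch below is not taken from the project's test suite: the `answer` function and the `embed`, `search`, and `complete` method names are hypothetical stand-ins for the embedding service, the Qdrant wrapper, and the LLM client.

```python
from unittest.mock import Mock


def answer(query, embedder, vector_store, llm):
    """Hypothetical RAG entry point: embed, retrieve, then ask the LLM."""
    vector = embedder.embed(query)
    hits = vector_store.search(vector, top_k=5)
    context = "\n".join(hit["text"] for hit in hits)
    return llm.complete(f"Context:\n{context}\n\nQuestion: {query}")


def test_answer_uses_retrieved_context():
    # Every external dependency is replaced by a Mock, so no network is touched.
    embedder = Mock()
    embedder.embed.return_value = [0.1, 0.2]
    vector_store = Mock()
    vector_store.search.return_value = [{"text": "RAG combines retrieval and generation."}]
    llm = Mock()
    llm.complete.return_value = "stub answer"

    assert answer("What is RAG?", embedder, vector_store, llm) == "stub answer"

    # The retrieved text must reach the LLM prompt.
    vector_store.search.assert_called_once_with([0.1, 0.2], top_k=5)
    prompt = llm.complete.call_args.args[0]
    assert "RAG combines retrieval and generation." in prompt
```

Passing the dependencies in as arguments keeps the function easy to test; with module-level clients, `unittest.mock.patch` would play the same role.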

---

## 🐳 Docker

```bash
# Build images
docker build -t rag-backend  ./backend
docker build -t rag-frontend ./frontend

# Run backend (point to your infra)
docker run -d --env-file backend/.env -p 8000:8000 rag-backend

# Run frontend
docker run -d -p 80:80 rag-frontend
```

---

## 📖 Pages

| Page | Route | Description |
|------|-------|-------------|
| **Login** | `/login` | Email + password, JWT stored in localStorage |
| **Dashboard** | `/dashboard` | Stats overview, KB summary, quick actions |
| **Knowledge Bases** | `/knowledge-bases` | Create/edit KBs, configure chunk size & top-K, reindex with model switching |
| **Documents** | `/knowledge-bases/:id/documents` | Upload files, import URLs, preview chunks |
| **FAQ** | `/knowledge-bases/:id/faq` | Manage Q&A pairs, view embedding status |
| **Retrieval Test** | `/retrieval` | Test vector search and RAG answers interactively |
| **Jobs** | `/jobs` | Monitor ingestion pipeline progress per document |
| **Logs** | `/logs` | Retrieval logs and RAG query logs with latency |

---

## 📄 License

MIT License — see [LICENSE](LICENSE) for details.

---



Built with Django · React · Qdrant · OpenAI · Anthropic · Tailwind CSS