https://github.com/gommezen/rag_writing_assistant

A transparent, fully local RAG document assistant that grounds every claim in your documents with interactive citations, coverage tracking, and confidence indicators.

claude document-analysis document-analysis-tool faiss fastapi local-llm ollama rag react typescript



Host: GitHub
URL: https://github.com/gommezen/rag_writing_assistant
Owner: gommezen
Created: 2026-01-22T15:58:53.000Z (about 1 month ago)
Default Branch: main
Last Pushed: 2026-02-10T02:44:10.000Z (18 days ago)
Last Synced: 2026-02-10T05:54:04.418Z (18 days ago)
Topics: claude, document-analysis, document-analysis-tool, faiss, fastapi, local-llm, ollama, rag, react, typescript
Language: Python
Homepage:
Size: 925 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md


# RAG Document Intelligence

![Python](https://img.shields.io/badge/Python-3.11+-blue)

![React](https://img.shields.io/badge/React-18-61DAFB)

![TypeScript](https://img.shields.io/badge/TypeScript-strict-3178C6)

![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-009688)

![License](https://img.shields.io/badge/License-MIT-green)

![Ollama](https://img.shields.io/badge/LLM-Ollama_(local)-black)

A transparent AI writing assistant that grounds every claim in your documents. Hover any citation to preview the source, click to verify the full context — so you always know exactly where the AI got its information.

![RAG Document Intelligence Screenshot](docs/screenshot.png)

## Features

- **Document upload** - PDF, DOCX, and TXT with drag-and-drop (non-blocking processing)

- **Grounded generation** - All AI content derived from your uploaded documents

- **Interactive citations** - Hover to preview source, click to navigate to full context

- **Confidence indicators** - Visual cues for high/medium/low confidence content

- **Section-level editing** - Regenerate or manually edit individual sections

- **Coverage transparency** - See what % of your documents were analyzed

- **Intent detection** - Auto-selects retrieval strategy per query type (analysis/QA/writing)

- **Chat mode** - Multi-turn conversations with persistent history

- **Dark mode** - Toggle between light and dark themes

- **Keyboard shortcuts** - Ctrl+Enter to generate, Escape to close modals

- **Fully local** - Documents never leave your machine (runs on Ollama)

## Requirements

- Python 3.11+

- Node.js 18+

- [Ollama](https://ollama.ai/) running locally

## Installation

### 1. Install Ollama and pull models

```bash
ollama pull qwen2.5:7b-instruct-q4_0
ollama pull mxbai-embed-large
```

### 2. Set up the backend

```bash
cd backend
pip install -r requirements.txt
```

### 3. Set up the frontend

```bash
cd frontend
npm install
```

## Running the App

Start both services in separate terminals:

```bash
# Terminal 1 - Backend
cd backend
python -m uvicorn app.main:app --port 8001 --reload

# Terminal 2 - Frontend
cd frontend
npm run dev
```

Open http://localhost:5173 in your browser.

## Configuration

Create `backend/.env` to customize (all optional):

```env
# LLM Models (must be available in Ollama)
GENERATION_MODEL=qwen2.5:7b-instruct-q4_0
EMBEDDING_MODEL=mxbai-embed-large

# Intent-specific models (falls back to GENERATION_MODEL)
ANALYSIS_MODEL=llama3.1:8b-instruct-q8_0
WRITING_MODEL=qwen2.5:7b-instruct-q4_0
QA_MODEL=gemma3:4b

# Retrieval settings
SIMILARITY_THRESHOLD=0.35
TOP_K=10

# Coverage settings (for analysis/summary mode)
DEFAULT_COVERAGE_PCT=35
MAX_COVERAGE_PCT=60

# Ollama connection
OLLAMA_BASE_URL=http://localhost:11434
```
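The fallback noted in the comments above (intent-specific models default to `GENERATION_MODEL`) could be resolved roughly as follows. This is a minimal sketch, not the project's `config.py`: the function name `resolve_model`, the `INTENT_ENV_VARS` mapping, and the hard-coded default are assumptions made for illustration.

```python
import os

# Assumed default, taken from the example .env above.
DEFAULT_GENERATION_MODEL = "qwen2.5:7b-instruct-q4_0"

# Hypothetical mapping from intent name to its override variable.
INTENT_ENV_VARS = {
    "analysis": "ANALYSIS_MODEL",
    "writing": "WRITING_MODEL",
    "qa": "QA_MODEL",
}


def resolve_model(intent: str, env=None) -> str:
    """Return the model for an intent, falling back to GENERATION_MODEL."""
    env = os.environ if env is None else env
    fallback = env.get("GENERATION_MODEL", DEFAULT_GENERATION_MODEL)
    var = INTENT_ENV_VARS.get(intent)
    if var is None:
        return fallback
    # An unset or empty variable falls back to the general model.
    return env.get(var) or fallback
```

Passing a plain dictionary as `env` keeps the lookup easy to test without touching the real environment.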

## How It Works

### Intent Detection

The system classifies queries and adjusts retrieval strategy automatically:

| Query Example | Intent | Retrieval | Coverage |
|---|---|---|---|
| "Summarize this document" | Analysis | Diverse (regions) | ~35% |
| "What is data feminism?" | Q&A | Similarity (top-k) | ~8-10% |
| "Write a report on X" | Writing | Similarity (top-k) | ~8-10% |
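To illustrate the mapping in the table, here is a minimal keyword-based sketch. The README does not say how classification is actually done, so the keyword lists, the `Strategy` type, and the `detect_intent` function are all hypothetical; a real classifier might rely on an LLM or embeddings instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    intent: str      # "analysis", "qa", or "writing"
    retrieval: str   # "diverse" or "similarity"


# Hypothetical keyword lists, chosen only for this example.
ANALYSIS_WORDS = ("summarize", "summarise", "overview", "analyze", "analyse")
WRITING_WORDS = ("write", "draft", "compose")


def detect_intent(query: str) -> Strategy:
    """Pick a retrieval strategy from simple keyword matches."""
    text = query.lower()
    if any(word in text for word in ANALYSIS_WORDS):
        return Strategy("analysis", "diverse")
    if any(word in text for word in WRITING_WORDS):
        return Strategy("writing", "similarity")
    # Anything else is treated as a question.
    return Strategy("qa", "similarity")
```

With these rules, the three example queries from the table map to analysis, Q&A, and writing respectively.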

### Coverage Tracking

The UI shows how much of your documents the system actually read:

- Coverage percentage with color indicator (green/yellow/red)

- Chunks analyzed vs total

- "Expand to ~50%" button for deeper analysis

A summary based on 8% of a document differs from one based on 35%, and the system makes this difference visible.
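The coverage figure and its color can be derived from the two chunk counts. The sketch below assumes coverage is simply chunks analyzed divided by total chunks; the color thresholds (`green_at`, `yellow_at`) are placeholder values, since the README does not state the real cut-offs.

```python
def coverage_pct(chunks_analyzed: int, chunks_total: int) -> float:
    """Percentage of chunks that were read, rounded to one decimal place."""
    if chunks_total <= 0:
        return 0.0
    return round(100.0 * chunks_analyzed / chunks_total, 1)


def coverage_color(pct: float, green_at: float = 30.0, yellow_at: float = 10.0) -> str:
    """Map a coverage percentage to an indicator color (thresholds are assumed)."""
    if pct >= green_at:
        return "green"
    if pct >= yellow_at:
        return "yellow"
    return "red"
```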

### Confidence Levels

| Level | Criteria |
|---|---|
| High | 3+ citations |
| Medium | 1-2 citations |
| Low | Hedging language detected |
| Unknown | 0 citations |
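The table does not say which rule wins when several apply, for example hedged text that carries three citations. The sketch below assumes one possible ordering: zero citations first, then hedging, then citation count. The hedging phrase list and the `confidence_level` function are hypothetical.

```python
import re

# Hypothetical hedging phrases; the project's actual list is not documented here.
HEDGE_PATTERN = re.compile(
    r"\b(might|may|possibly|perhaps|it seems|unclear)\b", re.IGNORECASE
)


def confidence_level(text: str, citation_count: int) -> str:
    """Classify a section using the criteria table, in an assumed precedence."""
    if citation_count == 0:
        return "unknown"
    if HEDGE_PATTERN.search(text):
        return "low"
    if citation_count >= 3:
        return "high"
    return "medium"
```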

### Blind Spot Detection

The system reports what it did not see: documents with no coverage, and regions (intro/middle/conclusion) that were not sampled.
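One way to compute such a report is to map each retrieved chunk to a region and list the regions that stay empty. The sketch below assumes regions are equal thirds of a document by chunk position; the README does not define the boundaries, and the `region_of` and `blind_spots` helpers are hypothetical.

```python
REGIONS = ("intro", "middle", "conclusion")


def region_of(index: int, total: int) -> str:
    """Map a chunk position to a region, splitting the document into thirds."""
    position = index / total
    if position < 1 / 3:
        return "intro"
    if position < 2 / 3:
        return "middle"
    return "conclusion"


def blind_spots(totals: dict[str, int], sampled: dict[str, set[int]]) -> dict[str, list[str]]:
    """Return, per document, the regions in which no chunk was sampled.

    totals maps a document id to its chunk count; sampled maps a document id
    to the indices of the chunks that were retrieved.
    """
    report = {}
    for doc_id, total in totals.items():
        indices = sampled.get(doc_id, set())
        # Ignore indices that fall outside the document.
        seen = {region_of(i, total) for i in indices if 0 <= i < total}
        missing = [region for region in REGIONS if region not in seen]
        if missing:
            report[doc_id] = missing
    return report
```

A document that lists all three regions had no coverage at all.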

### No Learning From Your Data

RAG does **not** train on your documents. Files are chunked and indexed for retrieval only — the AI model is never modified. Deleting a document removes it completely.

## Chat Mode

Switch to Chat mode for multi-turn conversations about your documents:

- Follow-up questions that build on prior context

- Conversation history in the sidebar (browse, resume, delete)

- Per-message sources showing which chunks were used

- Cumulative coverage tracking across a conversation

- Stored locally as JSON files, never sent externally
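Cumulative coverage and local JSON storage from the list above could look like the following sketch. The message field `source_chunk_ids`, the conversation `id` key, and the one-file-per-conversation layout are assumptions; the project's actual schema is not shown in this README.

```python
import json
from pathlib import Path


def cumulative_coverage(messages: list[dict], total_chunks: int) -> float:
    """Percentage of distinct chunks used across all messages so far."""
    used = set()
    for message in messages:
        # A chunk cited in several messages is counted once.
        used.update(message.get("source_chunk_ids", []))
    if total_chunks <= 0:
        return 0.0
    return round(100.0 * len(used) / total_chunks, 1)


def save_conversation(directory: Path, conversation: dict) -> Path:
    """Write a conversation to <directory>/<id>.json and return the path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{conversation['id']}.json"
    path.write_text(json.dumps(conversation, indent=2), encoding="utf-8")
    return path
```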

## Project Structure

```
rag_writing_assistant/
├── backend/
│   ├── app/
│   │   ├── api/routes/          # FastAPI endpoints
│   │   ├── core/                # Exceptions, logging
│   │   ├── models/              # Pydantic models
│   │   ├── rag/                 # Chunking, embeddings, vector store, prompts
│   │   ├── services/            # Business logic (generation, retrieval, intent)
│   │   ├── config.py            # Settings
│   │   └── main.py              # FastAPI app entry point
│   ├── tests/                   # Pytest test suite
│   ├── data/                    # Document storage, vector indices, conversations
│   ├── requirements.txt
│   └── pyproject.toml
├── frontend/
│   ├── src/
│   │   ├── api/                 # API client
│   │   ├── components/          # React components
│   │   ├── hooks/               # React Query hooks
│   │   ├── types/               # TypeScript interfaces
│   │   └── test/                # Test utilities
│   ├── package.json
│   ├── tsconfig.json
│   └── vite.config.ts
├── docs/                        # Screenshots
├── CLAUDE.md                    # AI agent instructions
├── CHANGELOG.md
└── README.md
```

## API Reference

| Endpoint | Method | Description |
|---|---|---|
| `/api/documents` | POST | Upload document |
| `/api/documents` | GET | List documents |
| `/api/documents/{id}` | GET | Get document status |
| `/api/documents/{id}` | DELETE | Delete document |
| `/api/generate` | POST | Generate draft |
| `/api/generate/section` | POST | Regenerate section |
| `/api/generate/suggestions` | POST | Generate suggested questions |
| `/api/chat` | POST | Send chat message |
| `/api/chat` | GET | List conversations |
| `/api/chat/{id}` | GET | Get conversation |
| `/api/chat/{id}` | DELETE | Delete conversation |
| `/api/chat/{id}` | PATCH | Update conversation title |
| `/api/health` | GET | Health check |

Document uploads are non-blocking — the endpoint returns immediately with `status: "pending"`, then progresses through `processing` to `ready` (or `failed`).
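Because uploads return before processing finishes, a client has to poll the document status. The sketch below shows one way to do that using only the standard library. The helper names are hypothetical, and it assumes that `GET /api/documents/{id}` returns JSON with a top-level `status` field, which this README implies but does not spell out.

```python
import json
import time
import urllib.request
from typing import Callable


def http_status_fetcher(base_url: str, document_id: str) -> Callable[[], str]:
    """Build a callable that reads the document status over HTTP."""
    def fetch() -> str:
        url = f"{base_url}/api/documents/{document_id}"
        with urllib.request.urlopen(url) as response:
            # Assumed response shape: {"status": "pending" | "processing" | ...}
            return json.load(response)["status"]
    return fetch


def wait_until_ready(
    fetch_status: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status is "ready" or "failed", or raise on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in ("ready", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

With the backend started on port 8001 as described in Running the App, a call such as `wait_until_ready(http_status_fetcher("http://localhost:8001", doc_id))` would block until processing ends, where `doc_id` is whatever identifier the upload response provides.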

## Development

### Running Tests

```bash
# Backend (pytest)
cd backend && pytest tests/ -v

# Frontend (vitest)
cd frontend && npm run test

# Frontend type check + build
cd frontend && npm run build
```

### Tech Stack

| Layer | Technology |
|---|---|
| Backend | Python 3.11+, FastAPI, Pydantic |
| Frontend | React 18, TypeScript (strict), React Query |
| Vector DB | FAISS (local, file-based) |
| LLM | Ollama (local inference) |
| Embeddings | mxbai-embed-large |

## License

MIT
