{"id":48950282,"url":"https://github.com/albertodiazdurana/rag-document-assistant","last_synced_at":"2026-04-17T19:33:31.239Z","repository":{"id":333170104,"uuid":"1136447118","full_name":"albertodiazdurana/rag-document-assistant","owner":"albertodiazdurana","description":"Production-ready RAG system with multi-provider LLM support (OpenAI, Claude, Ollama), vector database integration, FastAPI backend, and MLflow evaluation. Features German language support and Streamlit UI.","archived":false,"fork":false,"pushed_at":"2026-03-21T15:36:02.000Z","size":107,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-22T05:59:31.018Z","etag":null,"topics":["anthropic","chromadb","document-qa","embeddings","fastapi","langchain","langgraph","llm","mlflow","nlp","ollama","openai","pinecone","python","rag","retrieval-augmented-generation","streamlit","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/albertodiazdurana.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-17T17:52:41.000Z","updated_at":"2026-03-21T15:36:03.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/albertodiazdurana/rag-document-assistant","commit_stats":null,"previous_names":["albertodiazdurana/rag-document-assistant"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/albertodiazdurana/rag-document-assistant","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albertodiazdurana%2Frag-document-assistant","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albertodiazdurana%2Frag-document-assistant/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albertodiazdurana%2Frag-document-assistant/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albertodiazdurana%2Frag-document-assistant/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/albertodiazdurana","download_url":"https://codeload.github.com/albertodiazdurana/rag-document-assistant/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albertodiazdurana%2Frag-document-assistant/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31943477,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-17T17:29:20.459Z","status":"ssl_error","status_checked_at":"2026-04-17T17:28:47.801Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anthropic","chromadb","document-qa","embeddings","fastapi","langchain","langgraph","llm","mlflow","nlp","ollama","openai","pinecone","python","rag","retrieval-augmented-generation","streamlit","vector-database"],"created_at":"2026-04-17T19:33:30.491Z","updated_at":"2026-04-17T19:33:31.230Z","avatar_url":"https://github.com/albertodiazdurana.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RAG Document Assistant\r\n\r\n[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)\r\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.109+-green.svg)](https://fastapi.tiangolo.com/)\r\n[![LangChain](https://img.shields.io/badge/LangChain-0.1+-orange.svg)](https://langchain.com/)\r\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\r\n\r\nProduction-ready RAG system with multi-provider LLM support (OpenAI, Anthropic, Ollama), ChromaDB vector database, FastAPI backend, and MLflow evaluation.\r\n\r\n## What This Does\r\n\r\nImagine you have hundreds of documents; manuals, reports, contracts. Instead of reading through all of them to find an answer, you simply ask a question in plain language: *\"What are the warranty terms for product X?\"* or *\"Summarize the key findings from last quarter.\"*\r\n\r\nThis application reads your documents, understands their content, and answers your questions accurately; citing exactly where it found the information. You can choose which AI provider (OpenAI, Anthropic, or a local Ollama model) answers your questions.\r\n\r\n## Features\r\n\r\n### Implemented (Sprint 1 \u0026 2)\r\n\r\n- **Multi-Provider LLM Support**: OpenAI, Anthropic, Ollama (local models)\r\n- **Vector Database**: ChromaDB with OpenAI embeddings\r\n- **Document Processing**: PDF, Markdown, TXT with configurable chunking\r\n- **RAG Chain**: Retrieval-augmented generation with conversation memory\r\n- **FastAPI Backend**: REST API with async document processing\r\n- **MLflow Evaluation**: Experiment tracking with faithfulness/relevance metrics\r\n- **Streamlit UI**: Interactive document upload and chat interface with source citations\r\n\r\n### Planned (Sprint 3: Experiments \u0026 Optimization)\r\n\r\n- **German Language Support**: Multilingual embeddings (HuggingFace e5) and prompts\r\n- **Hybrid Search**: Semantic + keyword (BM25) retrieval with fusion\r\n- **Performance \u0026 Robustness**: Caching, token tracking, adversarial testing\r\n- **Docker Deployment**: Containerized production setup\r\n\r\n## Quick Start\r\n\r\n### Prerequisites\r\n\r\n- Python 3.11+\r\n- OpenAI API key (or Anthropic/Ollama for alternative providers)\r\n\r\n### Installation\r\n\r\n```bash\r\n# Clone repository\r\ngit clone https://github.com/yourusername/rag-document-assistant.git\r\ncd rag-document-assistant\r\n\r\n# Create virtual environment\r\npython -m venv .venv\r\n\r\n# Activate (Windows)\r\n.venv\\Scripts\\activate\r\n\r\n# Activate (Linux/Mac)\r\nsource .venv/bin/activate\r\n\r\n# Install dependencies\r\npip install -e \".[dev]\"\r\n```\r\n\r\n### Environment Setup\r\n\r\n```bash\r\n# Copy example environment file\r\ncp .env.example .env\r\n\r\n# Edit .env with your API keys\r\n# Required: OPENAI_API_KEY (for embeddings)\r\n# Optional: ANTHROPIC_API_KEY (for Claude LLM)\r\n```\r\n\r\n### Run the Application\r\n\r\n```bash\r\n# Start FastAPI backend\r\nuvicorn src.api.main:app --reload\r\n\r\n# API docs available at http://localhost:8000/docs\r\n```\r\n\r\n## Project Structure\r\n\r\n```\r\nrag-document-assistant/\r\n├── src/\r\n│   ├── ingestion/          # Document loaders and chunking\r\n│   ├── vectorstore/        # Embeddings and vector database\r\n│   ├── retrieval/          # RAG chain with memory\r\n│   ├── llm/                # LLM providers and prompts\r\n│   ├── api/                # FastAPI backend\r\n│   └── evaluation/         # MLflow metrics\r\n├── app/\r\n│   └── streamlit_app.py    # Streamlit UI\r\n├── tests/                  # Unit tests (84 tests)\r\n├── data/\r\n│   ├── sample_docs/        # Example documents\r\n│   ├── eval/               # Evaluation test dataset\r\n│   └── experiments/        # Experiment test data\r\n└── docs/\r\n    ├── checkpoints/        # Sprint progress\r\n    ├── decisions/          # Architecture decisions (DEC-###)\r\n    ├── experiments/        # Capability experiments (EXP-###)\r\n    ├── plan/               # Project and sprint plans\r\n    └── dsm-validation-tracker.md  # DSM methodology feedback\r\n```\r\n\r\n## API Endpoints\r\n\r\n| Endpoint | Method | Description |\r\n|----------|--------|-------------|\r\n| `/health` | GET | Health check |\r\n| `/api/v1/ingest` | POST | Upload and index documents |\r\n| `/api/v1/query` | POST | Query the RAG system |\r\n| `/api/v1/models` | GET | List available LLM providers |\r\n| `/api/v1/documents/count` | GET | Get indexed document count |\r\n| `/api/v1/documents` | DELETE | Clear all documents |\r\n\r\n## Configuration\r\n\r\n### LLM Providers\r\n\r\n```python\r\n# OpenAI (default)\r\nLLM_PROVIDER=openai\r\nOPENAI_API_KEY=sk-...\r\n\r\n# Anthropic Claude\r\nLLM_PROVIDER=anthropic\r\nANTHROPIC_API_KEY=sk-ant-...\r\n\r\n# Ollama (local, no API key needed)\r\nLLM_PROVIDER=ollama\r\nOLLAMA_MODEL=llama3.2\r\n```\r\n\r\n### Vector Database\r\n\r\n```python\r\n# ChromaDB (default, local)\r\nVECTOR_DB=chroma\r\nCHROMA_PERSIST_DIR=./chroma_db\r\n```\r\n\r\n## Hardware Requirements\r\n\r\n**No GPU required.** This project uses API-based LLMs and embeddings:\r\n\r\n| Component | Compute Location | Local Requirements |\r\n|-----------|------------------|-------------------|\r\n| LLM Inference | OpenAI/Claude API | Internet connection |\r\n| Embeddings | OpenAI API | Internet connection |\r\n| Vector Search | Local (ChromaDB) | ~4GB RAM |\r\n| Ollama (optional) | Local CPU/GPU | 8GB+ RAM |\r\n\r\nWorks on Windows, macOS, and Linux.\r\n\r\n## Development\r\n\r\n```bash\r\n# Run tests\r\npytest tests/ -v --cov=src\r\n\r\n# Format code\r\nblack src/ tests/\r\nruff check src/ tests/\r\n\r\n# Type checking\r\nmypy src/\r\n```\r\n\r\n## Evaluation with MLflow\r\n\r\n```bash\r\n# Start MLflow UI\r\nmlflow ui\r\n\r\n# Run evaluation pipeline\r\npython -m src.evaluation.runner\r\n```\r\n\r\nTracked metrics:\r\n- **Faithfulness**: Answer grounded in retrieved context\r\n- **Relevance**: Retrieved chunks match query intent\r\n- **Latency**: End-to-end response time\r\n\r\n## Tech Stack\r\n\r\n| Category | Technologies |\r\n|----------|--------------|\r\n| **LLM Orchestration** | LangChain |\r\n| **LLM Providers** | OpenAI, Anthropic, Ollama |\r\n| **Vector Database** | ChromaDB |\r\n| **Embeddings** | OpenAI text-embedding-ada-002 |\r\n| **Backend** | FastAPI, Pydantic, uvicorn |\r\n| **Evaluation** | MLflow |\r\n| **Testing** | pytest, pytest-cov (84 tests, 73% coverage) |\r\n\r\n## Experiments\r\n\r\nBuilt using [Take AI Bite](https://github.com/albertodiazdurana/take-ai-bite), a framework for human-AI collaboration. Its engine, the Deliberate Systematic Methodology (DSM), provides structured guidance for experiment-driven development to validate features systematically:\r\n\r\n| Experiment | Focus | Status | Documentation |\r\n|------------|-------|--------|---------------|\r\n| EXP-001 | Multi-source conflict detection | Complete | [View](docs/experiments/EXP-001_multi-source-detection.md) |\r\n| EXP-002 | Cross-lingual retrieval | Planned | Sprint 3, Day 7 |\r\n| EXP-003 | Retrieval strategy comparison | Planned | Sprint 3, Day 8 |\r\n| EXP-004 | Performance \u0026 robustness | Planned | Sprint 3, Day 9 |\r\n| EXP-005 | End-to-end validation | Planned | Sprint 3, Day 10 |\r\n\r\nEach experiment follows the [DSM C.1.3 Capability Experiment Template](https://github.com/albertodiazdurana/take-ai-bite) with combined quantitative (RAGAS, RAGBench) and qualitative evaluation.\r\n\r\n**DSM Validation:** This project also validates the DSM methodology itself. Feedback is tracked in [dsm-validation-tracker.md](docs/dsm-validation-tracker.md).\r\n\r\n## Known Limitations\r\n\r\nLimitations are tracked per [DSM C.1.5 Limitation Discovery Protocol](https://github.com/albertodiazdurana/take-ai-bite). Current limitations from [EXP-001](docs/experiments/EXP-001_multi-source-detection.md):\r\n\r\n| Limitation | Severity | Disposition | Workaround |\r\n|------------|----------|-------------|------------|\r\n| Simple queries may only cite one source | Medium | Accept MVP | Ask \"What do all documents say about X?\" |\r\n| No automatic version/date awareness | Low | Defer | Name files with dates (e.g., `policy_2024.md`) |\r\n| Documents persist until manually cleared | Low | Accept MVP | Use \"Clear All Documents\" button in UI |\r\n| Relies on LLM reasoning for conflict detection | Medium | Defer | Ask explicitly about differences between sources |\r\n\r\n**Getting Better Results:**\r\n- Ask \"Compare sources on X\" to see differences\r\n- Ask \"Are there different answers for X?\" for comprehensive coverage\r\n- Use specific questions for precise answers\r\n\r\n## Roadmap\r\n\r\n### Sprint 1: Core RAG System (Complete)\r\n- [x] Project setup and document ingestion pipeline\r\n- [x] Vector database integration (ChromaDB)\r\n- [x] RAG chain with multi-provider LLM support\r\n- [x] FastAPI backend with REST API\r\n- [x] MLflow evaluation framework\r\n- [x] 84 tests, 73% coverage\r\n\r\n### Sprint 2: User Interface (Complete)\r\n- [x] Streamlit UI with HTTP backend communication\r\n- [x] Multi-file document upload\r\n- [x] Chat interface with source citations\r\n- [x] EXP-001: Multi-source conflict detection experiment\r\n\r\n### Sprint 3: Experiments \u0026 Optimization (In Progress)\r\n\r\n*Experiment-driven development following [DSM v1.3.1](https://github.com/albertodiazdurana/take-ai-bite) methodology.*\r\n\r\n| Day | Feature | Experiment |\r\n|-----|---------|------------|\r\n| 7 | German Language Support | EXP-002: Cross-lingual retrieval |\r\n| 8 | Hybrid Search (BM25 + semantic) | EXP-003: Retrieval strategy comparison |\r\n| 9 | Performance \u0026 Robustness | EXP-004: Latency \u0026 adversarial testing |\r\n| 10 | Docker Deployment | EXP-005: End-to-end validation |\r\n\r\nSee [Sprint 3 Plan](docs/plan/sprint-3-plan.md) for details.\r\n\r\n## License\r\n\r\nMIT License\r\n\r\n## Author\r\n\r\nAlberto Diaz Durana\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbertodiazdurana%2Frag-document-assistant","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falbertodiazdurana%2Frag-document-assistant","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbertodiazdurana%2Frag-document-assistant/lists"}