{"id":47932002,"url":"https://github.com/salimomrani/ragkit","last_synced_at":"2026-04-04T07:19:13.422Z","repository":{"id":339669422,"uuid":"1162910700","full_name":"salimomrani/ragkit","owner":"salimomrani","description":"RAG API + Angular UI — FastAPI · LangChain · ChromaDB · PostgreSQL · Ollama · Angular 21","archived":false,"fork":false,"pushed_at":"2026-03-19T21:44:02.000Z","size":649,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-21T02:55:16.496Z","etag":null,"topics":["angular","chromadb","fastapi","langchain","ollama","postgresql","python","rag"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/salimomrani.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-20T21:07:50.000Z","updated_at":"2026-03-19T21:57:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/salimomrani/ragkit","commit_stats":null,"previous_names":["salimomrani/palo-rag","salimomrani/ragkit"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/salimomrani/ragkit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/salimomrani%2Fragkit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/salimomrani%2Fragkit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/salimomrani%2Fragkit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/salimomrani%2Fragkit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/salimomrani","download_url":"https://codeload.github.com/salimomrani/ragkit/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/salimomrani%2Fragkit/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31391254,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T04:26:24.776Z","status":"ssl_error","status_checked_at":"2026-04-04T04:23:34.147Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["angular","chromadb","fastapi","langchain","ollama","postgresql","python","rag"],"created_at":"2026-04-04T07:19:12.771Z","updated_at":"2026-04-04T07:19:13.410Z","avatar_url":"https://github.com/salimomrani.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RagKit — Enterprise Knowledge Assistant\n\n[![CI](https://github.com/salimomrani/ragkit/actions/workflows/ci.yml/badge.svg)](https://github.com/salimomrani/ragkit/actions/workflows/ci.yml)\n\nRAG API + Angular UI for enterprise knowledge bases — answers questions from internal documents, refuses when confidence is too low.\n\n**Stack**: Python 3.12 · FastAPI · LangChain 0.3 · ChromaDB · PostgreSQL 16 · Ollama · Angular 21\n\n---\n\n## Overview\n\nThis project demonstrates a production-minded RAG assistant built around three engineering constraints: **hallucination control**, **traceability**, and **answer quality evaluation**.\n\nThe assistant is intentionally constrained — it only answers using retrieved documents. If the information is absent from the corpus, it refuses rather than hallucinating. Reliability over creativity.\n\n---\n\n## Architecture\n\n```\n┌─────────────────────────────┐\n│          Angular UI         │\n│  Chat · Ingest · Logs · Eval│\n└─────────────┬───────────────┘\n              │  HTTP / SSE\n              ▼\n┌─────────────────────────────┐\n│      FastAPI  /api/v1       │\n└─────────────┬───────────────┘\n              │\n              ▼\n┌─────────────────────────────┐          ┌────────────┐\n│         Guardrails          │ ── ✗ ──▶ │  rejected  │\n└─────────────┬───────────────┘          └────────────┘\n              │  ✓\n              ▼\n┌─────────────────────────────┐\n│      Embed  (Ollama)        │\n└─────────────┬───────────────┘\n              │\n              ▼\n┌─────────────────────────────┐\n│         ChromaDB            │ ──▶  top-k chunks\n└─────────────┬───────────────┘\n              │\n              ▼\n┌─────────────────────────────┐\n│       LLM  (Ollama)         │ ──▶  answer\n└─────────────┬───────────────┘\n              │\n              ▼\n┌─────────────────────────────┐\n│        PostgreSQL           │  (audit log)\n└─────────────────────────────┘\n```\n\n---\n\n## How It Works\n\n### 1. Ingestion\n- Upload a `.md` file via the UI or API (`/ingest`)\n- Backend splits text into chunks (500 chars, overlap 50)\n- Chunks are embedded via Ollama and stored in ChromaDB\n- Document metadata is persisted in PostgreSQL\n\n### 2. Query (RAG)\n- User submits a question (`/query` or `/query/stream`)\n- Guardrails validate input: length, injection patterns, offensive content\n- Top-4 chunks retrieved from ChromaDB by semantic similarity\n- If retrieval score \u003c `MIN_RETRIEVAL_SCORE` (default `0.3`), the system refuses\n- Otherwise, Ollama generates the answer grounded in retrieved context\n\n### 3. Traceability\n- Every query is logged in PostgreSQL: masked question, retrieved sources, confidence score, latency, guardrail status\n- Evaluation suite available at `/evaluation/run`\n\n---\n\n## Prerequisites\n\n- [Ollama](https://ollama.ai) running locally\n- Docker (for PostgreSQL)\n- Python 3.12, Node.js 22\n\n```bash\n# Pull required models once\nollama pull qwen2.5:7b\nollama pull mxbai-embed-large\n```\n\n---\n\n## Setup (3 steps)\n\n```bash\n# 1. Start PostgreSQL\ndocker-compose up -d\n\n# 2. Start backend\ncd backend\npython3.12 -m venv .venv \u0026\u0026 .venv/bin/pip install -r requirements.txt\ncp .env.example .env          # defaults: localhost:5444, palo/palo\n.venv/bin/python scripts/ingest_corpus.py   # load 16 corpus docs\n.venv/bin/uvicorn main:app --reload --port 8000\n\n# 3. Start frontend (new terminal)\ncd frontend\nnpm install\nnpm start\n```\n\nOpen **http://localhost:4200** · API docs: **http://localhost:8000/docs**\n\n### Runtime tuning (`backend/.env`)\n\n```bash\nLLM_TEMPERATURE=0.1           # [0.0–2.0]  lower = more deterministic\nTOP_K=4                       # [1–20]     chunks retrieved per query\nMIN_RETRIEVAL_SCORE=0.3       # [0.0–1.0]  below this = refusal\nLOW_CONFIDENCE_THRESHOLD=0.5  # [0.0–1.0]  above MIN but uncertain = flagged\nCHUNK_SIZE=500                # [100–2000] chars per chunk at ingestion\nCHUNK_OVERLAP=50              # [0–500]    overlap between chunks\nGUARDRAIL_MAX_LENGTH=500      # [50–5000]  max question length\nDEFAULT_LOGS_LIMIT=100        # [1–1000]   max entries from GET /logs\nCORS_ALLOW_ORIGINS=http://localhost:4200\n```\n\n---\n\n## API\n\nBase URL: `http://localhost:8000/api/v1`\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| `POST` | `/query` | Ask a question (blocking) |\n| `POST` | `/query/stream` | Ask a question (SSE streaming) |\n| `POST` | `/ingest` | Ingest a document `{text, name}` |\n| `GET` | `/documents` | List ingested documents |\n| `DELETE` | `/documents/{id}` | Delete a document |\n| `GET` | `/logs` | Audit log of all queries |\n| `POST` | `/evaluation/run` | Run quality evaluation |\n| `GET` | `/evaluation/report` | Get latest evaluation report |\n| `GET` | `/health` | Health check |\n\n### Example\n\n```bash\ncurl -X POST http://localhost:8000/api/v1/ingest \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"text\": \"Acme Corp est une entreprise fondée en 2009.\", \"name\": \"about.md\"}'\n\ncurl -X POST http://localhost:8000/api/v1/query \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"question\": \"Quand Acme Corp a-t-elle été fondée ?\"}'\n```\n\nSample log entry (`GET /api/v1/logs`):\n```json\n{\n  \"id\": \"24369911-0190-42bb-8b32-b069b192b3d3\",\n  \"timestamp\": \"2026-02-20T14:23:33.856095\",\n  \"question_masked\": \"Que dit smoke.md ?\",\n  \"retrieved_sources\": [\"faq-onboarding.md\", \"spec-webhooks.md\"],\n  \"similarity_scores\": [0.455, 0.441],\n  \"answer\": \"Je n'ai pas d'information sur ce sujet dans la base de connaissance.\",\n  \"faithfulness_score\": 0.443,\n  \"latency_ms\": 14263,\n  \"guardrail_triggered\": null,\n  \"rejected\": false\n}\n```\n\n---\n\n## Tests \u0026 Quality\n\n```bash\n# Backend tests (TDD — 48 tests)\ncd backend \u0026\u0026 .venv/bin/pytest tests/ -v\n\n# Backend lint (ruff)\ncd backend \u0026\u0026 ruff check .\n\n# Frontend tests (vitest)\ncd frontend \u0026\u0026 npm test\n\n# Frontend lint (ESLint / angular-eslint)\ncd frontend \u0026\u0026 npm run lint\n# Expected: 0 errors\n\n# Quality evaluation\ncurl -X POST http://localhost:8000/api/v1/evaluation/run\n# Report saved to reports/eval.md\n```\n\n### CI/CD (GitHub Actions)\n\nPush or PR on any branch triggers path-filtered jobs:\n\n| Changed path | Jobs triggered |\n|---|---|\n| `backend/**` | `backend-lint` (ruff) → `backend-test` (pytest + PostgreSQL) |\n| `frontend/**` | `frontend-lint` (ESLint) → `frontend-test` (vitest) |\n| Both | All four jobs in parallel |\n\nLint gates tests: tests only run when lint passes. No deployment.\n\n---\n\n## Security\n\nImplemented:\n- Input guardrails (length, prompt-injection patterns, offensive content)\n- PII masking in logs (email, phone)\n- CORS restricted to configured origins\n\nOut of scope (production):\n- Authentication / authorization on management endpoints\n- Rate limiting, secrets rotation, data retention policy\n\n---\n\n## Project Structure\n\n```\nPALO/\n├── .github/workflows/\n│   └── ci.yml           # GitHub Actions: path-filtered lint + test jobs\n├── backend/\n│   ├── api/v1/          # FastAPI routers (query, ingest, logs, evaluation)\n│   ├── rag/             # Pipeline, provider (Ollama), ingestion\n│   ├── guardrails/      # Input validation\n│   ├── logging_service/ # PII masking + audit log\n│   ├── quality/         # Reference dataset, runner, report generator\n│   ├── models/          # SQLAlchemy models\n│   ├── ruff.toml        # Linter config (E/F/I rules, Python 3.12)\n│   └── tests/           # 48 tests (TDD)\n├── frontend/\n│   └── src/app/\n│       ├── components/  # Chat, Ingest, Logs, Eval (Angular 21 signals)\n│       └── services/    # RagApiService\n├── corpus/              # 16 synthetic Markdown knowledge base docs\n├── reports/             # eval.md, costs.md\n└── docker-compose.yml   # PostgreSQL 16\n```\n\n---\n\n## Trade-offs \u0026 Decisions\n\nSee [DECISIONS.md](DECISIONS.md) — architectural decisions, known limitations, and production roadmap.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsalimomrani%2Fragkit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsalimomrani%2Fragkit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsalimomrani%2Fragkit/lists"}