{"id":30516105,"url":"https://github.com/shreyas2409/book-rag","last_synced_at":"2026-02-18T01:05:22.149Z","repository":{"id":309985353,"uuid":"1038285175","full_name":"Shreyas2409/book-rag","owner":"Shreyas2409","description":"Built a secure multi-modal retrieval system for document ingestion using Neo4j vector database and LangChain orchestration. Engineered ReAct reasoning prompts and LLM-as-Judge evaluation prompts to enhance answer faithfulness and grounding, driving a 70% increase in user engagement.","archived":false,"fork":false,"pushed_at":"2026-02-17T06:19:00.000Z","size":70,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-17T11:55:02.254Z","etag":null,"topics":["neo4j","observability","ollama","prompt-engineering","python3","rag-chatbot","streamlit","vector-database"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Shreyas2409.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-14T23:36:13.000Z","updated_at":"2026-02-17T06:35:46.000Z","dependencies_parsed_at":"2025-08-15T01:28:35.487Z","dependency_job_id":null,"html_url":"https://github.com/Shreyas2409/book-rag","commit_stats":null,"previous_names":["shreyas2409/book-rag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Shreyas2409/book-rag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shreyas2409%2Fbook-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shreyas2409%2Fbook-rag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shreyas2409%2Fbook-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shreyas2409%2Fbook-rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Shreyas2409","download_url":"https://codeload.github.com/Shreyas2409/book-rag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Shreyas2409%2Fbook-rag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29565018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T00:47:08.760Z","status":"ssl_error","status_checked_at":"2026-02-18T00:45:26.718Z","response_time":100,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["neo4j","observability","ollama","prompt-engineering","python3","rag-chatbot","streamlit","vector-database"],"created_at":"2025-08-26T09:29:00.216Z","updated_at":"2026-02-18T01:05:22.143Z","avatar_url":"https://github.com/Shreyas2409.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Chat with Your Book — v2 (Text + Images + Eval + ReAct)\n\nThis project allows you to **upload a PDF or textbook** and interact with it using both **text** and **images**. It retrieves relevant passages and diagrams using **multi-modal search** and processes everything locally with **Ollama**.\n\nThe agent uses the **ReAct (Reasoning + Acting)** paradigm — the LLM explicitly reasons step-by-step, decides which tools to call, observes the results, and iterates before producing a final grounded answer.\n\n**v2 adds**: ReAct agent, observability tracing, hierarchical chunking, conversation memory, and an LLM-as-Judge evaluation agent.\n\n---\n\n## Features\n\n### Core\n- **ReAct Agent** — step-by-step Thought/Action/Observation reasoning loop.\n- **Upload PDF/TXT** from the browser.\n- **Automatic ingestion**: hierarchical text chunking + CLIP image embeddings.\n- **Interactive chat** with multi-turn conversation memory.\n- **Multi-modal retrieval**: text + image search.\n- **Runs locally** using Ollama (llama3.1:8b).\n- **One-click Docker Compose deployment**.\n\n### ReAct Agent\n\nThe agent follows the ReAct paradigm (Yao et al., 2022) instead of simple tool-calling. Each query triggers an explicit reasoning loop:\n\n```\nQuestion: What is the structure of DNA?\n\nThought:  I need to find information about DNA structure in the book.\nAction:   book_text_retriever\nInput:    DNA structure double helix\nObservation: [Passage 1 | Page 42] DNA consists of two polynucleotide\n             strands that wind around each other to form a double helix...\n\nThought:  I have text but the user might benefit from a diagram reference.\nAction:   book_image_retriever\nInput:    DNA double helix structure diagram\nObservation: Image ID: img_042, Page: 43, Similarity: 0.82\n\nThought:  I now have both text and a visual reference. I can answer.\nFinal Answer: DNA has a double helix structure consisting of two\n              complementary strands... (see diagram on page 43)\n```\n\n**Why ReAct over simple tool-calling:**\n- **Transparent reasoning** — you can see WHY the agent chose each tool.\n- **Multi-step refinement** — if the first search is not enough, the agent reasons about it and tries a better query.\n- **Better grounding** — the explicit Observation step forces the model to base its answer on retrieved content.\n- **Debuggable** — the full reasoning chain is captured in the observability trace.\n\nConfigurable via `MAX_REACT_STEPS` (default: 6 loops per query).\n\n### Observability\n- **Structured JSON logging** across all services.\n- **Span-based tracing** — every request generates a trace with timing for each phase (retrieval, LLM, eval).\n- **Metrics dashboard** — avg/p95 latency, chunk counts, similarity scores.\n- **Span waterfall view** — visualise where time is spent per request.\n- Endpoints: `GET /traces`, `GET /traces/{id}`, `GET /metrics`.\n\n### Hierarchical Chunking\n- **Parent chunks** (2000 chars) + **child chunks** (500 chars).\n- **Semantic-aware splitting** — respects paragraph and section boundaries.\n- **Section header detection** — auto-detects chapters and section titles.\n- **Enriched metadata** — chunk_id, parent_id, section, chunk_type on every chunk.\n- **200-char overlap** to avoid mid-sentence cuts.\n\n### Context Window\n- **Sliding-window conversation memory** (configurable, default 10 turns).\n- **History summarisation** — older turns are compressed via LLM to save tokens.\n- **Parent-chunk expansion** — when a child chunk matches, the full parent is included for richer context.\n- **Context budget** — total prompt capped at 12,000 chars (configurable) so the model never overflows.\n\n### Eval Agent / Judge\n- **LLM-as-a-Judge** evaluates every response on four dimensions:\n  - **Faithfulness** — does the answer stick to the retrieved context?\n  - **Relevance** — does it address the user's question?\n  - **Completeness** — does it cover key information from context?\n  - **Hallucination-free** — does it avoid stating ungrounded facts?\n- Scores are **0.0 - 1.0**, displayed inline as colored badges.\n- **Eval Dashboard** — aggregate quality metrics across all queries.\n- **Manual evaluation** — paste any Q/A pair to test the judge.\n- Uses the same local Ollama model — **zero extra cost**.\n\n---\n\n## Architecture\n\n```\n                          ┌──────────────────────────────────────────────┐\n┌──────────────┐          │            Agent API (ReAct)                 │\n│  Streamlit   │──POST──▶ │                                              │\n│     UI       │  /chat   │   Question                                   │\n│              │          │      |                                       │\n│  - Chat      │          │      v                                       │\n│  - Metrics   │          │   Thought: \"I need to search the book...\"    │\n│  - Eval      │          │      |                                       │\n└──────────────┘          │      v                                       │\n                          │   Action: book_text_retriever ──────────┐    │\n                          │      |                                  |    │\n                          │   Observation: [passages...]   ◄────────┘    │\n                          │      |                                       │\n                          │      v                                       │\n                          │   Thought: \"I should check for diagrams...\"  │\n                          │      |                                       │\n                          │      v                                       │\n                          │   Action: book_image_retriever ─────────┐   │\n                          │      |                                  |   │\n                          │   Observation: [images...]     ◄────────┘   │\n                          │      |                                       │\n                          │      v                                       │\n                          │   Thought: \"I have enough info.\"             │\n                          │      |                                       │\n                          │      v                                       │\n                          │   Final Answer ──▶ Eval Judge ──▶ Response   │\n                          └──────────────────────────────────────────────┘\n                                  |                       |\n                          ┌───────▼───────┐       ┌───────▼────────┐\n                          │ mcp_vectordb  │       │ mcp_image_     │\n                          │ (text search) │       │ retriever      │\n                          └───────┬───────┘       │ (CLIP search)  │\n                                  |               └───────┬────────┘\n                          ┌───────▼───────────────────────▼────────┐\n                          │                Neo4j                    │\n                          │  - Text chunks (parent + child)        │\n                          │  - Image embeddings (CLIP)             │\n                          └────────────────────────────────────────┘\n                                          |\n                          ┌───────────────▼────────────────────────┐\n                          │           Ollama (local LLM)           │\n                          │  llama3.1:8b (generation + eval)       │\n                          └────────────────────────────────────────┘\n```\n\n---\n\n## Services Overview\n\n| Service             | Tech Stack                | Purpose                            |\n|---------------------|---------------------------|------------------------------------|\n| ui                  | Streamlit                 | Chat + Observability + Eval UI     |\n| agent_api           | FastAPI + LangChain ReAct | ReAct orchestration, tracing, eval |\n| mcp_vectordb        | FastAPI + Neo4j/Chroma    | Text retrieval with scores         |\n| mcp_image_retriever | FastAPI + CLIP + Neo4j    | Image retrieval                    |\n| neo4j               | Neo4j 5.20                | Vector + graph storage             |\n| ingest_book         | Python + LangChain + CLIP | Hierarchical chunking + indexing   |\n\n### Modules\n\n| Module              | Description                                          |\n|---------------------|------------------------------------------------------|\n| `observability.py`  | Structured logging, span tracing, metrics store      |\n| `chunking.py`       | Hierarchical parent-child chunking with metadata     |\n| `context_window.py` | Conversation memory, prompt builder, summarisation   |\n| `eval_agent.py`     | LLM-as-Judge evaluation (4 quality dimensions)       |\n\n---\n\n## Installation\n\n### 1. Prerequisites\n- Docker: https://docs.docker.com/get-docker/\n- Ollama installed locally:\n```bash\nollama pull llama3.1:8b\n```\n\n### 2. Clone the repo\n```bash\ngit clone https://github.com/Shreyas2409/book-rag.git\ncd book-rag\n```\n\n### 3. Create .env file\n```bash\ncp .env.example .env\n```\n\nKey settings in `.env`:\n```bash\nNEO4J_URI=neo4j://neo4j:7687\nNEO4J_USER=neo4j\nNEO4J_PASS=pass\nAGENT_API=http://localhost:7005\nEVAL_ENABLED=true              # enable/disable eval judge\nMAX_REACT_STEPS=6              # max reasoning loops per query\nMAX_HISTORY_TURNS=10           # conversation memory window\nCONTEXT_BUDGET_CHARS=12000     # max context chars in prompt\nLOG_LEVEL=INFO                 # DEBUG for verbose logging\n```\n\n### 4. Start all services\n```bash\ndocker compose -f docker.yaml up --build\n```\n\n---\n\n## Ingesting a Book\n\nUpload via the UI sidebar, or via terminal:\n```bash\ndocker compose -f docker.yaml run ingest_book python ingestion.py /books/mybook.pdf\n```\n\nThe ingestion pipeline will:\n1. Load the PDF/TXT\n2. Create **hierarchical chunks** (parent 2000 chars + child 500 chars)\n3. Detect **section headers** and enrich metadata\n4. Store everything in Neo4j (or Chroma fallback)\n5. Extract + embed images with CLIP\n\n---\n\n## Chatting\n\nVisit: **http://localhost:8501**\n\nThe chat supports:\n- Multi-turn conversations with memory\n- Inline eval scores (colored badges) on every response\n- ReAct step count showing how many reasoning loops the agent took\n- Sidebar tabs to switch between Chat, Observability, and Eval Dashboard\n\n---\n\n## Observability\n\nAccess via the **Observability** tab in the UI, or directly:\n- `GET http://localhost:7005/metrics` — aggregate stats\n- `GET http://localhost:7005/traces?n=20` — recent traces\n- `GET http://localhost:7005/traces/{trace_id}` — single trace detail\n\nEach trace includes:\n- ReAct reasoning chain (Thought/Action/Observation steps)\n- Span waterfall (retrieval, prompt, LLM, eval timings)\n- Chunks retrieved + similarity scores\n- Eval scores\n- Answer preview\n\n---\n\n## Eval Dashboard\n\nAccess via the **Eval Dashboard** tab. Features:\n- **Aggregate scores** across all queries (faithfulness, relevance, etc.)\n- **Per-query breakdown** with judge reasoning\n- **Manual evaluation** — paste any Q/A to test the judge independently\n\nAPI endpoint for programmatic evaluation:\n```bash\ncurl -X POST http://localhost:7005/eval \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"question\": \"What is DNA?\", \"context_chunks\": [\"DNA is ...\"], \"answer\": \"DNA is ...\"}'\n```\n\n---\n\n## API Reference\n\n| Method | Endpoint | Description |\n|--------|----------|-------------|\n| POST | `/chat` | Send a question, get a ReAct-generated answer with eval scores |\n| GET | `/traces` | List recent traces (query param: `n`) |\n| GET | `/traces/{id}` | Get a single trace with full reasoning chain |\n| GET | `/metrics` | Aggregate performance and quality metrics |\n| POST | `/eval` | Run the eval judge on a custom Q/A pair |\n| GET | `/health` | Service status, agent type, config |\n\n---\n\n## Known Issues\n- Neo4j plugin `graph-data-science` must be enabled for image cosine similarity.\n- Ensure Ollama is running before starting services.\n- First query may be slow as models load into memory.\n- Eval adds ~2-5s latency per query (set `EVAL_ENABLED=false` to disable).\n- ReAct occasionally produces malformed output; `handle_parsing_errors=True` recovers gracefully.\n\n---\n\n## License\nMIT License\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshreyas2409%2Fbook-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshreyas2409%2Fbook-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshreyas2409%2Fbook-rag/lists"}