{"id":29792268,"url":"https://github.com/armanjscript/multi-hop-rag-chatbot","last_synced_at":"2025-08-20T02:10:22.482Z","repository":{"id":305319792,"uuid":"1022565962","full_name":"armanjscript/Multi-Hop-RAG-Chatbot","owner":"armanjscript","description":"A powerful Streamlit-Based web application that allows users to upload PDF documents and interact with an AI chatbot powered by a multi-hop retrieval-augmented generation (RAG) technique.","archived":false,"fork":false,"pushed_at":"2025-07-19T11:00:10.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-28T01:09:27.514Z","etag":null,"topics":["chromadb","langchain","langchain-chroma","langchain-ollama","pypdf","pypdfloader","rag","rag-chatbot","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/armanjscript.png","metadata":{"files":{"readme":"README.markdown","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-19T10:54:17.000Z","updated_at":"2025-07-19T11:02:11.000Z","dependencies_parsed_at":"2025-07-19T16:03:43.745Z","dependency_job_id":"d2046c46-2788-4c26-9afa-525a32ef41b0","html_url":"https://github.com/armanjscript/Multi-Hop-RAG-Chatbot","commit_stats":null,"previous_names":["armanjscript/multi-hop-rag-chatbot"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/armanjscript/Multi-Hop-RAG-Chatbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/armanjscript%2FMulti-Hop-RAG-Chatbot","tags_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/armanjscript%2FMulti-Hop-RAG-Chatbot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/armanjscript%2FMulti-Hop-RAG-Chatbot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/armanjscript%2FMulti-Hop-RAG-Chatbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/armanjscript","download_url":"https://codeload.github.com/armanjscript/Multi-Hop-RAG-Chatbot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/armanjscript%2FMulti-Hop-RAG-Chatbot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271252993,"owners_count":24726918,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-20T02:00:09.606Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chromadb","langchain","langchain-chroma","langchain-ollama","pypdf","pypdfloader","rag","rag-chatbot","streamlit"],"created_at":"2025-07-28T01:06:08.373Z","updated_at":"2025-08-20T02:10:22.456Z","avatar_url":"https://github.com/armanjscript.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Multi-Hop RAG Chatbot\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Python 
3.8+](https://img.shields.io/badge/Python-3.8%2B-blue.svg)](https://www.python.org/downloads/)\n\nA powerful Streamlit-based web application that allows users to upload PDF documents and interact with an AI chatbot powered by a multi-hop retrieval-augmented generation (RAG) technique. Designed for researchers, students, and knowledge enthusiasts, this chatbot processes complex queries by breaking them into sub-questions, retrieving relevant document chunks, and generating detailed, conversational answers.\n\n## Features\n\n- **PDF Document Upload**: Upload multiple PDF files for processing and indexing.\n- **Multi-Hop RAG**: Breaks down complex queries into sub-questions for precise information retrieval.\n- **Conversational AI**: Delivers detailed, conversational answers based on document content.\n- **Interactive Interface**: Built with Streamlit for a seamless and intuitive user experience.\n- **Document Management**: Easily clear uploaded documents to start fresh.\n- **Retrieval Details**: Displays sub-questions and retrieved document snippets for transparency.\n\n## Technologies Used\n\n| Technology            | Description                                                  |\n|-----------------------|--------------------------------------------------------------|\n| **Python 3.8+**       | Core programming language for the application.                |\n| **Streamlit**         | Creates the interactive web-based user interface.             |\n| **LangChain**         | Framework for building applications with language models.     |\n| **LangChain-Ollama**  | Integrates Ollama for text generation and embeddings.         |\n| **Chroma**            | Vector database for storing and retrieving document chunks.   |\n| **PyPDFLoader**       | Loads and parses PDF documents.                               |\n| **RecursiveCharacterTextSplitter** | Splits documents into manageable chunks for indexing. 
|\n\n## Prerequisites\n\n- **Python 3.8 or later**: Download from [python.org](https://www.python.org/downloads/).\n- **Ollama**: Local AI server for running language models. Install from [ollama.com](https://ollama.com).\n- **Internet Access**: Required for downloading Ollama models and Python packages; the app itself is served locally.\n\n## Installation\n\n1. **Install Python**:\n   - Ensure Python 3.8 or later is installed. Verify with:\n     ```bash\n     python --version\n     ```\n\n2. **Install Ollama**:\n   - Follow instructions at [ollama.com](https://ollama.com).\n   - Pull the required models:\n     ```bash\n     ollama pull qwen2.5:latest\n     ollama pull nomic-embed-text:latest\n     ```\n   - Start the Ollama server:\n     ```bash\n     ollama serve\n     ```\n\n3. **Clone the Repository**:\n   ```bash\n   git clone https://github.com/armanjscript/Multi-Hop-RAG-Chatbot.git\n   cd Multi-Hop-RAG-Chatbot\n   ```\n\n4. **Install Python Libraries**:\n   ```bash\n   pip install streamlit langchain langchain-ollama langchain-chroma langchain-community pypdf\n   ```\n\n## Usage\n\n1. **Run the Application**:\n   - Ensure the Ollama server is running:\n     ```bash\n     ollama serve\n     ```\n   - Start the Streamlit app:\n     ```bash\n     streamlit run multi_hop_rag.py\n     ```\n   - Open your browser and navigate to `http://localhost:8501`.\n\n2. **Upload Documents**:\n   - In the sidebar, use the file uploader to select PDF files.\n   - Click \"Process Documents\" to load and index the documents into the vector store.\n\n3. **Chat with the AI**:\n   - Enter a question in the chat input box (e.g., \"What is the capital of France and its population?\").\n   - The chatbot will break the question into sub-questions, retrieve relevant document chunks, and generate a detailed answer.\n   - View retrieval details (sub-questions and document snippets) in the expandable section.\n\n4. 
**Clear Documents**:\n   - Click \"Clear Documents\" in the sidebar to remove all uploaded files and reset the vector store.\n\n## Multi-Hop RAG Technique\n\nThe multi-hop RAG technique enhances the chatbot’s ability to handle complex queries by breaking them into smaller, manageable sub-questions. Here’s how it works:\n\n1. **Question Decomposition**:\n   - The user’s query is analyzed by the `qwen2.5:latest` model to generate 2-3 sub-questions.\n   - Example: For \"What is the capital of France and its population?\", sub-questions might be:\n     - \"What is the capital of France?\"\n     - \"What is the population of Paris?\"\n\n2. **Document Retrieval**:\n   - For each sub-question, the Chroma vector store retrieves the top 3 relevant document chunks using the `nomic-embed-text:latest` embeddings.\n   - Duplicates are removed to ensure efficiency.\n\n3. **Answer Generation**:\n   - The retrieved documents are combined and passed to the `qwen2.5:latest` model along with the original question.\n   - The model generates a detailed, conversational answer based on the context.\n\n### Diagram of the Multi-Hop RAG Technique\n\n```\n[User Question] --\u003e [Generate Sub-Questions] --\u003e [Retrieve Docs for Each Sub-Question] --\u003e [Combine Unique Docs] --\u003e [Generate Answer]\n```\n\n**Example Workflow**:\n- **Input**: \"What is the capital of France and its population?\"\n- **Sub-Questions**: \n  - \"What is the capital of France?\"\n  - \"What is the population of Paris?\"\n- **Retrieval**: Fetch relevant document chunks for each sub-question.\n- **Combination**: Merge unique document chunks.\n- **Output**: Generate a response like: \"The capital of France is Paris, with a population of approximately 2.2 million.\"\n\n## Limitations\n\n- **Document Quality**: Answer accuracy depends on the quality and relevance of uploaded PDFs.\n- **Model Performance**: The effectiveness of sub-question generation and answer quality relies on the `qwen2.5:latest` model.\n- 
**File Handling**: Uploaded PDFs are saved locally and deleted when cleared, which may affect other files with the same name.\n- **Processing Time**: Large PDFs or complex queries may take longer to process.\n\n## Contributing\n\nContributions are welcome! To contribute:\n- Fork the repository.\n- Make changes, ensuring they align with the project’s coding style.\n- Submit a pull request with a clear description of your changes.\n- Include tests to maintain quality.\n\n## License\n\nThis project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.\n\n## Disclaimer\n\nThis tool is for informational purposes only. Answers depend on the quality of uploaded documents and model performance. Always verify critical information with reliable sources.\n\n## Contact\n\nFor questions or feedback, contact [Arman Daneshdoost] at [armannew73@gmail.com].","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farmanjscript%2Fmulti-hop-rag-chatbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farmanjscript%2Fmulti-hop-rag-chatbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farmanjscript%2Fmulti-hop-rag-chatbot/lists"}