{"id":28471760,"url":"https://github.com/tschechlovdev/rag_chatbot","last_synced_at":"2025-10-09T01:36:35.541Z","repository":{"id":296824226,"uuid":"980442024","full_name":"tschechlovdev/rag_chatbot","owner":"tschechlovdev","description":"Simple RAG chatbot built with LangChain and Ollama that chats with your own PDFs. Blog post: https://medium.com/@tschechd/retrieval-augmented-generation-rag-in-practice-implementing-a-chatbot-with-langchain-and-ollama-79d6d19642f7","archived":false,"fork":false,"pushed_at":"2025-06-02T10:34:27.000Z","size":505,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-02T18:55:06.070Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tschechlovdev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-09T06:17:50.000Z","updated_at":"2025-06-02T15:19:47.000Z","dependencies_parsed_at":"2025-06-02T18:58:04.490Z","dependency_job_id":"c2cce9b5-14d6-4f2f-8bb7-3469f0a3c08b","html_url":"https://github.com/tschechlovdev/rag_chatbot","commit_stats":null,"previous_names":["tschechlovdev/rag_chatbot"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tschechlovdev/rag_chatbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2Frag_chatbot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2Frag_chatbot/tags","releases_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2Frag_chatbot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2Frag_chatbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tschechlovdev","download_url":"https://codeload.github.com/tschechlovdev/rag_chatbot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tschechlovdev%2Frag_chatbot/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263046019,"owners_count":23405117,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-07T11:08:45.141Z","updated_at":"2025-10-09T01:36:35.522Z","avatar_url":"https://github.com/tschechlovdev.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RAG-Chatbot — Chat with Your Own Documents\n\n\nThis project lets you run a fully local **RAG-based (Retrieval-Augmented Generation)** chatbot using your own PDFs or web content. 
Ask questions in natural language, and get answers based on the actual contents of your documents.\n\n![1_tEelGLOyg6n7oUJ0a1fMUA](https://github.com/user-attachments/assets/ec00a9d7-53f7-4c52-b51c-0c565f92521c)\n\nIt uses the following tools for this:\n- [LangChain](https://www.langchain.com/) for orchestration\n- [FAISS](https://github.com/facebookresearch/faiss) for semantic vector search\n- [Ollama](https://ollama.com) to run open-source LLMs locally\n- [Streamlit](https://streamlit.io) for an easy-to-use chat interface\n\n## ✨ Features\n\n- 📄 Upload PDFs or URLs as your data source\n- 🧠 Store document chunks as embeddings in a FAISS vector store\n- 🔍 Retrieve relevant content using semantic similarity search\n- 💬 Generate context-aware answers via a local LLM\n- 💻 All running 100% locally\n\n\n## 🚀 Getting Started\n\n### 1. Install Ollama and Run a Local LLM\n\nMake sure you have [Ollama](https://ollama.com) installed and a compatible model (e.g. `granite3.3`) downloaded.\n\n```bash\n# Install Ollama\ncurl -fsSL https://ollama.com/install.sh | sh\n```\n\n#### Download model\n```bash\nollama pull granite3.3\n```\n\n#### Start the model\n```bash\nollama run granite3.3\n```\n\n### 2. Clone This Repository\n```bash\ngit clone https://github.com/tschechlovdev/rag_chatbot.git\ncd rag_chatbot\n```\n\n### 3. Install Python Dependencies\n\nThis project uses Python 3.9+.\n\n```bash\npip install -r requirements.txt\n```\n\n### 4. 
Run the Streamlit App\n```bash\nstreamlit run app.py\n```\n\n\n\n## 📂 File Upload\nOnce the app is running, go to the sidebar to upload one or more PDF files.\n\nAsk natural-language questions about the content in the chat interface.\n\nThe chatbot will search for relevant sections and answer using them as context.\n\n\n\n## 🏛️ Architecture Overview\n![1_gXq3HJeXbPO2aGgFDYh0TA](https://github.com/user-attachments/assets/b492d7a7-d280-40ff-b92b-534cd1c415e7)\n\n- **ChatUI**: The user interface is built with Streamlit, using its built-in `chat_message` components to create a conversational layout. Users can upload documents in the sidebar and interact with the chatbot in real time.\n- **LLMRAGHandler**: This is the main component that connects everything. It is implemented using LangChain and is responsible for managing the conversation flow, retrieving relevant context from the vector store, formatting prompts using a custom template, calling the LLM, and caching chat history.\n- **Vector Store**: Responsible for storing the documents as vector embeddings in FAISS, a high-speed similarity search library, and for retrieving the relevant context.\n- **LLM**: The chatbot runs the Granite 3.3 model locally using Ollama. This means easy setup and prototyping, easy model switching, and full control over your data (everything stays local).\n- **Conversation Store**: To make the chatbot stateful, we store the conversation history in a local file (e.g. JSON). 
This allows the chat to resume where you left off - even after refreshing the browser.\n\n\n  \n## ⚠️ Limitations\n- Initial PDF parsing and embedding may take a few seconds for large files.\n- Latency depends on the chosen LLM model.\n- Evaluation of answers is qualitative — no scoring function included.\n- Runs only locally for easier development\n\n\n\n## 💡 Ideas for Future Improvements\n- Use agentic RAG (history-aware retrievers, dynamic tool-calling)\n- Tool Calling\n- Other Data Sources (Google Drive, Notion, ...)\n- Cloud deployment\n- UI enhancements and document summarization\n\n\n## 📄 License\nMIT License. See LICENSE for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftschechlovdev%2Frag_chatbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftschechlovdev%2Frag_chatbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftschechlovdev%2Frag_chatbot/lists"}