{"id":29031647,"url":"https://github.com/pythontogo/rag_chatbot","last_synced_at":"2026-05-04T22:42:48.480Z","repository":{"id":300481464,"uuid":"1006285606","full_name":"PythonToGo/rag_chatbot","owner":"PythonToGo","description":"A Streamlit-based chatbot that uses RAG (Retrieval-Augmented Generation) to answer questions about uploaded PDF documents.","archived":false,"fork":false,"pushed_at":"2025-06-21T23:21:50.000Z","size":17,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-22T00:18:49.367Z","etag":null,"topics":["openai","rag-chatbot","streamlit"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PythonToGo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-21T23:02:41.000Z","updated_at":"2025-06-21T23:21:53.000Z","dependencies_parsed_at":"2025-06-22T00:28:56.897Z","dependency_job_id":null,"html_url":"https://github.com/PythonToGo/rag_chatbot","commit_stats":null,"previous_names":["pythontogo/rag_chatbot"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/PythonToGo/rag_chatbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PythonToGo%2Frag_chatbot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PythonToGo%2Frag_chatbot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PythonToGo%2Frag_chatbot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/PythonToGo%2Frag_chatbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PythonToGo","download_url":"https://codeload.github.com/PythonToGo/rag_chatbot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PythonToGo%2Frag_chatbot/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262044447,"owners_count":23249750,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["openai","rag-chatbot","streamlit"],"created_at":"2025-06-26T10:04:52.104Z","updated_at":"2026-05-04T22:42:48.450Z","avatar_url":"https://github.com/PythonToGo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PDF RAG Chatbot\n\n## just for fun; not yet deployed;\n\n\nA Streamlit-based chatbot that uses RAG (Retrieval-Augmented Generation) to answer questions about uploaded PDF documents.\n\n\n![image](https://github.com/user-attachments/assets/90beed10-6c39-43d0-bbd5-209540a87ad0)\n\n![image](https://github.com/user-attachments/assets/e3deac50-4286-4270-9765-4c1da52a3fa1)\n\n## Features\n\n- **PDF Upload \u0026 Processing**: Upload PDF files and automatically process them for RAG\n- **Vector Storage**: Uses FAISS for efficient document retrieval\n- **RAG Integration**: Leverages OpenAI's GPT models for intelligent question answering\n- **PDF Visualization**: View PDF pages as images alongside responses\n- **Context Display**: See the source documents used to generate answers\n\n## Project Structure\n\n```\nchatbot/\n├── src/\n│   ├── config/\n│   │   
├── __init__.py\n│   │   └── settings.py          # Configuration and constants\n│   ├── core/\n│   │   ├── __init__.py\n│   │   ├── document_processor.py # PDF processing and vector storage\n│   │   └── rag_chain.py         # RAG chain implementation\n│   ├── utils/\n│   │   ├── __init__.py\n│   │   └── pdf_converter.py     # PDF to image conversion\n│   ├── ui/\n│   │   ├── __init__.py\n│   │   └── streamlit_app.py     # Streamlit UI implementation\n│   └── __init__.py\n├── data/\n│   ├── temp_pdfs/              # Temporary PDF storage\n│   ├── vector_store/           # FAISS vector database\n│   └── pdf_images/             # Converted PDF images\n├── tests/                      # Test files\n├── docs/                       # Documentation\n├── main.py                     # Application entry point\n├── requirements.txt            # Python dependencies\n└── README.md                   # This file\n```\n\n## Installation\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/PythonToGo/rag_chatbot.git\ncd rag_chatbot\n```\n\n2. Create a virtual environment:\n```bash\npython -m venv venv\nsource venv/bin/activate\n```\n\n3. Install dependencies:\n```bash\npip install -r requirements.txt\n```\n\n4. Set up environment variables:\nCreate a `.env` file in the root directory with your OpenAI API key:\n```\nOPENAI_API_KEY=your_openai_api_key_here\n```\n\n## Usage\n\n1. Run the application:\n```bash\nstreamlit run main.py\n```\n\n2. Open your browser and navigate to the provided URL (usually `http://localhost:8501`)\n\n3. Upload a PDF file using the file uploader\n\n4. Ask questions about the uploaded PDF in the text input field\n\n5. 
View the generated responses and related document context\n\n## Configuration\n\nYou can modify the application settings in `src/config/settings.py`:\n\n- **Model Settings**: Change the embedding and chat models\n- **Document Processing**: Adjust chunk size and overlap\n- **Retrieval Settings**: Modify the number of retrieved documents\n- **Image Conversion**: Change DPI settings for PDF to image conversion\n\n## Dependencies\n\n- **Streamlit**: Web application framework\n- **LangChain**: RAG framework and document processing\n- **OpenAI**: Language models and embeddings\n- **FAISS**: Vector similarity search\n- **PyMuPDF**: PDF processing and image conversion\n\n## Development\n\n### Running Tests\n```bash\n# Add test files to the tests/ directory\npython -m pytest tests/\n```\n\n### Code Structure\n\nThe application follows a modular architecture:\n\n- **DocumentProcessor**: Handles PDF loading, chunking, and vector storage\n- **RAGChain**: Manages the RAG pipeline and question processing\n- **PDFConverter**: Converts PDF pages to images for display\n- **StreamlitApp**: Main UI application with clean separation of concerns\n\n### Adding New Features\n\n1. Create new modules in the appropriate directory (`core/`, `utils/`, `ui/`)\n2. Update configuration in `src/config/settings.py` if needed\n3. Add tests in the `tests/` directory\n4. Update this README with new features\n\n## License\nMIT License,\nCopyright PythonToGo 2025.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpythontogo%2Frag_chatbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpythontogo%2Frag_chatbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpythontogo%2Frag_chatbot/lists"}