{"id":31767678,"url":"https://github.com/gerdguerrero/study-chatbot","last_synced_at":"2026-04-10T00:45:59.477Z","repository":{"id":317166733,"uuid":"1066239007","full_name":"gerdguerrero/Study-Chatbot","owner":"gerdguerrero","description":"AI Study Assistant. Upload PDFs, chat with your study materials, and generate practice exams with answer keys. Built with Streamlit, powered by OpenAI GPT-4, and enhanced with RAG (ChromaDB) for intelligent document retrieval.","archived":false,"fork":false,"pushed_at":"2025-09-29T08:32:11.000Z","size":36,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-29T10:22:18.705Z","etag":null,"topics":["embeddings","gpt-4o","gpt4","gpt4o-mini","langchain","openai","openai-api","pdfplumber","pymupdf","pypdf2","python","rag","rag-chatbot","streamlit","streamlit-webapp"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gerdguerrero.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-29T08:08:14.000Z","updated_at":"2025-09-29T09:36:17.000Z","dependencies_parsed_at":"2025-09-29T10:22:20.786Z","dependency_job_id":"64922007-87b8-4b59-b7b4-ffe06d680b2f","html_url":"https://github.com/gerdguerrero/Study-Chatbot","commit_stats":null,"previous_names":["gerdguerrero/study-chatbot"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/gerd
guerrero/Study-Chatbot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gerdguerrero%2FStudy-Chatbot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gerdguerrero%2FStudy-Chatbot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gerdguerrero%2FStudy-Chatbot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gerdguerrero%2FStudy-Chatbot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gerdguerrero","download_url":"https://codeload.github.com/gerdguerrero/Study-Chatbot/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gerdguerrero%2FStudy-Chatbot/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002397,"owners_count":26083373,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["embeddings","gpt-4o","gpt4","gpt4o-mini","langchain","openai","openai-api","pdfplumber","pymupdf","pypdf2","python","rag","rag-chatbot","streamlit","streamlit-webapp"],"created_at":"2025-10-10T01:18:46.933Z","updated_at":"2025-10-10T01:18:57.413Z","avatar_url":"https://github.com/gerdguerrero.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 AI Study 
Chatbot with RAG \u0026 Exam Generation\n\n[![Python](https://img.shields.io/badge/Python-3.8%2B-blue)](https://python.org)\n[![OpenAI](https://img.shields.io/badge/OpenAI-GPT--3.5--turbo-green)](https://openai.com)\n[![Streamlit](https://img.shields.io/badge/Streamlit-Web%20App-red)](https://streamlit.io)\n[![ChromaDB](https://img.shields.io/badge/ChromaDB-Vector%20DB-orange)](https://chromadb.com)\n\nA powerful AI-powered study assistant that processes your academic PDFs and enables natural conversation about the content using **OpenAI's API** and **RAG (Retrieval Augmented Generation)** technology.\n\n![Study Chatbot Demo](https://img.shields.io/badge/Status-Production%20Ready-brightgreen)\n\n## ✨ Key Features\n\n- **🤖 Real AI Conversations**: Powered by OpenAI GPT-3.5-turbo for natural language understanding\n- **📚 Smart PDF Processing**: Upload and process academic modules, textbooks, research papers\n- **🔍 RAG Implementation**: Retrieval Augmented Generation for accurate, document-based responses\n- **📝 Intelligent Exam Generation**: Create practice tests with 4 difficulty levels from your materials\n- **🎯 Document-Specific Responses**: AI responses based strictly on your uploaded content\n- **💬 Natural Language Interface**: Ask questions in plain English about your documents\n- **🎓 Educational Focus**: Designed specifically for academic study and learning\n\n## 🚀 Quick Demo\n\n1. **Upload** your study materials (PDFs)\n2. **Ask questions** like:\n   - \"What are the main concepts in this document?\"\n   - \"Explain the key principles of [topic] from my uploaded files\"\n   - \"What does Chapter 3 say about [concept]?\"\n3. 
**Generate exams** with customizable difficulty:\n   - Easy: Basic recall and definitions\n   - Medium: Application and understanding\n   - Hard: Analysis and synthesis\n   - Expert: Critical thinking and mastery\n\n## 🛠️ Technology Stack\n\n- **AI Engine**: OpenAI API (GPT-3.5-turbo + text-embedding-ada-002)\n- **RAG System**: ChromaDB for vector storage with semantic search\n- **PDF Processing**: PyMuPDF, pdfplumber, PyPDF2 with intelligent text extraction\n- **Backend**: Python with modular architecture\n- **Frontend**: Streamlit for intuitive web interface\n- **Vector Search**: OpenAI embeddings with similarity search\n\n## 📋 Prerequisites\n\n- Python 3.8+\n- OpenAI API key ([Get one here](https://platform.openai.com/api-keys))\n- 2GB+ RAM recommended for vector processing\n\n## ⚡ Quick Start\n\n### 1. Clone \u0026 Install\n```bash\ngit clone https://github.com/gerdguerrero/Study-Chatbot.git\ncd Study-Chatbot\npip install -r requirements.txt\n```\n\n### 2. Configure API Key\n```bash\n# Create .env file\necho \"OPENAI_API_KEY=your_openai_api_key_here\" \u003e .env\n```\n\n### 3. Run the Application\n```bash\nstreamlit run src/app.py\n```\n\n### 4. Start Learning! 
🎓\n- Open http://localhost:8501 in your browser\n- Upload your study PDFs\n- Start asking questions!\n\n## 📁 Project Structure\n\n```\nStudy-Chatbot/\n├── src/\n│   ├── app.py              # Streamlit web interface\n│   ├── chatbot.py          # Main chatbot orchestration\n│   ├── rag_system.py       # RAG implementation with ChromaDB\n│   ├── pdf_processor.py    # Advanced PDF text extraction\n│   ├── exam_generator.py   # AI-powered exam generation\n│   └── config.py           # Configuration management\n├── documents/              # Sample documents (optional)\n├── requirements.txt        # Python dependencies\n├── .env.example           # Environment template\n└── README.md              # You are here!\n```\n\n## 💡 Usage Examples\n\n### Chat with Your Documents\n```\nYou: \"What is this document about?\"\nAI: \"This document is a comprehensive guide to Non-Destructive Testing (NDT) \n     methods, covering ultrasonic testing, radiographic inspection, and \n     magnetic particle testing techniques...\"\n\nYou: \"Generate 5 questions about NDT methods\"\nAI: Creates targeted multiple-choice, true/false, and essay questions\n    based on your specific document content.\n```\n\n### Advanced Features\n- **Document Overview**: \"Summarize the key topics in my uploaded files\"\n- **Specific Queries**: \"What does section 4.2 say about ultrasonic testing?\"\n- **Comparative Analysis**: \"Compare the advantages of different NDT methods\"\n- **Exam Generation**: Create custom practice tests with answer keys\n\n## 🔧 Configuration Options\n\n### Environment Variables (.env)\n```env\n# Required\nOPENAI_API_KEY=your_openai_api_key_here\n\n# Optional Customizations\nOPENAI_MODEL=gpt-3.5-turbo\nOPENAI_EMBEDDING_MODEL=text-embedding-ada-002\nCHROMA_PERSIST_DIRECTORY=./embeddings\nMAX_FILE_SIZE_MB=50\n```\n\n### Exam Generation Settings\n- **Question Types**: Multiple choice, True/False, Short answer, Essay\n- **Difficulty Levels**: Easy, Medium, Hard, Expert\n- **Customizable 
Counts**: Configure questions per type\n- **Answer Keys**: Toggle show/hide functionality\n\n## 🎯 Core AI Capabilities\n\n### RAG (Retrieval Augmented Generation)\n1. **Document Ingestion**: Processes PDFs with advanced text extraction\n2. **Semantic Chunking**: Intelligent text segmentation for optimal retrieval\n3. **Vector Embedding**: OpenAI embeddings for semantic similarity\n4. **Contextual Retrieval**: Finds most relevant document sections\n5. **Response Generation**: AI responses grounded in your content\n\n### Intelligent Content Processing\n- **Multi-format PDF Support**: Handles various PDF types and layouts\n- **Content Quality Filtering**: Removes headers, footers, and noise\n- **Subject-Specific Queries**: Optimizes retrieval for technical content\n- **Overview Generation**: Synthesizes document summaries\n\n## 🚨 Known Limitations\n\n- **PDF-only Support**: Currently limited to PDF documents\n- **English Language**: Optimized for English-language content\n- **OpenAI Dependency**: Requires active OpenAI API subscription\n- **Single Session**: No persistent user accounts (yet)\n\n## 🤝 Contributing\n\nWe welcome contributions! 
Areas for improvement:\n\n- [ ] Support for more document formats (DOCX, TXT)\n- [ ] Multi-language support\n- [ ] User authentication and session persistence\n- [ ] Advanced analytics and usage tracking\n- [ ] Collaborative study features\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- **OpenAI** for providing the GPT and embedding APIs\n- **ChromaDB** for the vector database solution\n- **Streamlit** for the amazing web framework\n- **LangChain** for RAG implementation patterns\n\n## 📞 Support\n\n- **Issues**: [GitHub Issues](https://github.com/gerdguerrero/Study-Chatbot/issues)\n- **Documentation**: Check the wiki for advanced usage\n- **Discussions**: Share your use cases and get help\n\n---\n\n**⭐ Star this repository if it helps with your studies!**\n\n*Built with ❤️ for students and educators worldwide*","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgerdguerrero%2Fstudy-chatbot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgerdguerrero%2Fstudy-chatbot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgerdguerrero%2Fstudy-chatbot/lists"}