{"id":30114336,"url":"https://github.com/ffiruzi/website-rag-qa-assistant","last_synced_at":"2026-04-09T18:14:36.708Z","repository":{"id":307469069,"uuid":"1029621202","full_name":"ffiruzi/Website-RAG-QA-Assistant","owner":"ffiruzi","description":"Transform any website into an intelligent AI assistant with ONE Docker command. Complete RAG system with web crawling, vector embeddings, and beautiful chat interface. Built with FastAPI, React, and LangChain.","archived":false,"fork":false,"pushed_at":"2025-07-31T10:36:43.000Z","size":247,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-31T13:55:00.344Z","etag":null,"topics":["artificial-intelligence","chatbot","docker","fastapi","full-stack","langchain","openai","python","question-answering","rag","rag-chatbot","react","typescript","vector-database","web-crawling"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ffiruzi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-31T10:13:32.000Z","updated_at":"2025-07-31T10:38:36.000Z","dependencies_parsed_at":"2025-07-31T13:55:06.123Z","dependency_job_id":"1d93c30e-c293-4285-a198-534d5092de6d","html_url":"https://github.com/ffiruzi/Website-RAG-QA-Assistant","commit_stats":null,"previous_names":["ffiruzi/website-rag-qa-assistant"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/ffiruzi/Website-RAG-QA-Assistant","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffiruzi%2FWebsite-RAG-QA-Assistant","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffiruzi%2FWebsite-RAG-QA-Assistant/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffiruzi%2FWebsite-RAG-QA-Assistant/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffiruzi%2FWebsite-RAG-QA-Assistant/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ffiruzi","download_url":"https://codeload.github.com/ffiruzi/Website-RAG-QA-Assistant/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ffiruzi%2FWebsite-RAG-QA-Assistant/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":269693405,"owners_count":24460234,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-10T02:00:08.965Z","response_time":71,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","chatbot","docker","fastapi","full-stack","langchain","openai","python","question-answering","rag","rag-chatbot","react","typescript","vector-database","web-crawling"],"created_at":"2025-08-10T07:40:13.989Z","updated_at":"2026-04-09T18:14:36.702Z","avatar_url":"https://github.com/ffiruzi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 Website RAG Q\u0026A System\n\n\u003e **Transform any website into an intelligent AI assistant with ONE Docker command**\n\n\u003cdiv align=\"center\"\u003e\n\n## 🎬 **LIVE DEMO - CLICK TO WATCH** 🎬\n\n[![Website RAG Demo](https://img.shields.io/badge/▶️%20WATCH%20FULL%20DEMO-FF0000?style=for-the-badge\u0026logo=youtube\u0026logoColor=white\u0026labelColor=282828\u0026color=FF0000\u0026logoWidth=30)](https://youtu.be/lnuF3FhVzbg)\n\n[![Demo Thumbnail](https://img.youtube.com/vi/lnuF3FhVzbg/maxresdefault.jpg)](https://youtu.be/lnuF3FhVzbg)\n\n### 🚀 **3-Minute Demo: Setup → Crawling → AI Chat**\n*Click the thumbnail above to see the complete system in action!*\n\n[![GitHub](https://img.shields.io/badge/⭐%20Star%20This%20Repo-181717?style=for-the-badge\u0026logo=github)](https://github.com/ffiruzi/Website-RAG-QA-Assistant)\n[![Video Views](https://img.shields.io/youtube/views/lnuF3FhVzbg?style=for-the-badge\u0026logo=youtube\u0026logoColor=white\u0026labelColor=FF0000\u0026color=FF0000)]()\n\n\u003c/div\u003e\n\n\n\n![License](https://img.shields.io/badge/license-MIT-blue.svg)\n![Python](https://img.shields.io/badge/python-3.11+-blue.svg)\n![React](https://img.shields.io/badge/react-18+-blue.svg)\n![FastAPI](https://img.shields.io/badge/FastAPI-0.100+-green.svg)\n![LangChain](https://img.shields.io/badge/LangChain-latest-orange.svg)\n![Docker](https://img.shields.io/badge/docker-ready-blue.svg)\n\nA **complete RAG (Retrieval-Augmented Generation) system** that crawls websites, processes content into searchable embeddings, and provides intelligent AI-powered answers through a beautiful chat interface. **Everything runs in a single Docker container** for maximum simplicity.\n\n---\n\n## ✨ **Key Features**\n\n### 🔍 **Intelligent Content Processing**\n- **Smart Web Crawling**: Automatically discovers and extracts content from any website\n- **Vector Embeddings**: Converts content into searchable vectors using OpenAI embeddings  \n- **FAISS Vector Database**: Lightning-fast similarity search across content\n- **Real-time Processing**: Watch your content being processed in real-time\n\n### 🤖 **Advanced AI Question Answering**  \n- **RAG Pipeline**: Retrieves relevant context and generates accurate, source-cited responses\n- **Conversational Memory**: Maintains context across multiple questions\n- **Source Attribution**: Always shows where answers come from\n- **Multi-Website Support**: Switch between different websites in one chat\n\n### 💼 **Professional Admin Interface**\n- **Real-time Dashboard**: Monitor crawling, processing, and usage\n- **Website Management**: Add, remove, and configure websites easily  \n- **Analytics**: Track popular queries and system performance\n- **Job Monitoring**: Watch crawling and embedding progress live\n\n### 💬 **Beautiful Chat Widget**\n- **Instant Responses**: AI-powered answers in seconds\n- **Mobile Responsive**: Works perfectly on all devices\n- **Multi-Website Chat**: Ask questions about different websites\n- **Source Citations**: See exactly where answers come from\n\n---\n\n## 🚀 **One-Command Setup** (Seriously!)\n\n### **Prerequisites**\n- **Docker installed** ([Get Docker](https://docs.docker.com/get-docker/))\n- **OpenAI API key** ([Get key](https://platform.openai.com/api-keys))\n\n### **Setup in 30 seconds:**\n\n```bash\n# 1. Clone the repository\ngit clone https://github.com/ffiruzi/Website-RAG-QA-Assistant.git\ncd Website-RAG-QA-Assistant\n\n# 2. Run the setup script\nchmod +x run.sh\n./run.sh\n```\n\n**That's it!** 🎉 \n\n- **Frontend + Backend**: http://localhost\n- **API Documentation**: http://localhost/docs  \n- **Health Check**: http://localhost/health\n\n---\n\n## 🏗️ **What Runs in the Container**\n\n```\n┌─────────────────────────────────────────┐\n│           Single Docker Container        │\n├─────────────────────────────────────────┤\n│  🌐 Nginx (Port 80)                    │\n│    ├── Serves React Frontend           │\n│    └── Proxies API calls to FastAPI    │\n├─────────────────────────────────────────┤\n│  ⚡ FastAPI Backend (Port 8000)        │\n│    ├── RAG Processing Engine           │\n│    ├── OpenAI Integration              │\n│    ├── FAISS Vector Database           │\n│    └── SQLite Database                 │\n├─────────────────────────────────────────┤\n│  🤖 Background Services                │\n│    ├── Web Crawler                     │\n│    ├── Content Processor               │\n│    └── Embedding Generator             │\n└─────────────────────────────────────────┘\n```\n\n---\n\n## 🛠️ **Full Technology Stack**\n\n### **Backend**\n- **FastAPI**: High-performance Python API framework\n- **LangChain**: Advanced RAG and LLM orchestration  \n- **FAISS**: Meta's efficient vector similarity search\n- **OpenAI API**: GPT models for embeddings and completions\n- **SQLAlchemy**: Database ORM with SQLite\n- **Trafilatura**: Advanced web content extraction\n\n### **Frontend**  \n- **React 18**: Modern reactive user interface\n- **TypeScript**: Type-safe development\n- **Tailwind CSS**: Beautiful, responsive styling\n- **React Query**: Efficient data fetching\n- **Lucide Icons**: Professional icon library\n\n### **Infrastructure**\n- **Docker**: Single-container deployment\n- **Nginx**: High-performance reverse proxy\n- **SQLite**: Embedded database (no setup required)\n- **Multi-stage build**: Optimized container size\n\n---\n\n## 📱 **How to Use**\n\n### **1. Add Your First Website**\n1. **Open**: http://localhost  \n2. **Click**: \"Add Website\" \n3. **Enter URL**: Like `https://docs.python.org` or any website\n4. **Click**: \"Create\"\n\n### **2. Process the Content**\n1. **Start Crawling**: Click \"Crawl\" button next to your website\n2. **Wait**: Watch the progress in real-time  \n3. **Process Embeddings**: Click \"Process\" to create searchable vectors\n4. **Done**: Your website is now AI-ready!\n\n### **3. Start Chatting**\n1. **Click**: Blue chat widget in bottom-right corner\n2. **Select**: Your website from dropdown\n3. **Ask**: \"What is this website about?\"\n4. **Get**: Intelligent AI response with source citations!\n\n---\n\n## 🎯 **Perfect For**\n\n### **🏢 Business Websites**\n- **Customer Support**: Instant answers to product questions\n- **Documentation**: Make technical docs conversational\n- **E-commerce**: Help customers find products\n\n### **📚 Educational Content**\n- **Course Materials**: Interactive learning assistance\n- **Research Papers**: Make academic content accessible  \n- **Training Resources**: Corporate onboarding and training\n\n### **🔧 Developer Tools**\n- **API Documentation**: Natural language API queries  \n- **Code Repositories**: Ask questions about codebases\n- **Technical Blogs**: Interactive programming tutorials\n\n### **📰 Content Sites**\n- **News Websites**: Query articles and archives\n- **Blogs**: Interactive content exploration\n- **Knowledge Bases**: Conversational information access\n\n---\n\n## 🚦 **Quick Demo**\n\nAfter setup, try these example questions:\n\n```\n🤖 \"What is this website about?\"\n🤖 \"Summarize the main topics covered\"  \n🤖 \"How do I install [technology mentioned]?\"\n🤖 \"What are the key features described?\"\n🤖 \"Find information about pricing\"\n```\n\n---\n\n## 📊 **Performance**\n\n### **Benchmarks**\n- **Setup Time**: ~30 seconds (first run ~3 minutes)\n- **Query Response**: 2-5 seconds for complex questions\n- **Crawling Speed**: 5-10 pages per minute  \n- **Container Size**: ~2GB (includes everything)\n- **Memory Usage**: ~512MB-1GB depending on content\n\n### **Scalability**\n- **Websites**: Support for multiple websites\n- **Content**: Handles thousands of pages per website\n- **Users**: Concurrent chat sessions supported\n- **Queries**: No practical limit on questions\n\n---\n\n## 🔧 **Configuration**\n\n### **Environment Variables** (in `.env`)\n```bash\n# Required\nOPENAI_API_KEY=sk-your-actual-key-here     # Get from OpenAI\nSECRET_KEY=your-secret-key                 # Any secure string\n\n# Optional  \nDEBUG=True                                 # Enable debug mode\nDATABASE_URL=sqlite:///./app.db           # Database location\n```\n\n### **Advanced Docker Options**\n```bash\n# Run with custom port\ndocker run -p 8080:80 website-rag-qa\n\n# Run with persistent data\ndocker run -v ./data:/app/data website-rag-qa\n\n# Run with custom environment\ndocker run --env-file custom.env website-rag-qa\n```\n\n---\n\n## 🐛 **Troubleshooting**\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cstrong\u003eCommon Issues \u0026 Quick Fixes\u003c/strong\u003e\u003c/summary\u003e\n\n### **Container won't start**\n```bash\n# Check Docker is running\ndocker ps\n\n# View container logs  \ndocker logs website-rag-qa\n\n# Restart container\ndocker restart website-rag-qa\n```\n\n### **Can't access the website**\n```bash\n# Check if port 80 is free\nsudo lsof -i :80\n\n# Try different port\ndocker run -p 8080:80 website-rag-qa\n# Then access: http://localhost:8080\n```\n\n### **OpenAI API errors**\n```bash\n# Verify API key\necho $OPENAI_API_KEY\n\n# Test API key\ncurl -H \"Authorization: Bearer $OPENAI_API_KEY\" \\\n  https://api.openai.com/v1/models\n```\n\n### **Database issues**\n```bash\n# Reset everything\ndocker rm -f website-rag-qa\nrm -rf data/\n./run.sh\n```\n\n\u003c/details\u003e\n\n---\n\n## 🔒 **Security \u0026 Privacy**\n\n- **Local Processing**: Everything runs on your machine\n- **No Data Sharing**: Your website content stays private\n- **OpenAI Integration**: Only sends text chunks for embedding/completion\n- **Secure Defaults**: No exposed services beyond port 80\n- **Input Validation**: Prevents malicious input injection\n\n---\n\n## 🤝 **Contributing**\n\nLove this project? Here's how to contribute:\n\n### **Quick Contributions**\n- ⭐ **Star the repository** \n- 🐛 **Report bugs** via GitHub Issues\n- 💡 **Suggest features** in Discussions\n- 📖 **Improve documentation**\n\n### **Code Contributions**\n1. **Fork** the repository\n2. **Create** feature branch: `git checkout -b feature/amazing-feature`  \n3. **Commit** changes: `git commit -m 'Add amazing feature'`\n4. **Push** branch: `git push origin feature/amazing-feature`\n5. **Open** Pull Request\n\n---\n\n## 🌟 **Why This Project Stands Out**\n\n### **🚀 Zero Configuration**\n- **One command setup**: No complex configuration files\n- **Everything included**: No external dependencies to install\n- **Works anywhere**: Runs on any system with Docker\n\n### **🧠 Production-Quality AI**\n- **Modern RAG architecture**: Uses latest techniques  \n- **Real source attribution**: Always shows where answers come from\n- **Conversational memory**: Understands follow-up questions\n- **Multiple websites**: Switch contexts seamlessly\n\n### **💼 Professional Interface**\n- **Admin dashboard**: Complete management interface\n- **Real-time monitoring**: Watch processes live\n- **Analytics**: Usage patterns and performance metrics\n- **Mobile responsive**: Works on all devices\n\n### **🔧 Developer Friendly**\n- **Clean architecture**: Well-organized, documented code\n- **Modern tech stack**: Current best practices\n- **Easy to extend**: Modular design for customization\n- **Docker ready**: One-command deployment\n\n---\n\n## 📄 **License**\n\nThis project is licensed under the **MIT License** - see the [LICENSE](LICENSE) file for details.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n## ⭐ **Star this repository if you found it helpful!** ⭐\n\n### 🚀 **Ready to try it?**\n\n```bash\ngit clone https://github.com/ffiruzi/Website-RAG-QA-Assistant.git\ncd Website-RAG-QA-Assistant  \n./run.sh\n```\n\n**Your AI-powered website assistant will be running in under a minute!**\n\n---\n\n**Built with ❤️ by ffiruzi**\n\n[🚀 **Quick Start**](#-one-command-setup-seriously) | [📖 **How to Use**](#-how-to-use) | [🤝 **Contribute**](#-contributing)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fffiruzi%2Fwebsite-rag-qa-assistant","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fffiruzi%2Fwebsite-rag-qa-assistant","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fffiruzi%2Fwebsite-rag-qa-assistant/lists"}