{"id":31644536,"url":"https://github.com/kingakeem/personal-rag","last_synced_at":"2026-05-05T19:32:39.601Z","repository":{"id":316481885,"uuid":"1063326011","full_name":"KingAkeem/personal-rag","owner":"KingAkeem","description":"Personal Retrieval-Augmented Generation (RAG) app with Gradio UI, Elasticsearch vector DB, and Ollama LLM/embeddings","archived":false,"fork":false,"pushed_at":"2025-10-02T22:59:49.000Z","size":31,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-03T00:28:35.292Z","etag":null,"topics":["ai","chat-with-document","chat-with-your-data","elasticsearch","embeddings","gradio","llm","ml","nlp","ollama","rag","retrieval-augmented-generation","semantic-search","vector-search"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/KingAkeem.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-24T13:24:45.000Z","updated_at":"2025-10-02T22:59:52.000Z","dependencies_parsed_at":"2025-09-26T00:30:31.779Z","dependency_job_id":null,"html_url":"https://github.com/KingAkeem/personal-rag","commit_stats":null,"previous_names":["kingakeem/personal-rag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/KingAkeem/personal-rag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingAkeem%2Fpersonal-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/repositories/KingAkeem%2Fpersonal-rag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingAkeem%2Fpersonal-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingAkeem%2Fpersonal-rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/KingAkeem","download_url":"https://codeload.github.com/KingAkeem/personal-rag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/KingAkeem%2Fpersonal-rag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278722768,"owners_count":26034461,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","chat-with-document","chat-with-your-data","elasticsearch","embeddings","gradio","llm","ml","nlp","ollama","rag","retrieval-augmented-generation","semantic-search","vector-search"],"created_at":"2025-10-07T04:53:46.414Z","updated_at":"2025-10-07T04:53:48.278Z","avatar_url":"https://github.com/KingAkeem.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📚 Personal RAG Assistant\n\nA powerful, self-hosted Retrieval-Augmented Generation (RAG) application that lets you chat with your personal documents using local AI. 
Built with GPU support for both AMD and NVIDIA systems.\n\n![Gradio Interface](https://img.shields.io/badge/Interface-Gradio-FF4B4B?style=for-the-badge)\n![Elasticsearch](https://img.shields.io/badge/Vector%20DB-Elasticsearch-005571?style=for-the-badge)\n![Ollama](https://img.shields.io/badge/LLM-Ollama-3D5AFE?style=for-the-badge)\n![Multi-GPU](https://img.shields.io/badge/GPU-AMD%2FNVIDIA%20Support-00C853?style=for-the-badge)\n\n## ✨ Features\n\n- **🔍 Document Intelligence**: Upload and chat with your TXT files and Markdown documents\n- **💬 Smart Conversations**: Context-aware chat using Retrieval-Augmented Generation (RAG)\n- **⚡ Local \u0026 Private**: Everything runs on your machine - no data leaves your system\n- **🎯 Multi-GPU Support**: Optimized for both AMD (ROCm) and NVIDIA (CUDA) GPUs\n- **📊 Document Search**: Direct semantic search through your uploaded documents\n- **🔧 Easy Setup**: Docker-based deployment with auto-detection for your hardware\n- **📈 Real-time Streaming**: Watch responses generate token-by-token\n- **🔍 Vector Search**: Semantic search powered by Elasticsearch's dense vector capabilities\n\n## 🏗️ Architecture\n\n```mermaid\ngraph TB\n    A[User Interface] --\u003e B[Gradio Web App]\n    B --\u003e C[Elasticsearch Vector DB]\n    B --\u003e D[Ollama LLM]\n    C --\u003e E[Document Storage]\n    D --\u003e F[AMD/NVIDIA GPU]\n    \n    subgraph \"RAG Pipeline\"\n        G[Document Upload] --\u003e H[Text Chunking]\n        H --\u003e I[Vector Embeddings]\n        I --\u003e J[Vector Storage]\n        K[User Query] --\u003e L[Semantic Search]\n        L --\u003e M[Context Augmentation]\n        M --\u003e N[LLM Generation]\n    end\n    \n    subgraph \"AI Backend\"\n        C\n        D\n    end\n```\n\n## 📋 Prerequisites\n\n- **Docker** and **Docker Compose**\n- **GPU** (Optional but recommended):\n  - AMD GPU with ROCm support (RX 6000+ series recommended)\n  - NVIDIA GPU with CUDA support (GTX 10-series+ recommended)\n  - CPU-only mode 
also supported\n\n## 🚀 Quick Start\n\n### 1. Clone the Repository\n```bash\ngit clone https://github.com/KingAkeem/personal-rag.git\ncd personal-rag\n```\n\n### 2. Auto-Detect Setup (Recommended)\n```bash\n# The script automatically detects your GPU and configures accordingly\n./scripts/start.sh\n```\n\n### 3. Manual Setup (If you need specific control)\n```bash\n# For AMD GPUs\n./scripts/start.sh amd\n\n# For NVIDIA GPUs\n./scripts/start.sh nvidia\n```\n\n### 4. Access the Application\n- **Main App**: http://localhost:7860\n- **Elasticsearch**: http://localhost:9200\n- **Kibana** (Monitoring): http://localhost:5601\n- **Ollama API**: http://localhost:11434\n\n## 📁 Project Structure\n\n```\npersonal-rag-assistant/\n├── src/main.py                 # Main Gradio web interface\n├── src/embeddings           # Text embedding utilities (nomic-embed-text)\n├── src/storage              # Vector database operations (Elasticsearch, etc.)\n├── src/llm                  # LLM chat and RAG functionality (llama2:7b)\n├── docker-compose.amd.yml  # AMD GPU configuration\n├── docker-compose.nvidia.yml # NVIDIA GPU configuration\n├── scripts/start.sh                # Auto-detecting startup script\n├── scripts/stop.sh                 # Stop all services\n├── scripts/setup-elasticsearch.sh  # Elasticsearch initialization\n└── scripts/install-rocm.sh  # Setup ROCM for AMD GPUs locally\n```\n\n## 🔧 Core Components\n\n### Main Application (`main.py`)\n- Gradio-based web interface with three tabs: Chat, Upload Documents, Document Search\n- Real-time streaming responses\n- Configurable context chunks (1-5)\n- File upload support for .txt, .pdf, .md files\n\n### Vector Storage (`elasticsearch`)\n- Elasticsearch 8.13.0 with vector search capabilities\n- Automatic text chunking with configurable overlap\n- Cosine similarity search for semantic retrieval\n- Document indexing and management\n\n### LLM Integration (`llm`)\n- Ollama integration with streaming support\n- RAG pipeline with context 
augmentation\n- Configurable chat models (default: llama2:7b)\n\n### Embeddings (`embeddings`)\n- Local embedding generation using nomic-embed-text\n- 768-dimensional vector embeddings\n- Error handling and fallback mechanisms\n\n## 💻 Usage\n\n### 1. Upload Documents\n- Go to the \"Upload Documents\" tab\n- Upload your TXT or Markdown files\n- Documents are automatically chunked and indexed for semantic search\n\n### 2. Chat with Your Documents\n- Switch to the \"Chat\" tab\n- Ask questions about your uploaded content\n- Adjust the \"Context chunks\" slider (1-5) to control how much context is used\n- Watch responses stream in real-time\n\n### 3. Search Documents\n- Use the \"Document Search\" tab for direct semantic search\n- Find relevant passages with similarity scores\n- View results in JSON format with filename and content\n\n## ⚙️ Configuration\n\n### Environment Variables\nThe app can be configured using environment variables in the Docker Compose files:\n\n```yaml\n# Elasticsearch Configuration\nELASTICSEARCH_URL: \"http://elasticsearch:9200\"\nELASTICSEARCH_USERNAME: \"elastic\"\nELASTICSEARCH_PASSWORD: \"changeme\"\n\n# Ollama Configuration\nOLLAMA_HOST: \"http://ollama:11434\"\n\n# Model Configuration (in respective Python files)\nCHAT_MODEL: \"llama2:7b\"\nEMBEDDING_MODEL: \"nomic-embed-text\"\n```\n\n### Model Customization\nTo use different models, modify the environment variables or directly edit the Python files:\n\n```python\n# In llm.py\nCHAT_MODEL = os.getenv(\"CHAT_MODEL\", \"mistral:7b\")  # Change default model\n\n# In embeddings.py  \nEMBEDDING_MODEL = os.getenv('EMBEDDING_MODEL', \"all-minilm:l6-v2\")  # Change embedding model\n```\n\n## 🐛 Troubleshooting\n\n### Common Issues\n\n**GPU Not Detected**\n```bash\n# Check GPU detection\n./scripts/start.sh --debug\n\n# Force CPU mode\n./scripts/start.sh amd  # Uses CPU-only fallback\n```\n\n**Ollama Model Fails to Load**\n```bash\n# Check available models\ndocker exec ollama ollama list\n\n# Pull 
model manually\ndocker exec ollama ollama pull llama2:7b\n```\n\n**Elasticsearch Health Issues**\n```bash\n# Check Elasticsearch status\ncurl -u elastic:changeme http://localhost:9200/_cluster/health\n\n# View Elasticsearch logs\ndocker logs elasticsearch -f\n```\n\n**Port Conflicts**\n```bash\n# Check what's using the ports\nsudo lsof -i :7860  # Gradio app\nsudo lsof -i :9200  # Elasticsearch\nsudo lsof -i :5601  # Kibana\nsudo lsof -i :11434 # Ollama\n```\n\n### Logs and Monitoring\n\n```bash\n# View all service logs\ndocker compose -f docker-compose.amd.yml logs -f\n\n# View specific service logs\ndocker logs rag-app -f\ndocker logs ollama -f\ndocker logs elasticsearch -f\n\n# Check service health\ndocker ps\ndocker stats\n```\n\n## 🔒 Security Notes\n\n- Default passwords are set to `changeme` - **change these in production**\n- Elasticsearch security is enabled by default\n- The Gradio app binds to all network interfaces by default (server_name=\"0.0.0.0\"), so it may be reachable from other machines on your network unless the port is firewalled\n- Consider using HTTPS and a reverse proxy for external access\n- Regularly update Docker images to their latest versions\n\n## 🚀 Performance Tips\n\n### For Better GPU Utilization\n- Adjust `HSA_OVERRIDE_GFX_VERSION` in AMD configuration for your specific GPU\n- Modify `OLLAMA_GPU_LAYERS` in NVIDIA configuration based on VRAM\n- Monitor GPU usage with `rocm-smi` (AMD) or `nvidia-smi` (NVIDIA)\n\n### For Large Document Collections\n- Increase Elasticsearch heap size in `ES_JAVA_OPTS`\n- Adjust chunk size and overlap in `storage.py`\n- Monitor disk space for vector storage\n\n### Areas for Contribution\n- Additional file format support (DOCX)\n- Enhanced UI/UX improvements\n- More embedding model options\n- Performance optimizations\n- Additional vector database support\n\n## 🙏 Acknowledgments\n\n- [Gradio](https://gradio.app/) for the excellent web interface framework\n- [Ollama](https://ollama.ai/) for making local LLMs accessible\n- [Elasticsearch](https://elastic.co/) for vector search capabilities\n- The open-source AI 
community for continuous inspiration\n\n---\n\n**⭐ If this project helped you, please give it a star on GitHub!**\n\n---","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingakeem%2Fpersonal-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkingakeem%2Fpersonal-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkingakeem%2Fpersonal-rag/lists"}