{"id":30041743,"url":"https://github.com/prabal-verma/agentic-rag","last_synced_at":"2026-04-18T14:04:35.988Z","repository":{"id":302732899,"uuid":"1013446007","full_name":"Prabal-verma/Agentic-Rag","owner":"Prabal-verma","description":"An Agentic RAG (Retrieval-Augmented Generation) system powered by LangChain, enabling multi-step reasoning over documents using LLMs, ChromaDB, and Google Drive as a document source.","archived":false,"fork":false,"pushed_at":"2025-07-03T23:29:07.000Z","size":403,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-07T02:58:23.379Z","etag":null,"topics":["agentic-rag","chromadb","reactjs"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Prabal-verma.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-03T23:21:21.000Z","updated_at":"2025-07-03T23:31:46.000Z","dependencies_parsed_at":"2025-07-04T00:36:32.047Z","dependency_job_id":null,"html_url":"https://github.com/Prabal-verma/Agentic-Rag","commit_stats":null,"previous_names":["prabal-verma/agentic-rag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Prabal-verma/Agentic-Rag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prabal-verma%2FAgentic-Rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prabal-verma%2FAgentic-Rag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prabal-verma%2FAgent
ic-Rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prabal-verma%2FAgentic-Rag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Prabal-verma","download_url":"https://codeload.github.com/Prabal-verma/Agentic-Rag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Prabal-verma%2FAgentic-Rag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31971500,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-rag","chromadb","reactjs"],"created_at":"2025-08-07T02:56:19.924Z","updated_at":"2026-04-18T14:04:35.946Z","avatar_url":"https://github.com/Prabal-verma.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🤖 Agentic RAG System\n![Alt Text](image.png)\n\nA comprehensive Retrieval-Augmented Generation (RAG) system that combines voice input, multimodal document processing, and intelligent search capabilities across multiple sources. 
Built with FastAPI and Next.js, this system provides real-time AI-powered question answering with visual grounding and citation support.\n\n## 🌟 Features\n\n### 🎤 Voice Integration\n- **Real-time Speech-to-Text**: Streaming voice input using Google Cloud Speech-to-Text\n- **WebSocket-based Audio Processing**: Low-latency voice recognition with partial results\n- **Multi-language Support**: Configurable language detection and transcription\n- **Voice-enabled Chat Interface**: Natural conversation flow with voice commands\n\n### 📚 Multimodal Document Processing\n- **Advanced PDF Processing**: Extract text, images, charts, and tables from PDFs\n- **Image Understanding**: AI-powered analysis of charts, diagrams, and visual content\n- **OCR Integration**: Text extraction from scanned documents and images\n- **Smart Chunking**: Intelligent text segmentation with context preservation\n- **Visual Grounding**: Link answers to specific document images and pages\n\n### 🔍 Intelligent Multi-Source Search\n- **Local RAG System**: Vector-based document retrieval with ChromaDB\n- **Web Search Integration**: Real-time web search via SERP API\n- **Google Drive MCP**: Model Context Protocol integration for Drive documents\n- **Parallel Search Execution**: Simultaneous queries across all sources\n- **Smart Result Fusion**: Intelligent combination of results from multiple sources\n\n### 📍 Citations \u0026 Transparency\n- **Comprehensive Citations**: Detailed source attribution for every answer\n- **Visual Citations**: Click-through access to source images and documents\n- **Confidence Scoring**: Reliability indicators for each source\n- **Source Traceability**: Full audit trail of information sources\n- **Interactive Content Viewer**: In-app display of PDFs, images, and web content\n\n### ⚡ Real-time Capabilities\n- **WebSocket Communication**: Real-time chat and voice processing\n- **Streaming Responses**: Progressive answer generation\n- **Live Transcription**: Real-time 
speech-to-text with partial results\n- **Concurrent Processing**: Parallel execution of search and generation tasks\n\n## 🏗️ Architecture\n\n```\n┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐\n│   Frontend      │    │    Backend       │    │   External      │\n│   (Next.js)     │    │   (FastAPI)      │    │   Services      │\n├─────────────────┤    ├──────────────────┤    ├─────────────────┤\n│ • Voice Input   │◄──►│ • STT Service    │    │ • Gemini/Claude │\n│ • Chat UI       │    │ • RAG Engine     │◄──►│ • Google Drive  │\n│ • Citations     │    │ • Web Search     │    │ • SERP API      │\n│ • Image Display │    │ • Document Proc. │    │ • ChromaDB      │\n└─────────────────┘    └──────────────────┘    └─────────────────┘\n```\n\n## 🛠️ Technology Stack\n\n### Backend\n- **Framework**: FastAPI 0.104+ with async/await support\n- **AI Providers**: Google Gemini 1.5 Pro, Anthropic Claude 3 Sonnet\n- **Vector Database**: ChromaDB for embedding storage and retrieval\n- **Speech Processing**: Google Cloud Speech-to-Text API\n- **Document Processing**: PyPDF2, Pillow, pytesseract for OCR\n- **Search Integration**: SERP API for web search, Google Drive API\n- **WebSocket**: Real-time communication with connection management\n- **Authentication**: OAuth 2.0 for Google services\n\n### Frontend\n- **Framework**: Next.js 14 with App Router\n- **Language**: TypeScript for type safety\n- **UI Library**: React 18 with Tailwind CSS\n- **State Management**: Zustand for client state\n- **Audio Processing**: Web Audio API with WebRTC\n- **Real-time**: WebSocket client with auto-reconnection\n- **Testing**: Jest and React Testing Library\n\n### AI \u0026 ML\n- **Embeddings**: sentence-transformers/all-MiniLM-L6-v2\n- **Vector Search**: Similarity search with configurable thresholds\n- **Multimodal AI**: Vision models for image understanding\n- **Text Generation**: Context-aware response generation\n- **Confidence Scoring**: Relevance and reliability 
metrics\n\n## 🚀 Installation \u0026 Setup\n\n### Prerequisites\n\n- **Python 3.9+** (3.11 recommended)\n- **Node.js 18+** with npm/yarn\n- **Google Cloud Account** (for Speech-to-Text)\n- **AI Provider Account** (Gemini or Claude)\n\n### 1. Clone Repository\n\n```bash\ngit clone \u003crepository-url\u003e\ncd agnt\n```\n\n### 2. Backend Setup\n\n#### Install Dependencies\n```bash\ncd backend\npip install -r requirements.txt\n```\n\n#### Configure Environment\n```bash\ncp env.example .env\n# Edit .env with your configuration (see Configuration section)\n```\n\n#### Set up Google Cloud Service Account\n1. Create a project in [Google Cloud Console](https://console.cloud.google.com)\n2. Enable Speech-to-Text API and Drive API\n3. Create a service account and download the JSON key\n4. Set `GOOGLE_CLOUD_SERVICE_ACCOUNT_PATH` in your `.env` file\n\n#### Initialize Database\n```bash\n# ChromaDB will be initialized automatically on first run\n# Data will be stored in ./chroma_db/ directory\n```\n\n### 3. Frontend Setup\n\n#### Install Dependencies\n```bash\ncd frontend\nnpm install\n# or\nyarn install\n```\n\n#### Configure Environment\n```bash\n# Create .env.local file\necho \"NEXT_PUBLIC_API_URL=http://localhost:8000\" \u003e .env.local\necho \"NEXT_PUBLIC_WS_URL=ws://localhost:8000\" \u003e\u003e .env.local\n```\n\n### 4. 
Running the Application\n\n#### Start Backend Server\n```bash\ncd backend\nuvicorn main:app --host 0.0.0.0 --port 8000 --reload\n```\n\n#### Start Frontend Development Server\n```bash\ncd frontend\nnpm run dev\n# or\nyarn dev\n```\n\n#### Access the Application\n- **Frontend**: http://localhost:3000\n- **Backend API Docs**: http://localhost:8000/docs\n- **Backend Health Check**: http://localhost:8000/health\n\n## ⚙️ Configuration\n\n### Environment Variables\n\n#### Core AI Configuration\n```bash\n# Choose your AI provider\nAI_PROVIDER=claude  # or \"gemini\"\n\n# API Keys (get one based on your provider choice)\nCLAUDE_API_KEY=your_claude_api_key\nGEMINI_API_KEY=your_gemini_api_key\n```\n\n#### Speech-to-Text Setup\n```bash\n# Google Cloud Speech-to-Text (required for voice features)\nGOOGLE_CLOUD_SERVICE_ACCOUNT_PATH=/path/to/service-account.json\nSTT_PROVIDER=google\nGOOGLE_SPEECH_MODEL=latest_long\n```\n\n#### Search Integration (Optional)\n```bash\n# Web Search (choose one)\nSERP_API_KEY=your_serp_api_key          # Recommended\nGOOGLE_API_KEY=your_google_api_key      # Alternative\n\n# Google Drive Integration\nGOOGLE_DRIVE_CLIENT_ID=your_client_id\nGOOGLE_DRIVE_CLIENT_SECRET=your_client_secret\n```\n\n#### Advanced Settings\n```bash\n# Vector Database\nCHROMA_PERSIST_DIRECTORY=./chroma_db\nEMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2\n\n# Performance Tuning\nMAX_SEARCH_RESULTS=10\nSIMILARITY_THRESHOLD=0.7\nMAX_TOKENS_PER_CHUNK=1000\nCHUNK_OVERLAP=200\nMAX_CONCURRENT_REQUESTS=100\n\n# Security\nCORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000\nSECRET_KEY=your-secret-key-change-in-production\nRATE_LIMIT_PER_MINUTE=60\n\n# Feature Flags\nENABLE_WEB_SEARCH=true\nENABLE_GOOGLE_DRIVE=true\nENABLE_VOICE_INPUT=true\nENABLE_IMAGE_ANALYSIS=true\n```\n\n### Model Configuration\n\n#### AI Provider Selection\n- **Claude**: Better reasoning, more conservative responses\n- **Gemini**: Faster processing, better multimodal understanding\n\n#### Model 
Choices\n```bash\n# Gemini Models\nGEMINI_CHAT_MODEL=gemini-1.5-pro\nGEMINI_VISION_MODEL=gemini-1.5-pro-vision\n\n# Claude Models  \nCLAUDE_CHAT_MODEL=claude-3-sonnet-20240229\nCLAUDE_VISION_MODEL=claude-3-sonnet-20240229\n```\n\n## 📖 API Documentation\n\n### Core Endpoints\n\n#### Health Check\n```http\nGET /health\n```\nReturns system status and service availability.\n\n#### Document Upload\n```http\nPOST /upload\nContent-Type: multipart/form-data\n\nfile: \u003cPDF file\u003e\n```\nUpload and process a PDF document with image extraction.\n\n**Response:**\n```json\n{\n  \"success\": true,\n  \"document_id\": \"uuid\",\n  \"filename\": \"document.pdf\",\n  \"pages_processed\": 10,\n  \"images_extracted\": 5,\n  \"text_chunks\": 25,\n  \"processing_time_ms\": 1500\n}\n```\n\n#### Query System\n```http\nPOST /query\nContent-Type: application/json\n\n{\n  \"query\": \"What is the main conclusion of the research?\",\n  \"num_results\": 5,\n  \"include_web_search\": true,\n  \"include_drive_search\": true\n}\n```\n\n**Response:**\n```json\n{\n  \"answer\": \"Based on the research findings...\",\n  \"citations\": [\n    {\n      \"id\": \"cite_1\",\n      \"source_type\": \"document\",\n      \"citation_type\": \"text\",\n      \"title\": \"Research Paper.pdf\",\n      \"content\": \"The main conclusion shows...\",\n      \"page_number\": 15,\n      \"confidence_score\": 0.95\n    }\n  ],\n  \"confidence_score\": 0.87,\n  \"processing_time_ms\": 2300\n}\n```\n\n#### Citation Details\n```http\nGET /citation/{citation_id}\n```\nRetrieve full content and metadata for a specific citation.\n\n### WebSocket Endpoints\n\n#### Speech-to-Text\n```javascript\nconst ws = new WebSocket('ws://localhost:8000/ws/stt');\n\n// Send audio data\nws.send(audioBuffer);\n\n// Receive transcription\nws.onmessage = (event) =\u003e {\n  const result = JSON.parse(event.data);\n  console.log(result.text, result.confidence);\n};\n```\n\n#### Real-time Chat\n```javascript\nconst ws = new 
WebSocket('ws://localhost:8000/ws/chat');\n\n// Send message\nws.send(JSON.stringify({\n  type: 'query',\n  message: 'Hello, how can you help me?',\n  session_id: 'session_123'\n}));\n```\n\n### Frontend API Client\n\nThe frontend includes a comprehensive API client in `src/lib/api.ts`:\n\n```typescript\nimport { QueryRequest, QueryResponse } from '@/types/api';\n\n// Query the system\nconst response = await api.query({\n  query: 'What is machine learning?',\n  num_results: 5\n});\n\n// Upload document\nconst result = await api.uploadDocument(file);\n\n// Get citation details\nconst citation = await api.getCitation(citationId);\n```\n\n## 🎯 Usage Examples\n\n### Basic Text Query\n\n1. Open the application at http://localhost:3000\n2. Type your question in the chat input\n3. View the AI-generated response with citations\n4. Click citations to view source content\n\n### Voice Query\n\n1. Click the microphone icon in the chat interface\n2. Speak your question clearly\n3. Watch real-time transcription appear\n4. Release to send the query\n5. Receive voice-enabled response\n\n### Document Upload \u0026 Analysis\n\n1. Click the upload button or drag files into the interface\n2. Select a PDF document (with images/charts)\n3. Wait for processing to complete\n4. Ask questions about the document content\n5. 
View responses with page-specific citations\n\n### Advanced Search Features\n\n```bash\n# Query with specific filters\ncurl -X POST \"http://localhost:8000/query\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"query\": \"quarterly revenue trends\",\n    \"num_results\": 10,\n    \"include_web_search\": true,\n    \"filters\": {\n      \"date_range\": \"2023-2024\",\n      \"document_type\": \"financial\"\n    }\n  }'\n```\n\n## 🧪 Development\n\n### Running Tests\n\n#### Backend Tests\n```bash\ncd backend\npytest -v\npytest --cov=app tests/  # With coverage\n```\n\n#### Frontend Tests\n```bash\ncd frontend\nnpm test\nnpm run test:watch  # Watch mode\n```\n\n### Code Quality\n\n#### Linting \u0026 Formatting\n```bash\n# Backend\ncd backend\nblack .\nflake8 .\n\n# Frontend\ncd frontend\nnpm run lint\nnpm run type-check\n```\n\n### Development Workflow\n\n1. **Feature Development**\n   - Create feature branch from `main`\n   - Add tests for new functionality\n   - Update documentation as needed\n\n2. **Testing**\n   - Run full test suite\n   - Test with different AI providers\n   - Verify WebSocket functionality\n\n3. 
**Code Review**\n   - Check API compatibility\n   - Verify error handling\n   - Test edge cases\n\n### Project Structure\n\n```\nagnt/\n├── backend/                 # FastAPI backend\n│   ├── app/\n│   │   ├── config.py       # Configuration management\n│   │   ├── models/         # Pydantic schemas\n│   │   ├── services/       # Business logic\n│   │   └── websocket/      # WebSocket handlers\n│   ├── main.py            # FastAPI application\n│   └── requirements.txt   # Python dependencies\n├── frontend/              # Next.js frontend\n│   ├── src/\n│   │   ├── app/          # App router pages\n│   │   ├── components/   # React components\n│   │   ├── lib/         # Utilities and API client\n│   │   └── store/       # State management\n│   └── package.json     # Node.js dependencies\n└── README.md           # This file\n```\n\n## 🐛 Troubleshooting\n\n### Common Issues\n\n#### 1. Speech-to-Text Not Working\n```bash\n# Check Google Cloud credentials\nexport GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json\n\n# Verify API is enabled\ngcloud services list --enabled | grep speech\n\n# Test authentication\npython -c \"from google.cloud import speech; print('Auth OK')\"\n```\n\n#### 2. Vector Database Issues\n```bash\n# Reset ChromaDB\nrm -rf backend/chroma_db/\n# Restart backend to reinitialize\n```\n\n#### 3. CORS Errors\n```bash\n# Update CORS_ORIGINS in .env\nCORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000\n```\n\n#### 4. WebSocket Connection Failed\n```bash\n# Check firewall settings\n# Verify WebSocket URL in frontend config\n# Check backend logs for connection errors\n```\n\n### Performance Optimization\n\n#### 1. Slow Document Processing\n- Reduce image resolution in processing\n- Increase `MAX_CONCURRENT_REQUESTS`\n- Use SSD storage for database\n\n#### 2. High Memory Usage\n- Adjust `EMBEDDING_BATCH_SIZE`\n- Limit `MAX_TOKENS_PER_CHUNK`\n- Monitor vector database size\n\n#### 3. 
API Response Times\n- Enable caching with Redis\n- Optimize similarity threshold\n- Use parallel search execution\n\n### Debugging\n\n#### Enable Debug Logging\n```bash\n# Backend\nLOG_LEVEL=DEBUG\n\n# Frontend  \nNEXT_PUBLIC_DEBUG=true\n```\n\n#### Monitor System Health\n```bash\n# Check service status\ncurl http://localhost:8000/health\n\n# View logs\ntail -f backend/app.log\n```\n\n## 🚀 Deployment\n\n### Production Deployment\n\n#### Docker Deployment\n```bash\n# Build and run with Docker\ndocker-compose up --build -d\n```\n\n#### Environment Setup\n```bash\n# Production environment variables\nDEBUG=false\nRELOAD=false\nLOG_LEVEL=INFO\nCORS_ORIGINS=https://yourdomain.com\n```\n\n#### Performance Considerations\n- Use PostgreSQL for metadata storage\n- Implement Redis for caching\n- Set up load balancing for multiple instances\n- Configure CDN for static assets\n\n### Security Checklist\n\n- [ ] Change default SECRET_KEY\n- [ ] Enable HTTPS in production\n- [ ] Implement rate limiting\n- [ ] Secure API endpoints\n- [ ] Validate file uploads\n- [ ] Monitor for suspicious activity\n\n## 🤝 Contributing\n\n### Development Setup\n\n1. Fork the repository\n2. Create a feature branch\n3. Install development dependencies\n4. Run tests to ensure everything works\n5. Make your changes\n6. Add tests for new functionality\n7. Submit a pull request\n\n### Code Standards\n\n- **Python**: Follow PEP 8, use type hints\n- **TypeScript**: Use strict mode, proper interfaces\n- **Documentation**: Update README for new features\n- **Testing**: Maintain test coverage above 80%\n\n### Reporting Issues\n\n1. Check existing issues first\n2. Provide detailed reproduction steps\n3. Include system information\n4. 
Add relevant logs and error messages\n\n## 📜 License\n\nMIT License - see [LICENSE](LICENSE) file for details.\n\n## 🙋‍♂️ Support\n\n### Getting Help\n\n- **Documentation**: Check this README and the API docs\n- **Issues**: Create a GitHub issue for bugs\n- **Discussions**: Use GitHub Discussions for questions","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprabal-verma%2Fagentic-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprabal-verma%2Fagentic-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprabal-verma%2Fagentic-rag/lists"}