{"id":31698833,"url":"https://github.com/yashdew3/generic-data-rag-agent","last_synced_at":"2026-04-08T23:34:49.639Z","repository":{"id":318134538,"uuid":"1070096301","full_name":"yashdew3/generic-data-rag-agent","owner":"yashdew3","description":"AI-powered RAG agent to upload CSV, Excel, PDF \u0026 chat with your data using FastAPI, ChromaDB, and Google Gemini API.","archived":false,"fork":false,"pushed_at":"2025-10-05T09:39:44.000Z","size":177,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-12T17:26:47.379Z","etag":null,"topics":["ai-agents","chromadb","csv","embeddings","fastapi","gemini-api","generative-ai","langchain","pdf-excel","rag","react","sentence-transformers","tailwind"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yashdew3.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-10-05T09:01:49.000Z","updated_at":"2025-10-05T09:43:16.000Z","dependencies_parsed_at":"2025-10-05T11:33:36.419Z","dependency_job_id":"37fa8719-3cca-426a-8e97-b26151b26db4","html_url":"https://github.com/yashdew3/generic-data-rag-agent","commit_stats":null,"previous_names":["yashdew3/generic-data-rag-agent"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yashdew3/generic-data-rag-agent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashdew3%2Fgener
ic-data-rag-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashdew3%2Fgeneric-data-rag-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashdew3%2Fgeneric-data-rag-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashdew3%2Fgeneric-data-rag-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yashdew3","download_url":"https://codeload.github.com/yashdew3/generic-data-rag-agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yashdew3%2Fgeneric-data-rag-agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31579056,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-agents","chromadb","csv","embeddings","fastapi","gemini-api","generative-ai","langchain","pdf-excel","rag","react","sentence-transformers","tailwind"],"created_at":"2025-10-08T19:11:08.045Z","updated_at":"2026-04-08T23:34:49.620Z","avatar_url":"https://github.com/yashdew3.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Generic Data RAG 
Agent 🤖📊\n\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.95+-green.svg)](https://fastapi.tiangolo.com/)\n[![React](https://img.shields.io/badge/React-18.2+-blue.svg)](https://reactjs.org/)\n[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/)\n[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nA powerful **Retrieval-Augmented Generation (RAG)** system that allows users to upload various data formats and interact with them through natural language queries. Built with modern technologies and designed for scalability and ease of use.\n\n![Dashboard](frontend/Dashboard.png)\n\n## 🚀 Features\n\n- **📄 Multi-Format Support**: Upload and process CSV, Excel, PDF, and text files\n- **🧠 Intelligent Retrieval**: Uses sentence transformers for semantic search\n- **💬 Natural Language Chat**: Query your data using conversational AI powered by Google Gemini\n- **📊 Vector Database**: ChromaDB for efficient similarity search and retrieval\n- **🔄 Real-time Processing**: Instant file processing and indexing\n- **📈 Chat History**: Persistent conversation history with context awareness\n- **🎨 Modern UI**: Clean, responsive interface built with React and Tailwind CSS\n- **⚡ Fast API**: High-performance backend with FastAPI and async processing\n\n## 🏗️ Architecture\n\n```\n┌────────────────────┐    ┌─────────────────┐    ┌─────────────────┐\n│   React Frontend   │────│   FastAPI       │────│   ChromaDB      │\n│   (Vite + Tailwind)│    │   Backend       │    │   Vector Store  │\n└────────────────────┘    └─────────────────┘    └─────────────────┘\n                                  │\n                          ┌─────────────────┐\n                          │   Google Gemini │\n                          │   AI Model      │\n                          └─────────────────┘\n```\n\n### Core Components\n\n- **Frontend**: React 18 with Vite, Tailwind CSS, and Lucide React icons\n- **Backend**: FastAPI 
with async support, CORS middleware, and structured routing\n- **AI Model**: Google Gemini 2.5 Flash for natural language processing\n- **Embeddings**: Sentence Transformers for semantic understanding\n- **Vector Database**: ChromaDB for efficient similarity search\n- **File Processing**: Support for multiple formats with automatic text extraction\n\n## 🛠️ Tech Stack\n\n### Backend\n- **FastAPI** - Modern, fast web framework for building APIs\n- **Google Generative AI** - Gemini 2.5 Flash model integration  \n- **ChromaDB** - Vector database for embeddings and similarity search\n- **Sentence Transformers** - State-of-the-art sentence embeddings\n- **Pandas** - Data manipulation and analysis\n- **PDFPlumber** - PDF text extraction\n- **OpenPyXL** - Excel file processing\n\n### Frontend  \n- **React 18** - Modern React with hooks and functional components\n- **Vite** - Fast build tool and development server\n- **Tailwind CSS** - Utility-first CSS framework\n- **Lucide React** - Beautiful, customizable icons\n\n## 🗂️ Project Structure\n\n```\ngeneric-data-rag-agent/\n├── backend/\n│   ├── app/\n│   │   ├── core/\n│   │   │   └── config.py          # Configuration settings\n│   │   ├── routers/\n│   │   │   ├── chat.py            # Chat endpoints\n│   │   │   ├── files.py           # File management endpoints  \n│   │   │   └── history.py         # History endpoints\n│   │   ├── services/\n│   │   │   ├── indexer.py         # Document indexing\n│   │   │   ├── ingestion.py       # File processing\n│   │   │   ├── retriever.py       # Vector search\n│   │   │   └── history.py         # Chat history management\n│   │   ├── main.py                # FastAPI application\n│   │   ├── models.py              # Pydantic models\n│   │   └── storage.py             # File storage utilities\n│   ├── chroma_db/                 # Vector database storage\n│   ├── uploads/                   # Uploaded files storage\n│   ├── requirements.txt           # Python dependencies\n│   └── 
start_server.py           # Server startup script\n├── frontend/\n│   ├── src/\n│   │   ├── App.jsx               # Main React component\n│   │   ├── main.jsx              # React entry point\n│   │   └── index.css             # Tailwind styles\n│   ├── index.html                # HTML template\n│   ├── package.json              # Node.js dependencies\n│   ├── tailwind.config.js        # Tailwind configuration\n│   └── vite.config.js           # Vite configuration\n├── start-backend.bat             # Windows backend starter\n├── start-frontend.bat            # Windows frontend starter\n└── README.md                     # This file\n```\n\n\n## 📋 Prerequisites\n\n- **Python 3.8+**\n- **Node.js 16+**\n- **Google Gemini API Key** ([Get it here](https://makersuite.google.com/app/apikey))\n\n## 🚀 Quick Start\n\n### 1. Clone the Repository\n```bash\ngit clone https://github.com/yashdew3/generic-data-rag-agent.git\ncd generic-data-rag-agent\n```\n\n### 2. Backend Setup\n```bash\n# Navigate to backend directory\ncd backend\n\n# Create virtual environment\npython -m venv .venv\n\n# Activate virtual environment\n# Windows\n.venv\\Scripts\\activate\n# macOS/Linux\nsource .venv/bin/activate\n\n# Install dependencies\npip install -r requirements.txt\n\n# Create environment file\ncp .env.example .env\n```\n\n### 3. Environment Configuration\nCreate a `.env` file in the backend directory:\n```env\nGEMINI_API_KEY=your_gemini_api_key_here\nGEMINI_MODEL=gemini-2.5-flash\nFRONTEND_ORIGIN=http://localhost:5173\n```\n\n### 4. Frontend Setup\n```bash\n# Navigate to frontend directory (new terminal)\ncd frontend\n\n# Install dependencies\nnpm install\n```\n\n### 5. 
Start the Application\n\n#### Option 1: Using Batch Files (Windows)\n```bash\n# Start backend (from root directory)\nstart-backend.bat\n\n# Start frontend (from root directory)  \nstart-frontend.bat\n```\n\n#### Option 2: Manual Start\n```bash\n# Terminal 1 - Backend\ncd backend\npython start_server.py\n\n# Terminal 2 - Frontend  \ncd frontend\nnpm run dev\n```\n\n### 6. Access the Application\n- **Frontend**: http://localhost:5173\n- **Backend API**: http://localhost:8000\n- **API Documentation**: http://localhost:8000/docs\n\n## 📖 Usage Guide\n\n### 1. Upload Files\n- Click the **\"Choose Files\"** button\n- Select CSV, Excel, PDF, or text files\n- Files are automatically processed and indexed\n\n### 2. Chat with Your Data\n- Type natural language questions about your uploaded data\n- Examples:\n  - \"What are the main trends in this dataset?\"\n  - \"Summarize the key findings from the uploaded report\"\n  - \"Show me insights about sales performance\"\n\n## 🔧 API Endpoints\n\n### File Management\n- `POST /files/upload` - Upload and process files\n- `GET /files/list` - List uploaded files\n- `DELETE /files/{file_id}` - Delete a file\n\n### Chat System\n- `POST /chat/message` - Send a chat message\n- `GET /chat/history/{session_id}` - Get chat history\n\n### History Management\n- `GET /history/sessions` - List all chat sessions\n- `DELETE /history/sessions/{session_id}` - Delete a session\n\n\n## 🧪 Testing\n\n### Backend Tests\n```bash\ncd backend\npython test_system.py\n```\n\n### Frontend Development\n```bash\ncd frontend\nnpm run lint    # ESLint checking\nnpm run build   # Production build\nnpm run preview # Preview production build\n```\n\n## 🔒 Security Features\n\n- **CORS Protection**: Configurable origin restrictions\n- **File Validation**: Secure file type checking\n- **API Key Management**: Environment-based configuration\n- **Input Sanitization**: Secure data processing\n\n\n## 🤝 Contributing\n\nContributions, issues, and feature requests are welcome! 
Feel free to check the [issues page](https://github.com/yashdew3/generic-data-rag-agent/issues) or open a new issue to discuss changes. Pull requests are also appreciated.\n\n## 📝 License\n\nThis project is licensed under the MIT License. © Yash Dewangan\n\n## Let's Connect\nFeel free to connect or suggest improvements!\n- Built by **Yash Dewangan**\n- 🐙GitHub: [YashDewangan](https://github.com/yashdew3)\n- 📧Email: [yashdew06@gmail.com](mailto:yashdew06@gmail.com)\n- 🔗LinkedIn: [YashDewangan](https://www.linkedin.com/in/yash-dewangan/)\n\n---\n\n**Built with ❤️ for intelligent data interaction**\n\n*This project demonstrates modern RAG architecture with production-ready code quality and comprehensive documentation.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyashdew3%2Fgeneric-data-rag-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyashdew3%2Fgeneric-data-rag-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyashdew3%2Fgeneric-data-rag-agent/lists"}