{"id":30692963,"url":"https://github.com/reesavgupta/sql-agent","last_synced_at":"2026-04-18T17:33:14.827Z","repository":{"id":304999209,"uuid":"1021240086","full_name":"ReesavGupta/sql-agent","owner":"ReesavGupta","description":"A SQL agent for e-commerce data, featuring LLM-driven natural language to SQL, semantic table selection, optimal join planning, and multi-step query generation with validation. Supports vector search, statistical sampling, result pagination, caching, real-time simulation, and automated query optimization for seamless and intelligent integration.","archived":false,"fork":false,"pushed_at":"2025-07-18T09:21:52.000Z","size":208,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-09-02T05:23:20.699Z","etag":null,"topics":["agentic-ai","ai","aiagent","fastapi","genai","generative-ai","python","rag","sql-agent","sqlagent"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ReesavGupta.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-17T05:35:22.000Z","updated_at":"2025-07-18T09:21:55.000Z","dependencies_parsed_at":"2025-07-17T21:51:50.741Z","dependency_job_id":"c547e426-2b93-4ca3-838a-0c2f16da230a","html_url":"https://github.com/ReesavGupta/sql-agent","commit_stats":null,"previous_names":["reesavgupta/sql-agent"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ReesavGupta/sql-agent","repository_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/repositories/ReesavGupta%2Fsql-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReesavGupta%2Fsql-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReesavGupta%2Fsql-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReesavGupta%2Fsql-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ReesavGupta","download_url":"https://codeload.github.com/ReesavGupta/sql-agent/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReesavGupta%2Fsql-agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31977969,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T17:30:12.329Z","status":"ssl_error","status_checked_at":"2026-04-18T17:29:59.069Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","ai","aiagent","fastapi","genai","generative-ai","python","rag","sql-agent","sqlagent"],"created_at":"2025-09-02T05:07:25.767Z","updated_at":"2026-04-18T17:33:14.810Z","avatar_url":"https://github.com/ReesavGupta.png","language":"Python","readme":"# Quick Commerce SQL Agent\n\nA modern price comparison and analytics platform for quick commerce apps (Blinkit, Zepto, Instamart, BigBasket Now, etc.), enabling real-time tracking of pricing, discounts, 
and availability across thousands of products using natural language queries.\n\n---\n\n## 🚀 Project Overview\n\nThis project provides:\n- A robust backend SQL agent with semantic table/column selection, multi-step query planning, and real-time data simulation.\n- A modern, responsive frontend (React + Tailwind) for natural language queries and interactive results.\n- Modular, extensible architecture ready for large-scale, multi-platform commerce data.\n\n---\n\n## 🖼️ Frontend UI Preview\n\n![Frontend Preview](/frontend/public/frontend-preview.png)\n\n## ✨ Features \u0026 Deliverables\n\n### 1. **Database Design**\n- **Schema**: Based on the Olist dataset, extended with columns for `current_price`, `discount_percent`, and `in_stock` in the `products` table.\n- **Extensible**: Modular design allows for easy addition of more tables/platforms.\n\n### 2. **Data Integration**\n- **Real-time Simulation**: Python script periodically updates prices, discounts, and stock status with dummy data to mimic real-world changes.\n\n### 3. **SQL Agent**\n- **Semantic Table/Column Selection**: Uses FAISS vector search and HuggingFace embeddings to select relevant schema parts for each query.\n- **LLM-Driven Query Generation**: LLM (Groq) generates SQL using only relevant tables/columns, with prompt engineering for accuracy.\n- **Multi-step Pipeline**: Query generation, execution, and answer synthesis are modular and extensible.\n- **Pagination**: All queries support LIMIT/OFFSET for large result sets.\n- **Caching**: In-memory caching for schema info and query results for performance.\n\n### 4. **Performance**\n- **Optimized for High-Frequency Updates**: Real-time simulation and caching ensure responsiveness.\n- **Concurrent Queries**: FastAPI backend supports async requests.\n\n### 5. **Web Interface**\n- **Modern UI**: Built with React and Tailwind CSS.\n- **Step-by-step Feedback**: Shows progress (\"Thinking...\", \"Generating SQL...\", etc.) 
for great UX.\n- **Displays**: SQL query, raw SQL result (table), and LLM answer.\n- **Pagination Controls**: Easily page through large result sets.\n\n### 6. **Security**\n- **CORS**: Configured for frontend-backend integration.\n- **Rate Limiting**: (Can be added with FastAPI middleware if needed.)\n\n### 7. **Documentation**\n- **This README**: Explains architecture, setup, and design decisions.\n\n---\n\n## 🏗️ Architecture \u0026 Design Decisions\n\n- **Modular Backend**: Separated into `services/llm.py`, `services/vectorstore.py`, `services/query.py`, and `api/` for clean, testable code.\n- **Semantic Search**: FAISS + HuggingFace embeddings for fast, scalable schema/document retrieval.\n- **LLM Summarization**: LLM generates human-readable schema summaries for focused, accurate SQL generation.\n- **In-memory Caching**: For both schema and query results, balancing speed and simplicity.\n- **Frontend/Backend Decoupling**: REST API with CORS for easy integration and future scaling.\n- **Progressive UI Feedback**: Frontend shows each step of the pipeline for transparency and user trust.\n\n---\n\n## 🛠️ Setup \u0026 Usage\n\n### 1. **Backend**\n- **Install dependencies:**\n  ```bash\n  cd backend\n  pip install -r requirements.txt\n  pip install fastapi uvicorn faiss-cpu\n  ```\n- **Set environment variables:**\n  - Create `.env` in `backend/` with your Groq API key and model:\n    ```\n    GROQ_API_KEY=your_groq_api_key\n    GROQ_MODEL=llama3-8b-8192\n    ```\n- **Ensure database exists:**\n  - Place your SQLite DB at `backend/e-commerce-data/olist.sqlite/olist.sqlite`.\n  - Run the schema update and simulation scripts:\n    ```bash\n    python src/services/db_connection.py\n    python src/services/simulate_realtime_updates.py\n    ```\n- **Start the API:**\n  ```bash\n  uvicorn src.api.main:app --reload\n  ```\n  - API docs: [http://localhost:8000/docs](http://localhost:8000/docs)\n\n### 2. 
**Frontend**\n- **Install dependencies:**\n  ```bash\n  cd frontend\n  npm install\n  # or\n  yarn\n  ```\n- **Start the app:**\n  ```bash\n  npm run dev\n  # or\n  yarn dev\n  ```\n- **Open in browser:** [http://localhost:5173](http://localhost:5173)\n\n---\n\n## 🧑‍💻 API Usage\n\n- **POST /query**\n  - **Body:**\n    ```json\n    {\n      \"question\": \"Show products with more than 30% discount\",\n      \"page\": 1,\n      \"page_size\": 10\n    }\n    ```\n  - **Response:**\n    ```json\n    {\n      \"question\": \"...\",\n      \"query\": \"...\",\n      \"result\": [...],\n      \"answer\": \"...\",\n      \"page\": 1,\n      \"page_size\": 10\n    }\n    ```\n\n---\n\n## 📝 Sample Queries\n- \"Which app has cheapest onions right now?\"\n- \"Show products with 30%+ discount on Blinkit\"\n- \"Compare fruit prices between Zepto and Instamart\"\n- \"Find best deals for ₹1000 grocery list\"\n- \"List 10 products with their category, price, and discount\"\n\n---\n\n## 📚 Extending \u0026 Customizing\n- **Add new tables/columns**: Update the schema and rerun the vectorstore indexing.\n- **Add new endpoints**: Extend `api/routes.py`.\n- **Change LLM or embeddings**: Update `services/llm.py` or `services/vectorstore.py`.\n- **Add rate limiting/security**: Use FastAPI middleware.\n\n---\n\n## 🏆 Decision Rationale\n- **FAISS over SQLiteVec**: Chosen for speed, reliability, and no native extension requirement.\n- **LLM Summarization**: Ensures only relevant schema is sent to the LLM, improving accuracy and reducing cost.\n- **In-memory caching**: Simple, fast, and effective for development and moderate scale.\n- **Modular codebase**: Enables rapid iteration and easy testing.\n- **Frontend feedback**: Step-by-step UI progress for best-in-class UX.\n\n---\n\n## 📂 Repository Structure\n\n```\nbackend/\n  src/\n    api/\n      main.py\n      routes.py\n    services/\n      db_connection.py\n      llm.py\n      query.py\n      vectorstore.py\n      
simulate_realtime_updates.py\nfrontend/\n  src/\n    App.tsx\n    ...\n```\n\n---\n\n## 📄 License\nMIT\n\n---\n\n## 🙏 Acknowledgements\n- [LangChain](https://github.com/langchain-ai/langchain)\n- [FAISS](https://github.com/facebookresearch/faiss)\n- [HuggingFace](https://huggingface.co/)\n- [FastAPI](https://fastapi.tiangolo.com/)\n- [Tailwind CSS](https://tailwindcss.com/)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freesavgupta%2Fsql-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freesavgupta%2Fsql-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freesavgupta%2Fsql-agent/lists"}