{"id":28098791,"url":"https://github.com/yswa-var/locoforge","last_synced_at":"2026-01-20T16:25:11.484Z","repository":{"id":291067210,"uuid":"975697471","full_name":"yswa-var/LocoForge","owner":"yswa-var","description":"prompt to SQL, NoSQL and GoogleDrive task executing agent ","archived":false,"fork":false,"pushed_at":"2025-09-04T08:10:47.000Z","size":75833,"stargazers_count":2,"open_issues_count":0,"forks_count":4,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-08T14:11:00.068Z","etag":null,"topics":["agentic-ai","langgraph","langgraph-python","nosql","python","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yswa-var.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-30T18:39:30.000Z","updated_at":"2025-07-27T01:09:22.000Z","dependencies_parsed_at":"2025-05-02T08:20:10.467Z","dependency_job_id":"467a20ee-d8f1-4383-b8e1-fc9578801a81","html_url":"https://github.com/yswa-var/LocoForge","commit_stats":null,"previous_names":["yswa-var/locoforge"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/yswa-var/LocoForge","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yswa-var%2FLocoForge","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yswa-var%2FLocoForge/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yswa-var%2FLocoForge/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yswa-var%2FLocoForge
/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yswa-var","download_url":"https://codeload.github.com/yswa-var/LocoForge/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yswa-var%2FLocoForge/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28606992,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T16:10:39.856Z","status":"ssl_error","status_checked_at":"2026-01-20T16:10:39.493Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-ai","langgraph","langgraph-python","nosql","python","sql"],"created_at":"2025-05-13T17:58:50.658Z","updated_at":"2026-01-20T16:25:11.468Z","avatar_url":"https://github.com/yswa-var.png","language":"Python","readme":"# 🚀 LocoForge: Advanced AI-Powered Database Orchestration System\n\n\u003e **A sophisticated hybrid database query orchestration system built with LangGraph, featuring intelligent query classification, multi-agent execution, seamless SQL/NoSQL integration, and professional data engineering 
support.**\n\n[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://python.org)\n[![LangGraph](https://img.shields.io/badge/LangGraph-Latest-green.svg)](https://langchain-ai.github.io/langgraph/)\n[![OpenAI](https://img.shields.io/badge/OpenAI-GPT--4o--mini-orange.svg)](https://openai.com)\n[![Architecture](https://img.shields.io/badge/Architecture-Hybrid%20Orchestrator-purple.svg)]()\n[![Deployment](https://img.shields.io/badge/Deployment-Render%20Ready-blue.svg)](https://render.com)\n[![Web Interface](https://img.shields.io/badge/Web%20Interface-Flask%20API-red.svg)](https://flask.palletsprojects.com/)\n\n## 🎯 Project Overview\n\nhttps://www.youtube.com/watch?v=aKE-0c00JQE\n\u003cimg width=\"1438\" alt=\"Screenshot 2025-06-27 at 1 30 14 PM\" src=\"https://github.com/user-attachments/assets/6d4cb976-40f4-4152-b681-ae8580e9b21c\" /\u003e\n\nLocoForge is a cutting-edge **AI-powered database orchestration system** that intelligently routes and executes queries across multiple database types (SQL and NoSQL) using advanced graph-based workflows. 
The system leverages **LangGraph** for state management and **GPT-4o-mini** for intelligent query classification and decomposition.\n\n### 🌟 Key Features\n\n- **🤖 Intelligent Query Classification**: AI-powered domain and intent recognition with complexity assessment\n- **🔄 Multi-Agent Orchestration**: Seamless SQL and NoSQL agent coordination\n- **📊 Hybrid Query Processing**: Complex queries spanning multiple database types\n- **🎯 Graph-Based Workflow**: Stateful execution with conditional routing\n- **📈 Result Aggregation**: Intelligent combination of multi-source results\n- **🔄 Context Management**: Persistent conversation history and state tracking\n- **🔧 LangGraph Studio Integration**: Real-time workflow visualization and debugging\n- **👨‍💼 Data Engineer Agent**: Professional handling of unclear, technical, and non-domain queries\n- **🌐 Web Interface**: RESTful API with health checks and database statistics\n- **🚀 Production Ready**: Docker support and Render deployment configuration\n- **🛡️ Enhanced Error Handling**: Graceful degradation and comprehensive error recovery\n\n## 🏗️ Architecture\n\n### Core Components\n\n![Editor _ Mermaid Chart-2025-06-27-083351](https://github.com/user-attachments/assets/8221623d-2f14-4bb8-86f5-feb5746d32bb)\n\n### Enhanced Workflow Graph\n\nThe system implements a sophisticated **state machine** using LangGraph with the following nodes:\n\n1. **`classify_query`** - AI-powered query domain, intent, and complexity classification\n2. **`decompose_query`** - Complex query decomposition into sub-queries\n3. **`route_to_agents`** - Intelligent routing decision making\n4. **`sql_agent`** - SQL query execution (Employee Management)\n5. **`nosql_agent`** - NoSQL query execution (Supplies Database)\n6. **`data_engineer`** - Professional handling of unclear/technical queries\n7. **`aggregate_results`** - Multi-source result combination\n8. **`update_context`** - Conversation state management\n9. 
**`format_response`** - Final response formatting\n\n\u003cimg width=\"1021\" alt=\"Screenshot 2025-06-29 at 9 21 49 PM\" src=\"https://github.com/user-attachments/assets/3bbfe0c3-c950-4429-b194-9c6dcbb91e7f\" /\u003e\n\n## 🛠️ Technical Implementation\n\n### Enhanced State Management\n\n```python\nclass OrchestratorState(TypedDict):\n    messages: List[BaseMessage]           # Conversation history\n    current_query: str                    # Current user query\n    query_domain: QueryDomain            # Classified domain (EMPLOYEE/SUPPLIES/HYBRID/UNCLEAR/TECHNICAL)\n    query_intent: QueryIntent            # Query intent (SELECT/ANALYZE/COMPARE/AGGREGATE/CLARIFY/EXPLAIN)\n    query_complexity: QueryComplexity    # NEW: Complexity assessment (SIMPLE/MEDIUM/COMPLEX)\n    sub_queries: Dict[str, str]          # Decomposed sub-queries\n    sql_results: Optional[Dict[str, Any]] # SQL agent results\n    nosql_results: Optional[Dict[str, Any]] # NoSQL agent results\n    combined_results: Optional[Dict[str, Any]] # Aggregated results\n    context_history: List[Dict[str, Any]] # Execution context\n    execution_path: List[str]            # Workflow execution trace\n    error_message: Optional[str]         # Error handling\n    clarification_suggestions: Optional[List[str]] # NEW: Query refinement suggestions\n    data_engineer_response: Optional[str] # NEW: Data engineer agent responses\n```\n\n### Enhanced Conditional Routing Logic\n\nThe system implements sophisticated routing decisions with new data engineer support:\n\n```python\ndef route_decision(state: OrchestratorState) -\u003e str:\n    \"\"\"Intelligent routing based on query domain and complexity\"\"\"\n    domain = state[\"query_domain\"]\n\n    if domain == QueryDomain.EMPLOYEE:\n        return \"sql_only\"\n    elif domain == QueryDomain.SUPPLIES:\n        return \"nosql_only\"\n    elif domain == QueryDomain.HYBRID:\n        return \"both_agents\"\n    elif domain in [QueryDomain.UNCLEAR, 
QueryDomain.TECHNICAL]:\n        return \"data_engineer\"  # NEW: Route unclear queries to Data Engineer\n    else:\n        return \"error_handling\"\n```\n\n### Data Engineer Agent\n\nThe new **Data Engineer Agent** provides professional handling for:\n\n- **Ambiguous Queries**: \"Show me everything\", \"What's the data?\"\n- **Non-Domain Queries**: \"What's the weather like?\", \"Tell me a joke\"\n- **Technical Queries**: \"SELECT \\* FROM employees\", \"Show schema\"\n- **Overly Complex Queries**: Performance-impacting queries\n\n```python\nclass DataEngineerAgent:\n    def analyze_query(self, query: str) -\u003e Dict[str, Any]:\n        \"\"\"Analyze query and provide professional guidance\"\"\"\n\n    def provide_clarification_suggestions(self, query: str, analysis: Dict[str, Any]) -\u003e List[str]:\n        \"\"\"Generate specific clarification suggestions\"\"\"\n\n    def handle_technical_query(self, query: str) -\u003e Dict[str, Any]:\n        \"\"\"Handle SQL/NoSQL syntax and schema questions\"\"\"\n\n    def handle_non_domain_query(self, query: str) -\u003e Dict[str, Any]:\n        \"\"\"Handle queries outside system domain\"\"\"\n```\n\n### AI-Powered Query Classification\n\n```python\ndef classify_intent(self, query: str) -\u003e Tuple[QueryDomain, QueryIntent, QueryComplexity]:\n    \"\"\"Use GPT-4o-mini to classify query domain, intent, and complexity\"\"\"\n    system_prompt = \"\"\"\n    You are an expert query classifier for a hybrid database system with:\n    1. SQL Database: Employee management (employees, departments, projects, attendance)\n    2. 
NoSQL Database: Sample Supplies (sales, customers, items, stores)\n\n    Classify the query into:\n    - DOMAIN: employee, supplies, hybrid, unclear, technical\n    - INTENT: select, analyze, compare, aggregate, clarify, explain\n    - COMPLEXITY: simple, medium, complex\n    \"\"\"\n    # LLM-based classification logic\n```\n\n## 🌐 Web Interface\n\n### RESTful API Endpoints\n\nThe system now includes a complete web interface with the following endpoints:\n\n```bash\n# Health check\nGET /health\n\n# Database statistics\nGET /api/database/stats\n\n# Natural language query processing\nPOST /api/query\n{\n    \"query\": \"Show me employee salaries and supplies data\"\n}\n\n# Direct SQL query execution\nPOST /api/database/query\n{\n    \"sql\": \"SELECT COUNT(*) FROM employees WHERE department = 'Engineering'\"\n}\n```\n\n### Example API Usage\n\n```bash\n# Health check\ncurl https://your-app-name.onrender.com/health\n\n# Natural language query\ncurl -X POST https://your-app-name.onrender.com/api/query \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"query\": \"How many employees are in Engineering?\"}'\n\n# Database statistics\ncurl https://your-app-name.onrender.com/api/database/stats\n```\n\n## 🚀 Getting Started\n\n### Prerequisites\n\n- Python 3.8+\n- MongoDB (for NoSQL operations)\n- SQLite/PostgreSQL (for SQL operations)\n- OpenAI API Key\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone https://github.com/yourusername/LocoForge.git\ncd LocoForge\n\n# Create virtual environment\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n\n# Install dependencies\npip install -r requirements.txt\n\n# Set up environment variables\ncp env_template.txt .env\n# Edit .env with your OpenAI API key and database configurations\n```\n\n### Environment Configuration\n\n```bash\n# .env 
file\nOPENAI_API_KEY=your_openai_api_key_here\nMONGO_DB=mongodb://localhost:27017/\nSQL_DB=sqlite:///employee_management.db\nENVIRONMENT=development\n```\n\n### Quick Start\n\n#### Using the Web Interface\n\n```bash\n# Start the Flask application\npython app.py\n\n# Access the API at http://localhost:5000\n```\n\n#### Using the LangGraph Workflow\n\n```python\nfrom langchain_core.messages import HumanMessage\n\nfrom my_agent.agent import graph\nfrom my_agent.utils.state import OrchestratorState\n\n# Initialize the workflow\nworkflow = graph\n\n# Create a query\nstate = OrchestratorState(\n    messages=[HumanMessage(content=\"Show me employee salaries and supplies data\")],\n    current_query=\"Show me employee salaries and supplies data\"\n)\n\n# Execute the workflow\nresult = workflow.invoke(state)\nprint(result[\"combined_results\"])\n```\n\n## Database Schemas\n\n### SQL Database (Employee Management)\n\n```sql\n-- Employees table\nCREATE TABLE employees (\n    id INTEGER PRIMARY KEY,\n    name TEXT NOT NULL,\n    department TEXT,\n    salary REAL,\n    hire_date DATE,\n    manager_id INTEGER\n);\n\n-- Departments table\nCREATE TABLE departments (\n    id INTEGER PRIMARY KEY,\n    name TEXT NOT NULL,\n    budget REAL\n);\n\n-- Projects table\nCREATE TABLE projects (\n    id INTEGER PRIMARY KEY,\n    name TEXT NOT NULL,\n    department_id INTEGER,\n    start_date DATE,\n    end_date DATE\n);\n```\n\n### NoSQL Database (Supplies Database)\n\n```javascript\n// Sales collection\n{\n  \"_id\": ObjectId,\n  \"couponUsed\": true,\n  \"customer\": {\n    \"age\": 30,\n    \"email\": \"customer@example.com\",\n    \"gender\": \"F\",\n    \"satisfaction\": 5\n  },\n  \"items\": [\n    {\n      \"name\": \"Office Supplies\",\n      \"price\": 25.99,\n      \"quantity\": 2,\n      \"tags\": [\"office\", \"stationery\"]\n    }\n  ],\n  \"purchaseMethod\": \"Online\",\n  \"saleDate\": ISODate(\"2023-01-15\"),\n  \"storeLocation\": \"Denver\"\n}\n\n// Comments collection\n{\n  \"_id\": ObjectId,\n  \"movie_id\": ObjectId,\n  
\"name\": \"User Name\",\n  \"email\": \"user@example.com\",\n  \"text\": \"Great movie!\",\n  \"date\": ISODate(\"2024-01-15\")\n}\n```\n\n## 🧪 Testing\n\n### Run Test Suite\n\n```bash\n# Test the orchestrator workflow\npython test_orchestrator.py\n\n# Test individual agents\npython test_sql_agent.py\npython test_nosql_agent.py\n\n# Test cross-database queries\npython test_cross_database_queries.py\n\n# Test LangGraph Studio integration\npython test_langgraph_studio.py\n\n# Test edge cases and error handling\npython test_edge_cases.py\n\n# Test deployment configuration\npython test_deployment.py\n```\n\n### Example Queries\n\n```python\n# Employee queries (SQL)\n\"Show me all employees in the Engineering department\"\n\"What's the average salary by department?\"\n\"Find employees hired in the last 6 months\"\n\n# Supplies queries (NoSQL)\n\"Show me all sales with high customer satisfaction\"\n\"What are the top-selling items this year?\"\n\"List sales by store location\"\n\"Show total sales by customer age group\"\n\n# Hybrid queries (combining both databases)\n\"Show which employees bought office supplies\"\n\"Compare employee salaries with sales data\"\n\n# Unclear queries (Data Engineer)\n\"Show me everything\"\n\"What's the weather like?\"\n\"SELECT * FROM employees\"\n```\n\n## 🚀 Deployment\n\n### Render Deployment\n\nThe system includes complete Render deployment configuration:\n\n```bash\n# Deploy to Render using render.yaml\n# The system automatically configures:\n# - Flask web application\n# - Gunicorn production server\n# - Environment variable management\n# - Health checks and monitoring\n```\n\n### Docker Deployment\n\n```bash\n# Build and run with Docker\ndocker build -t locoforge .\ndocker run -p 5000:5000 locoforge\n```\n\n### Environment Variables for Production\n\n```bash\nOPENAI_API_KEY=your_openai_api_key\nGOOGLE_API_KEY=your_google_api_key\nDATABASE_URL=sqlite:///employee_management.db\nENVIRONMENT=production\n```\n\n## 🔧 Advanced 
Features\n\n### LangGraph Studio Integration\n\nThe system includes full LangGraph Studio support for real-time workflow visualization:\n\n```bash\n# Start LangGraph Studio\nlanggraph studio\n\n# Access the interface at http://localhost:8123\n```\n\n### Enhanced Error Handling\n\nThe system includes comprehensive error handling:\n\n- **Agent Initialization Failures**: Graceful degradation when agents are unavailable\n- **Query Execution Errors**: Detailed error reporting and recovery\n- **Network Connectivity**: Retry mechanisms for database connections\n- **State Recovery**: Persistent state management across sessions\n- **Data Engineer Fallback**: Professional responses for all query types\n\n### Custom Agent Development\n\nExtend the system with custom agents:\n\n```python\nclass CustomAgent:\n    def __init__(self):\n        self.model = ChatOpenAI(model=\"gpt-4o-mini\")\n\n    def execute_query(self, query: str) -\u003e Dict[str, Any]:\n        # Custom query execution logic\n        pass\n```\n\n## 📈 Performance \u0026 Scalability\n\n### Optimization Strategies\n\n- **Lazy Loading**: Agents initialized only when needed\n- **Connection Pooling**: Efficient database connection management\n- **Caching**: Query result caching for repeated requests\n- **Async Processing**: Non-blocking query execution where possible\n- **State Management**: Efficient state updates and context tracking\n\n### Monitoring \u0026 Logging\n\n```python\nimport logging\n\n# Comprehensive logging throughout the workflow\nlogger = logging.getLogger(__name__)\nlogger.info(\"🔄 Initializing orchestrator...\")\nlogger.info(\"✅ SQL agent initialized successfully\")\nlogger.warning(\"⚠️ NoSQL agent not available\")\nlogger.info(\"👨‍💼 Data Engineer Agent ready for unclear queries\")\n```\n\n## 📚 Documentation\n\n### Additional Guides\n\n- **[DEPLOYMENT_GUIDE.md](DEPLOYMENT_GUIDE.md)** - Complete deployment instructions\n- **[EDGE_CASE_HANDLING_GUIDE.md](EDGE_CASE_HANDLING_GUIDE.md)** - Data 
Engineer Agent details\n- **[SQL_AGENT_README.md](SQL_AGENT_README.md)** - SQL agent documentation\n- **[NOSQL_AGENT_README.md](NOSQL_AGENT_README.md)** - NoSQL agent documentation\n- **[MONGOENGINE_MIGRATION_README.md](MONGOENGINE_MIGRATION_README.md)** - Database migration guide\n\n## 🤝 Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. Open a Pull Request\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- **Our Team** [Yash Varshney](https://github.com/yswa-var/), [dhruv jain](https://github.com/DhruvJ2), [Aditya Mandal](https://github.com/Aditya-1304)\n- **Langgraph Team**\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyswa-var%2Flocoforge","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyswa-var%2Flocoforge","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyswa-var%2Flocoforge/lists"}