{"id":28284100,"url":"https://github.com/ako1983/modular-llm-architecture","last_synced_at":"2025-10-11T04:24:17.948Z","repository":{"id":273433969,"uuid":"919713580","full_name":"ako1983/modular-llm-architecture","owner":"ako1983","description":"Exploring the cognitive architecture and design of a multi-agent system leveraging Large Language Models (LLMs) for scalable workflows.","archived":false,"fork":false,"pushed_at":"2025-03-19T22:19:38.000Z","size":126,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-17T16:44:26.752Z","etag":null,"topics":["artificial-intelligence","large-language-models","llm-agent","multiagent","text-to-sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ako1983.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-01-20T21:45:14.000Z","updated_at":"2025-06-15T02:24:48.000Z","dependencies_parsed_at":"2025-01-20T22:50:36.099Z","dependency_job_id":"f1a75cbb-a6c3-4cbb-9aad-a1a61e474154","html_url":"https://github.com/ako1983/modular-llm-architecture","commit_stats":null,"previous_names":["ako1983/modular-llm-architecture"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ako1983/modular-llm-architecture","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ako1983%2Fmodular-llm-architecture","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ako1983%2Fmodular-llm-architecture/tags","re
leases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ako1983%2Fmodular-llm-architecture/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ako1983%2Fmodular-llm-architecture/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ako1983","download_url":"https://codeload.github.com/ako1983/modular-llm-architecture/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ako1983%2Fmodular-llm-architecture/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279006248,"owners_count":26084060,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-11T02:00:06.511Z","response_time":55,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","large-language-models","llm-agent","multiagent","text-to-sql"],"created_at":"2025-05-21T17:13:04.655Z","updated_at":"2025-10-11T04:24:17.936Z","avatar_url":"https://github.com/ako1983.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Modular-LLM-Architecture\n**Exploring the cognitive architecture and design of a multi-agent system leveraging Large Language Models (LLMs) for scalable workflows.**\n\n\n\n## Architecture Diagram\n\n![Modular LLM Architecture Diagram](assets/multi_agent_workflow_diagram.png)\n\nThe diagram illustrates the key 
components of the system, including the Conversation Manager, Router, LLM Manager, and specialized agents.\n\n---\n\n## Core Features\n- Modular multi-agent design.\n- Dynamic LLM integration with support for multiple models.\n- Separation of concerns for improved maintainability.\n\n---\n### Introduction\n\nIn the era of Large Language Models (LLMs), designing systems that handle diverse tasks with scalability and flexibility is a growing challenge. The cognitive architecture presented here leverages modular design principles to build a robust multi-agent system, featuring clear separation of responsibilities and dynamic LLM integration. This blog explores the architectural details, using the attached diagram as a guide.\n\n---\n\n### Core Components of the Architecture\n\n1. **Conversation Manager (Brain and Orchestrator):**\n   - The central \"brain\" of the system. It processes user inputs and orchestrates tasks by delegating them to specialized agents through the **Router**.\n   - Maintains state for multi-step conversations, ensuring continuity and coherence.\n\n2. **Router (Task Assignment):**\n   - A lightweight component that assigns tasks to the appropriate agent based on their expertise.\n   - Decouples task routing from task execution for better maintainability.\n\n3. **LLMManager (Flexible LLM Integration):**\n   - Centralizes interaction with LLMs, enabling seamless integration of multiple models.\n   - Supports dynamic model selection and configuration changes, such as adjusting temperature or choosing specific LLMs for tasks.\n   - Ideas for caching and small-model fallback mechanisms have been proposed but are not yet implemented.\n\n\n### Specialized Agents\n\nEach agent is designed with a specific role, ensuring a clear separation of concerns:\n\n- **SQL Agent (Natural Language to SQL):** Converts user prompts into SQL queries, currently interacting with the **Database Schema via prompt injection** to fetch metadata. 
In production, this will transition to a graph-based Retrieval-Augmented Generation (RAG) system for enhanced flexibility and scalability.\n\n- **SQL Debugging Agent:** Debugs and corrects SQL queries, ensuring accurate execution. It refines queries up to three times, addressing issues like spelling errors or incomplete logic, to ensure meaningful outputs.\n\n- **Analyser Agent (Data Analysis):** Analyzes structured or unstructured data to provide insights. Future enhancements include integrating LLM results with business data from platforms like Slack, Microsoft Teams, and OneDrive.\n\n\n- **Charting Agent:** Writes Vega specifications for data visualization.\n- **Vega Debugging Agent:** Corrects and validates Vega JSON specifications for accurate visual rendering.\n- **Knowledge Agent:** Retrieves answers from document repositories, relying on the **Document RAG** for context. It addresses business-specific queries like defining terms such as 'churn' at Peacock, ensuring concise and clear summaries.\n- **Follow-Up Agent:** Engages with users to provide iterative feedback and refine responses. It also suggests 2-3 similar analyses to help users explore related questions.\n- **Clarification Agent:** Ensures the highest level of accuracy by detecting ambiguity and prompting for clarification. For instance, it can inquire, \"What do you consider a top show? Should it be based on total hours viewed, other metrics, or specific time periods like a calendar month?\"\n- **Note Agent:** Streamlines analyses by capturing preferences like focusing on Android devices in New York state. This eliminates repetitive specifications, ensuring appropriate SQL queries and tailored visualizations.\n\n- **Summary Agent (Result Compilation):** Summarizes outputs from various agents, compiling user-friendly reports. It always runs at the end, providing concise summaries to ensure clarity.\n\n\n---\n\n### Key Features of the Design\n\n1. 
**Modularity and Maintainability:**\n   - Each component focuses on a single responsibility, improving maintainability and scalability.\n   - The **Router** and **Conversation Manager** allow seamless integration of new agents.\n\n2. **Flexible LLM Integration:**\n   - The **LLMManager** supports a plug-and-play approach, enabling dynamic LLM selection.\n   - Designed for integration with vector databases and embedding techniques for advanced retrieval functionalities.\n\n3. **Separation of Concerns:**\n   - Clear division of tasks between routing, orchestration, and execution layers prevents overlap and reduces complexity.\n\n4. **Resilience:**\n   - Built-in mechanisms for retries, error handling, and debugging ensure reliable performance under diverse scenarios.\n\n---\n\n### Workflow in Action\n\n1. **User Input:** The **Conversation Manager** receives a query, such as \"What are the top 5 most viewed shows across all regions?\"\n2. **Routing:** The **Clarification Agent** first checks the query for ambiguity, asking up to three questions to refine the user’s intent. The **Router** then assigns the task to the **SQL Agent**.\n3. **LLM and Database Integration:** The **SQL Agent** interacts with the **LLMManager** to generate a SQL query, retrieves metadata from the **Database Schema**, and sends the query to the **Query Executer**.\n4. **Validation and Debugging:** If the SQL Agent encounters challenges like errors or empty results, the **SQL Debugging Agent** provides assistance, refining the query as needed.\n5. **Visualization:** The **Charting Agent** generates a Vega JSON specification for visualization, enabling insightful and interactive outputs.\n6. **Summarization:** The **Summary Agent** compiles findings into concise reports, limited to three sentences for clarity.\n7. 
**Follow-Up:** The **Follow-Up Agent** recommends 2-3 similar analyses to assist users in exploring related questions.\n\n---\n\n### Real-World Applications\n\n- **Business Intelligence (BI):** Generating dynamic dashboards and actionable insights from natural language queries.\n- **Data Exploration:** Allowing non-technical users to interact with databases intuitively.\n- **Knowledge Management:** Streamlining access to enterprise knowledge repositories.\n\n---\n\n### Challenges and Future Directions\n\n- **Scalability:** Ensuring the system scales effectively with increasing user demands.\n- **Security:** Safeguarding sensitive data by leveraging open-source and small models where feasible to minimize exposure.\n- **Agent Expansion:** Adding more specialized agents for domain-specific tasks.\n\n---\n\n### Conclusion\n\nThis cognitive architecture highlights the strengths of modular design and multi-agent systems in effectively utilizing LLMs for complex workflows. By integrating flexible LLM capabilities and maintaining a clear separation of responsibilities, it establishes a robust framework for building scalable and efficient AI-driven solutions.\n\n---\n\n### Suggestions for Moving Forward\n\n1. **Enhanced Task Queue Features:** Implement advanced parallelism techniques and error recovery using task retry logic for more resilient execution.\n2. **Enhanced Agent Collaboration:** Introduce inter-agent communication protocols to improve collaboration and reduce redundant processing.\n3. **Observation and Evaluation Tools:** Incorporate tools like Arize.ai or LangSmith to monitor and evaluate system performance.\n4. **Dynamic Feedback Mechanisms:** Implement advanced user feedback loops to refine agent outputs in real time.\n5. **Agent Response Evaluation:** Integrate solutions like Galileo or similar platforms to evaluate the performance of agent responses for comprehensive insights.\n6. 
**Memory Enhancements:** Use tools like LangGraph to improve system memory and context recall.\n7. **Scalability Optimization:** Focus on performance for large-scale deployments, leveraging platforms like Predibase for enhanced scalability and adaptability.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fako1983%2Fmodular-llm-architecture","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fako1983%2Fmodular-llm-architecture","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fako1983%2Fmodular-llm-architecture/lists"}