{"id":15203932,"url":"https://github.com/fx2y/datanarrate","last_synced_at":"2026-01-31T10:31:39.127Z","repository":{"id":252664855,"uuid":"841044145","full_name":"fx2y/DataNarrate","owner":"fx2y","description":"[WIP] LLM-powered agent for adaptive data analysis across multiple sources. Uses natural language for complex queries, visualizations, and insights. Features autonomous planning, SQL/Elasticsearch generation, and AI storytelling. Built with LangChain, GPT-4, FastAPI, and React.","archived":false,"fork":false,"pushed_at":"2024-09-13T02:59:31.000Z","size":1858,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-16T06:16:47.636Z","etag":null,"topics":["ai","data-analysis","data-visualization","elasticsearch","fastapi","gpt-4","langchain","machine-learning","nlp","react","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fx2y.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-11T13:27:53.000Z","updated_at":"2025-01-07T06:13:37.000Z","dependencies_parsed_at":"2024-09-24T06:00:36.881Z","dependency_job_id":null,"html_url":"https://github.com/fx2y/DataNarrate","commit_stats":{"total_commits":35,"total_committers":1,"mean_commits":35.0,"dds":0.0,"last_synced_commit":"cb7d442076f9d36562475bf3e193471889c18c2f"},"previous_names":["fx2y/datanarrate"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fx2y%2FDataNarrate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fx2y%2FDataNarrate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fx2y%2FDataNarrate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fx2y%2FDataNarrate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fx2y","download_url":"https://codeload.github.com/fx2y/DataNarrate/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242078478,"owners_count":20068557,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data-analysis","data-visualization","elasticsearch","fastapi","gpt-4","langchain","machine-learning","nlp","react","sql"],"created_at":"2024-09-28T05:04:18.981Z","updated_at":"2026-01-31T10:31:39.095Z","avatar_url":"https://github.com/fx2y.png","language":"Python","readme":"# LLM-Powered Adaptive Data Analysis Agent\n\n## Overview\n\nThis project implements an advanced, LLM-powered agent capable of performing complex data analysis tasks across multiple\ndata sources. Using natural language interactions, the agent can query databases, generate visualizations, and create\ndata-driven narratives, adapting its approach based on user intent and available data.\n\n## Key Features\n\n- LLM-driven reasoning for task planning and execution\n- Dynamic generation of SQL and Elasticsearch queries\n- Autonomous data visualization selection and creation\n- AI-powered data storytelling and insight generation\n- Multi-step, self-correcting workflow with explicit reasoning\n- Seamless switching between data sources based on query context\n- Interactive refinement of queries and outputs\n\n## Technical Stack\n\n- LangChain \u0026 LangGraph: Agent orchestration and reasoning\n- OpenAI GPT-4: Core language model for decision-making and content generation\n- FastAPI: Backend API for agent interactions\n- MySQL \u0026 Elasticsearch: Supported data sources\n- Langfuse: Agent tracing and performance monitoring\n- React \u0026 D3.js/Plotly: Frontend for user interaction and data visualization\n\n## Agent Architecture\n\n1. Intent Classifier: Determines the high-level goal of the user's request\n2. Query Analyzer: Distinguishes between data retrieval, visualization, and storytelling tasks\n3. Task Planner: Breaks down the goal into a series of actionable steps\n4. Context Manager: Maintains and updates the agent's understanding of the current state\n5. Schema Retriever: Fetches database schemas and Elasticsearch mappings\n6. Query Generator: Creates SQL and Elasticsearch queries based on user intent\n7. Query Validator: Ensures generated queries are valid and safe to execute\n8. Tool Selector: Chooses appropriate tools (e.g., SQL query, visualization) for each step\n9. Execution Engine: Runs selected tools and processes their outputs\n10. Reasoning Engine: Evaluates results, makes decisions, and plans next steps\n11. Output Generator: Formulates human-readable responses and visualizations\n12. Visualization Generator: Creates appropriate data visualizations\n13. Storyline Creator: Generates narrative insights from data analysis\n\n## Example Interaction\n\nUser: \"Analyze our Q2 sales performance and visualize the top-performing products.\"\n\nAgent:\n\n1. Classifies intent as a multi-step analysis task\n2. Analyzes query to determine data retrieval and visualization needs\n3. Plans steps: retrieve Q2 sales data, identify top products, generate visualization\n4. Retrieves relevant database schema\n5. Generates and validates SQL query to fetch Q2 sales data\n6. Executes query and processes results to identify top-performing products\n7. Selects and generates appropriate visualization (e.g., bar chart)\n8. Creates a data-driven narrative summarizing key insights\n9. Presents visualization, summary, and storyline to user\n\n## Setup and Usage\n\n[Setup instructions here]\n\n## Extending the Agent\n\nTo add new capabilities:\n\n1. Implement a new Tool class (e.g., NewDataSourceTool, AdvancedVisualizationTool)\n2. Update the Tool Selector to consider the new tool\n3. Enhance the Task Planner and Query Analyzer to incorporate the new capability\n4. Add relevant prompts and few-shot examples for the LLM\n5. Extend the Schema Retriever and Query Generator if adding a new data source\n6. Update the Visualization Generator for new chart types or data representations\n7. Enhance the Storyline Creator to incorporate new types of insights\n\n## Contributing\n\nWe welcome contributions! See our [Contributing Guide](CONTRIBUTING.md) for details.\n\n## License\n\nThis project is licensed under the Apache-2.0 License - see the [LICENSE](LICENSE) file for details.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffx2y%2Fdatanarrate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffx2y%2Fdatanarrate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffx2y%2Fdatanarrate/lists"}