{"id":38259004,"url":"https://github.com/aymenfurter/smartrag","last_synced_at":"2026-01-17T01:34:05.864Z","repository":{"id":245751874,"uuid":"816435366","full_name":"aymenfurter/smartrag","owner":"aymenfurter","description":"Deep Research through Multi-Agents, using GraphRAG ","archived":false,"fork":false,"pushed_at":"2025-08-21T14:31:12.000Z","size":2936,"stargazers_count":78,"open_issues_count":13,"forks_count":10,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-21T16:59:21.577Z","etag":null,"topics":["autogen","azure","deep-research","gpt-4o","graphrag","llm","multi-agent-systems","multimodal","openai","voice-mode"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aymenfurter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-17T18:42:24.000Z","updated_at":"2025-08-18T01:39:26.000Z","dependencies_parsed_at":"2024-06-29T10:25:33.470Z","dependency_job_id":"2fac4bb3-95a9-4622-9e48-d1b000b0872d","html_url":"https://github.com/aymenfurter/smartrag","commit_stats":null,"previous_names":["aymenfurter/smartrag"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/aymenfurter/smartrag","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aymenfurter%2Fsmartrag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aymenfurter%2Fsmartrag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aymenfurter%2Fsmartrag/releases","manifests_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aymenfurter%2Fsmartrag/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aymenfurter","download_url":"https://codeload.github.com/aymenfurter/smartrag/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aymenfurter%2Fsmartrag/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28491603,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T00:50:05.742Z","status":"ssl_error","status_checked_at":"2026-01-17T00:43:11.982Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autogen","azure","deep-research","gpt-4o","graphrag","llm","multi-agent-systems","multimodal","openai","voice-mode"],"created_at":"2026-01-17T01:34:05.786Z","updated_at":"2026-01-17T01:34:05.843Z","avatar_url":"https://github.com/aymenfurter.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv id=\"top\"\u003e\u003c/div\u003e\n\n\u003cbr /\u003e\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"assets/multi-agent.png\"\u003e\n\n  \u003ch1 align=\"center\"\u003eSmartRAG: Elevating RAG with Multi-Agent Systems\u003c/h1\u003e\n  \u003cp align=\"center\"\u003e\n    🐞 \u003ca 
href=\"https://github.com/aymenfurter/smartrag/issues\"\u003eReport Bug\u003c/a\u003e\n    ·\n    💡 \u003ca href=\"https://github.com/aymenfurter/smartrag/issues\"\u003eRequest Feature\u003c/a\u003e\n  \u003c/p\u003e\n  \u003cbr/\u003e \n  \u003cp\u003e\n  \u003ca href=\"https://github.com/aymenfurter/smartrag/actions/workflows/python-tests.yml\"\u003e\n    \u003cimg height=\"30\" src=\"https://github.com/aymenfurter/smartrag/actions/workflows/python-tests.yml/badge.svg\" alt=\"Python Tests\"\u003e\u003c/a\u003e\u0026nbsp; \u003ca href=\"https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Faymenfurter%2Fsmartrag%2Fmain%2Finfrastructure%2Fdeployment.json\"\u003e\n  \u003cimg height=\"30\" src=\"https://aka.ms/deploytoazurebutton\" alt=\"Deploy to Azure\"\u003e\n\u003c/a\u003e\n  \u003c/p\u003e\n\n\u003c/div\u003e\n\n## In the Wake of the Generative AI Revolution\n\nWe've seen a surge in GenAI-powered apps. While these apps promise a completely new way to interact with computers, they often don't meet user expectations. SmartRAG is a demonstration app that showcases various concepts to improve Retrieval-Augmented Generation (RAG) applications.\n\n## Main Features\n\n1. **Multiple Query Approaches**: Explore different data ingestion and querying methods, from simple and fast Azure OYD to the advanced GraphRAG approach.\n\n2. **Voice Mode**: Utilize Azure OpenAI's Text-to-Speech (TTS) and Whisper for Speech-to-Text, enabling natural voice conversations.\n\n3. **Advanced Querying**: Use Langchain summarizer or GraphRAG for complex queries in the \"Ask\" section.\n\n4. **Advanced Indexing**: Enhance retrieval accuracy through multi-modal indexing techniques.\n\n5. 
**Multi-Agent Research**:\n   - **Multi-Agent**: Combine different indexes with agents (critic and researcher) working together for extended durations to find answers.\n   - **AutoGen Integration**: Utilize Microsoft's AutoGen framework to create an ensemble of AI agents for collaborative research on complex topics.\n   - **Time-Bounded Research**: Specify research duration to balance depth of analysis with response time.\n   - **Citation and Verification**: Responses include citations for accuracy verification.\n\n# Deploying SmartRAG\n\nSmartRAG can be easily deployed using the Azure Developer CLI (azd):\n\n1. Ensure you have the Azure Developer CLI installed.\n2. Clone the SmartRAG repository.\n3. Navigate to the project directory.\n4. Run the following command:\n\n   ```\n   azd up\n   ```\n   \n5. Some features may not be available until the app has been restarted once.\n\n## Voice Mode Deployment Considerations\n\nSmartRAG includes a Voice Mode feature that uses Azure OpenAI's Text-to-Speech (TTS) and Whisper for Speech-to-Text capabilities. Please note:\n\n- The TTS feature is currently available only in the Sweden Central and North Central US regions.\n- If you want to use Voice Mode, ensure you deploy to one of these regions.\n- If you don't need Voice Mode, you can modify the deployment script to remove this component and deploy the rest of the application in any supported Azure region.\n\nThe deployment process uses Bicep scripts (in the `infra` folder) and ARM templates (in the `infrastructure` folder) to set up the necessary Azure resources. \n\n# Multi-Agent Research for RAG\n\nSmartRAG's experimental \"Multi-Agent Research\" feature uses Microsoft's [AutoGen](https://microsoft.github.io/autogen/) framework to create an ensemble of AI agents that collaborate on complex topics:\n\n\u003cimg src=\"assets/agents.png\" width=\"350\"\u003e\n\n## Key Components\n\n1. 
**Researcher Agent**: Created for each data source, allowing independent research across various indexes.\n2. **Reviewer Agent**: Oversees the process, guiding research and synthesizing findings.\n3. **Collaborative Querying**: Agents ask follow-up questions, reframe queries, and synthesize information from multiple sources.\n\nHere's a snippet of how the reviewer agent works:\n\n```python\ndef create_reviewer_agent(llm_config: Dict[str, Any], list_of_researchers: str, single_data_source: bool = False) -\u003e AssistantAgent:\n    system_message = (\n        \"I am Reviewer. I review the research and drive conclusions. \"\n        \"Once I am done, I will ask you to terminate the conversation.\\n\\n\"\n        \"My job is to ask questions and guide the research to find the information I need. I always ask 10 questions at a time to get the information I need. \"\n        \"and combine it into a final conclusion.\\n\\n\"\n        \"I will make sure to ask follow-up questions to get the full picture.\\n\\n\"\n        \"Only once I have all the information I need, I will ask you to terminate the conversation.\\n\\n\"\n        \"I will keep an eye on the referenced documents, if it looks like not the right documents were referenced, ask the researcher to reframe the question to find additional data sources.\\n\\n\"\n        \"I will use follow-up questions in case you the answer is incomplete (for instance if one data source is missing data).\\n\\n\"\n        \"My researcher is: \" + list_of_researchers + \"\\n\\n\"\n        \"To terminate the conversation, I will write ONLY the string: TERMINATE\"\n    )\n\n    return AssistantAgent(\n        name=\"Reviewer\",\n        llm_config=llm_config,\n        is_termination_msg=lambda msg: \"TERMINATE\" in msg[\"content\"].upper(),\n        system_message=system_message,\n    )\n```\n\n# GraphRAG: Advanced Querying\n\nSmartRAG implements GraphRAG, a powerful approach for complex querying across multiple data sources. 
This feature allows for more nuanced and comprehensive answers by leveraging graph-based representations of knowledge.\n\n## Key Features of GraphRAG\n\n1. **Global Search**: Perform searches across multiple interconnected data sources.\n2. **Community Context**: Utilize community structures within the data for more relevant results.\n3. **Token-based Processing**: Efficiently manage and process large amounts of text data.\n\nHere's a glimpse of how GraphRAG is implemented:\n\n```python\nasync def global_query(self, query: str):\n    # ... [setup code omitted]\n\n    global_search = GlobalSearch(\n        llm=llm,\n        context_builder=context_builder,\n        token_encoder=token_encoder,\n        max_data_tokens=3000,\n        map_llm_params={\"max_tokens\": 500, \"temperature\": 0.0},\n        reduce_llm_params={\"max_tokens\": 500, \"temperature\": 0.0},\n        context_builder_params={\n            \"use_community_summary\": False,\n            \"shuffle_data\": True,\n            \"include_community_rank\": True,\n            \"min_community_rank\": 0,\n            \"max_tokens\": 3000,\n            \"context_name\": \"Reports\",\n        },\n    )\n\n    result = await global_search.asearch(query=query)\n    # ... [result processing omitted]\n```\n\n# Voice Mode: Natural Conversation Interface\n\nSmartRAG's Voice Mode creates a seamless, conversational interface using Azure OpenAI's Text-to-Speech and Whisper for Speech-to-Text capabilities.\n\n## Key Features\n\n1. **Text-to-Speech**: Convert AI responses to natural-sounding speech.\n2. **Speech-to-Text**: Automatically transcribe user voice input for processing.\n3. **Continuous Listening**: Enable hands-free, natural conversation flow.\n\n# The Foundation: Quality Data and Mature Frameworks\n\u003cimg src=\"assets/screenshot.png?raw=true\"\u003e\n\n## Indexing Quality Improvements\n\n1. **Document Intelligence**: Convert unstructured files to structured Markdown format.\n2. 
**Multimodal Post-processing**: Additional processing for documents with images or graphs.\n3. **Table Enhancement**: Implement strategies for better handling of table content.\n4. **Page-Level Splitting**: Split documents at page-level during preprocessing for easier citation verification.\n\nHere's an example of how document intelligence is implemented:\n\n```python\ndef convert_pdf_page_to_md(pdf_path: str, page_num: int, output_dir: str, prefix: str, refine_markdown: bool = False) -\u003e str:\n    # ... [initialization code omitted for brevity]\n    \n    # Use Azure's Document Intelligence to convert PDF to Markdown\n    with open(pdf_path, \"rb\") as file:\n        poller = document_intelligence_client.begin_analyze_document(\n            \"prebuilt-layout\", \n            analyze_request=file, \n            output_content_format=ContentFormat.MARKDOWN, \n            content_type=\"application/pdf\"\n        )\n    \n    result = poller.result()\n    markdown_content = result.content\n    \n    # Optional: Refine the Markdown content with additional processing\n    if refine_markdown:\n        png_path = os.path.join(output_dir, f\"{prefix}___Page{page_num+1}.png\")\n        markdown_content = refine_figures(result, png_path)\n        markdown_content = enhance_markdown(markdown_content)\n    \n    # ... [output writing code omitted for brevity]\n```\n\n### Multimodal Post-processing of Images and Graphs\n\nFor documents containing images or graphs, we perform additional postprocessing to improve the generated markdown. 
We use GPT-4o to generate image captions and inject this information back into the Markdown, allowing users to query not just the text but also the visual content of documents.\n\nHere's an example of how this is implemented:\n\n```python\ndef refine_figures(content, png_path: str) -\u003e str:\n    def process_image(polygon: List[float], pdf_width: float, pdf_height: float, img_width: int, img_height: int) -\u003e str:\n        with Image.open(png_path) as img:\n            # Scale the polygon coordinates to match the PNG dimensions\n            scaled_polygon = [\n                coord * width_scale if i % 2 == 0 else coord * height_scale\n                for i, coord in enumerate(polygon)\n            ]\n            \n            # Crop the image based on the scaled polygon\n            bbox = [\n                min(scaled_polygon[::2]),\n                min(scaled_polygon[1::2]),\n                max(scaled_polygon[::2]),\n                max(scaled_polygon[1::2])\n            ]\n            \n            px_bbox = [int(b) for b in bbox]\n            cropped = img.crop(px_bbox)\n            return get_caption(cropped)  # Generate caption for the cropped image\n\n    # Process each figure in the content\n    for i, figure in enumerate(content.figures):\n        polygon = figure.bounding_regions[0].polygon\n        caption = process_image(polygon, pdf_width, pdf_height, img_width, img_height)\n        \n        # Replace the original figure reference with the new caption\n        figure_pattern = f\"!\\\\[\\\\]\\\\(figures/{i}\\\\)\"\n        replacement = f\"![{caption}](figures/{i})\"\n        \n        updated_content = re.sub(figure_pattern, replacement, updated_content)\n    \n    return updated_content\n```\n\n### Table Enhancement\n\nTables often pose challenges for LLMs. 
SmartRAG implements strategies such as creating table summaries, generating Q\u0026A pairs about the table content, and optionally creating textual representations of each row.\n\nThe process works similarly to generating image captions.\n\nLet's look at the same Wikipedia page. Without any postprocessing, the extracted markdown looks like this:\n\n```markdown\nDistribution of seats in the Gemeinderat 2022-2026[40]\n\n| :unselected: | SP | :unselected: FDP | :unselected: GPS | :unselected: GLP | :unselected: SVP | :unselected: AL | :unselected: Mitte | :unselected: EVP |\n| - | - | - | - | - | - | - | - | - |\n| | | | | | | | | |\n| 37 | | 22 | 18 | 17 | 14 | 8 | 6 | 3 |\n```\n\nThis may look fine at first glance, but with such data, RAG often fails to find the relevant text chunk during retrieval.\n\n\u003cimg src=\"assets/without_refinements.png\"\u003e\n\nWe can fix that by summarizing the content of the table and adding a set of Q\u0026A.\n\n```markdown\n| :unselected: | SP | :unselected: FDP | :unselected: GPS | :unselected: GLP | :unselected: SVP | :unselected: AL | :unselected: Mitte | :unselected: EVP |\n| - | - | - | - | - | - | - | - | - |\n| | | | | | | | | |\n| 37 | | 22 | 18 | 17 | 14 | 8 | 6 | 3 |\n\n\n\n\u003c!-- Table Summary: This table appears to represent a distribution [...] The most important data points are 37, 22, 18, 17, 14, 8, 6, and 3, which are presumably associated with SP, FDP, GPS, GLP, SVP, AL, Mitte, and EVP, respectively. [...] 
--\u003e\n\n\n\u003c!-- Q\u0026A Pairs:\nSure, here are 5 question-answer pairs based on the provided table:\n\nQ1: Which party has the highest count in the table?\nA1: The SP party has the highest count at 37.\n\nQ2: What is the count associated with the FDP party?\nA2: The count associated with the FDP party is 22.\n\nQ3: Which party has the smallest allocation according to the table?\nA3: The EVP party has the smallest allocation with a count of 3.\n\n[...]\n--\u003e\n```\n\nThis can help to both synthesize better answers for related questions and find the relevant chunks.\n\n\u003cimg src=\"assets/with_refinements.png\"\u003e\n\nHere's what the implementation looks like (from [table_postprocessor.py](https://github.com/aymenfurter/smartrag/blob/main/app/table_postprocessor.py)):\n\n```python\ndef enhance_table(table_content: str) -\u003e str:\n    enhanced_content = table_content\n    \n    if ENABLE_TABLE_SUMMARY:\n        # Generate a concise summary of the table's content\n        enhanced_content += generate_table_summary(table_content)\n    \n    if ENABLE_ROW_DESCRIPTIONS:\n        # Create natural language descriptions for each row\n        enhanced_content = generate_row_descriptions(enhanced_content)\n    \n    if ENABLE_QA_PAIRS:\n        # Generate potential questions and answers based on the table data\n        enhanced_content += generate_qa_pairs(enhanced_content)\n    \n    return enhanced_content\n\ndef generate_table_summary(table_content: str) -\u003e str:\n    # Use LLM to generate a summary of the table\n    prompt = f\"Summarize the key information in this table:\\n\\n{table_content}\"\n    summary = llm(prompt)\n    return f\"\\n\\n\u003c!-- Table Summary: {summary} --\u003e\\n\"\n\ndef generate_qa_pairs(table_content: str) -\u003e str:\n    # Generate Q\u0026A pairs to enhance understanding of the table\n    prompt = f\"Generate 3-5 question-answer pairs based on this table:\\n\\n{table_content}\"\n    qa_pairs = llm(prompt)\n    return 
f\"\\n\\n\u003c!-- Q\u0026A Pairs:\\n{qa_pairs}\\n--\u003e\\n\"\n```\n\n# Cloud Architecture\n\nSmartRAG utilizes several key Azure services:\n\n1. Azure OpenAI Service\n2. Ingestion Jobs (Preview)\n3. Document Intelligence (Preview API Version)\n4. Azure AI Search\n5. GPT-4 Vision (GPT-4o)\n6. AutoGen\n\n# Usage\n\n## Basic RAG Query\n\nTo perform a basic RAG query:\n\n1. Upload your documents through the web interface.\n2. Navigate to the \"Chat\" section.\n3. Enter your query in the text box and press send.\n4. The system will retrieve relevant information and generate a response.\n\n## Multi-Agent Research\n\nTo initiate a multi-agent research session:\n\n1. Go to the \"Research\" section.\n2. Enter your complex query or research topic.\n3. Select the data sources you want the agents to use.\n4. Set the maximum research duration.\n5. Click \"Start Research\" to begin the multi-agent process.\n\n## Voice Mode\n\nTo use Voice Mode:\n\n1. Ensure your device has a microphone and speakers.\n2. Navigate to the \"Voice Chat\" section.\n3. Click the microphone icon to start listening.\n4. Speak your query clearly.\n5. The system will process your speech, generate a response, and read it back to you.\n\n# References\n\nSmartRAG builds upon and integrates the following key projects and services:\n- [AutoGen](https://microsoft.github.io/autogen/): A framework for building multi-agent systems, developed by Microsoft. SmartRAG utilizes AutoGen for its multi-agent research capabilities.\n- [GraphRAG](https://github.com/microsoft/graphrag): A graph-based approach to retrieval-augmented generation, created by Microsoft Research. SmartRAG incorporates GraphRAG for advanced querying. \n- [LangChain](https://github.com/hwchase17/langchain): An open-source library for building applications with large language models. 
SmartRAG uses LangChain for various language model interactions and chain-of-thought processes.\n- [Azure AI Document Intelligence](https://azure.microsoft.com/en-us/products/ai-services/ai-document-intelligence): A cloud-based Azure AI service that uses optical character recognition (OCR) and document understanding AI models. SmartRAG leverages this service for extracting text, structure, and insights from documents.\n- [Azure AI Search](https://azure.microsoft.com/en-us/products/ai-services/ai-search): An AI-powered cloud search service for full-text search, semantic search, and vector search. SmartRAG uses Azure AI Search for efficient information retrieval and indexing.\n- [Azure Container Apps](https://azure.microsoft.com/en-us/products/container-apps): A fully managed serverless container service for building and deploying modern apps at scale. SmartRAG uses Azure Container Apps for deploying and managing its containerized components.\n- [React](https://react.dev/): A JavaScript library for building user interfaces. SmartRAG's frontend is built using React.\n- [Python](https://www.python.org/): The primary programming language used for developing SmartRAG's backend logic and AI integration.\n- [Flask](https://flask.palletsprojects.com/): A lightweight WSGI web application framework in Python. SmartRAG uses Flask for its backend API.\n\n\u003csup\u003e*\u003c/sup\u003e **Note on SmartRAG's Purpose**: SmartRAG is designed as a demonstration and comparison tool for various Retrieval-Augmented Generation (RAG) approaches. It is important to note that SmartRAG is not built for scale and is not intended for production use. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faymenfurter%2Fsmartrag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faymenfurter%2Fsmartrag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faymenfurter%2Fsmartrag/lists"}