{"id":30868732,"url":"https://github.com/andrewginns/agents-mcp-usage","last_synced_at":"2025-12-25T07:56:47.063Z","repository":{"id":289461941,"uuid":"971329910","full_name":"andrewginns/agents-mcp-usage","owner":"andrewginns","description":"Demonstrate Agentic use of Model Context Protocol (MCP) server tools with several Agent Frameworks","archived":false,"fork":false,"pushed_at":"2025-07-03T09:00:41.000Z","size":6590,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-03T10:19:36.022Z","etag":null,"topics":["adk-python","agents","agents-sdk","evals","evaluation","gemini","langgraph","llm","logfire","mcp","mcp-server","openai","pydantic-ai","streamlit","tool"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andrewginns.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-23T11:06:42.000Z","updated_at":"2025-07-02T09:24:44.000Z","dependencies_parsed_at":"2025-06-16T17:38:39.435Z","dependency_job_id":"6d8e39c3-e18c-4e48-9e8f-ad9c4b215170","html_url":"https://github.com/andrewginns/agents-mcp-usage","commit_stats":null,"previous_names":["andrewginns/agents-mcp-usage"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/andrewginns/agents-mcp-usage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewginns%2Fagents-mcp-usage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewginns%2Fagents-mcp-usage/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewginns%2Fagents-mcp-usage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewginns%2Fagents-mcp-usage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andrewginns","download_url":"https://codeload.github.com/andrewginns/agents-mcp-usage/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andrewginns%2Fagents-mcp-usage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274102838,"owners_count":25222677,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-07T02:00:09.463Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["adk-python","agents","agents-sdk","evals","evaluation","gemini","langgraph","llm","logfire","mcp","mcp-server","openai","pydantic-ai","streamlit","tool"],"created_at":"2025-09-07T22:03:53.718Z","updated_at":"2025-12-25T07:56:47.056Z","avatar_url":"https://github.com/andrewginns.png","language":"Python","funding_links":[],"categories":["AI/ML"],"sub_categories":[],"readme":"# Model Context Protocol (MCP) Agent Frameworks Demo \u0026 Benchmarking Platform\n\nThis repository demonstrates LLM Agents using tools from Model Context Protocol (MCP) servers with several frameworks:\n- Google Agent Development Kit (ADK)\n- 
LangGraph Agents\n- OpenAI Agents\n- Pydantic-AI Agents\n\n## Repository Structure\n\n- [Agent with a single MCP server](agents_mcp_usage/basic_mcp/README.md) - Learning examples and basic patterns\n- [Agent with multiple MCP servers](agents_mcp_usage/multi_mcp/README.md) - Advanced usage with MCP server coordination\n- [Evaluation suite](agents_mcp_usage/evaluations/mermaid_evals/README.md) - Comprehensive benchmarking tools\n  - **Evaluation Dashboard**: Interactive Streamlit UI for model comparison\n  - **Multi-Model Benchmarking**: Parallel/sequential evaluation across multiple LLMs\n  - **Rich Metrics**: Usage analysis, cost comparison, and performance leaderboards\n\nThe repo also includes Python MCP Servers:\n- [`example_server.py`](mcp_servers/example_server.py) based on [MCP Python SDK Quickstart](https://github.com/modelcontextprotocol/python-sdk/blob/b4c7db6a50a5c88bae1db5c1f7fba44d16eebc6e/README.md?plain=1#L104) - Modified to include a datetime tool and run as a server invoked by Agents\n- [`mermaid_validator.py`](mcp_servers/mermaid_validator.py) - Mermaid diagram validation server using mermaid-cli\n\nTracing is done through Pydantic Logfire.\n\n![MCP Concept](docs/images/mcp_concept.png)\n\n# Quickstart\n\n`cp .env.example .env`\n- Add `GEMINI_API_KEY` and/or `OPENAI_API_KEY`\n  - Individual scripts can be adjusted to use models from any provider supported by the specific framework\n    - By default only [basic_mcp_use/oai-agent_mcp.py](agents_mcp_usage/basic_mcp/basic_mcp_use/oai-agent_mcp.py) requires `OPENAI_API_KEY`\n    - All other scripts require `GEMINI_API_KEY` (Free tier key can be created at https://aistudio.google.com/apikey)\n- [Optional] Add `LOGFIRE_TOKEN` to visualise evaluations in Logfire web ui\n\nRun an Agent framework script e.g.:\n- `uv run agents_mcp_usage/basic_mcp/basic_mcp_use/pydantic_mcp.py`\n  - Requires `GEMINI_API_KEY` by default\n\n- `uv run agents_mcp_usage/basic_mcp/basic_mcp_use/oai-agent_mcp.py`\n  - Requires 
`OPENAI_API_KEY` by default\n\n- Launch the ADK web UI for visual interaction with the agents:\n  - `make adk_basic_ui`\n  \nCheck console, Logfire, or the ADK web UI for output\n\n## Project Overview\n\nThis project aims to teach:\n1. How to use MCP with multiple LLM Agent frameworks\n    - Agent using a single MCP server ([basic_mcp](#basic-mcp-single-server-usage))\n    - Agent using multiple MCP servers ([multi_mcp](#multi-mcp-advanced-usage))\n2. How to trace LLM Agents with Logfire\n3. How to evaluate LLMs with PydanticAI evals\n\n![Logfire UI](docs/images/logfire_ui.png)\n\n## Repository Structure\n\n- **[agents_mcp_usage/basic_mcp/](agents_mcp_usage/basic_mcp/)** - Single MCP server integration examples\n  - **basic_mcp_use/** - Contains basic examples of single MCP usage:\n    - `adk_mcp.py` - Example of using MCP with Google's Agent Development Kit (ADK 1.3.0)\n    - `langgraph_mcp.py` - Example of using MCP with LangGraph\n    - `oai-agent_mcp.py` - Example of using MCP with OpenAI Agents\n    - `pydantic_mcp.py` - Example of using MCP with Pydantic-AI\n\n\n- **[agents_mcp_usage/multi_mcp/](agents_mcp_usage/multi_mcp/)** - Advanced multi-MCP server integration examples\n  - **multi_mcp_use/** - Contains examples of using multiple MCP servers simultaneously:\n    - `pydantic_mcp.py` - Example of using multiple MCP servers with Pydantic-AI Agent\n\n- **[agents_mcp_usage/evaluations/](agents_mcp_usage/evaluations/)** - Evaluation modules for benchmarking\n  - **mermaid_evals/** - Comprehensive evaluation suite for mermaid diagram fixing tasks\n    - `evals_pydantic_mcp.py` - Core evaluation module for single-model testing\n    - `run_multi_evals.py` - Multi-model benchmarking with parallel execution\n    - `merbench_ui.py` - Interactive dashboard for result visualisation\n\n- **Demo Python MCP Servers**\n  - `mcp_servers/example_server.py` - Simple MCP server that runs locally, implemented in Python\n  - `mcp_servers/mermaid_validator.py` - Mermaid 
diagram validation MCP server, implemented in Python\n\n## Basic MCP: Single Server Usage\n\nThe `basic_mcp` directory demonstrates how to integrate a single MCP server with different agent frameworks. Each example follows a similar pattern:\n\n1. **Environment Setup**: Loading environment variables and configuring logging\n2. **Server Connection**: Establishing a connection to the local MCP server\n3. **Agent Configuration**: Setting up an agent with the appropriate model\n4. **Execution**: Running the agent with a query and handling the response\n\nThe MCP server in these examples provides:\n- An addition tool (`add(a, b)`)\n- A time tool (`get_current_time()`) \n- A dynamic greeting resource (`greeting://{name}`)\n\n### Basic MCP Architecture\n\n```mermaid\ngraph LR\n    User((User)) --\u003e |\"Run script\u003cbr\u003e(e.g., pydantic_mcp.py)\"| Agent\n\n    subgraph \"Agent Frameworks\"\n        Agent[Agent]\n        ADK[\"Google ADK\u003cbr\u003e(adk_mcp.py)\"]\n        LG[\"LangGraph\u003cbr\u003e(langgraph_mcp.py)\"]\n        OAI[\"OpenAI Agents\u003cbr\u003e(oai-agent_mcp.py)\"]\n        PYD[\"Pydantic-AI\u003cbr\u003e(pydantic_mcp.py)\"]\n        \n        Agent --\u003e ADK\n        Agent --\u003e LG\n        Agent --\u003e OAI\n        Agent --\u003e PYD\n    end\n\n    subgraph \"Python MCP Server\"\n        MCP[\"Model Context Protocol Server\u003cbr\u003e(mcp_servers/example_server.py)\"]\n        Tools[\"Tools\u003cbr\u003e- add(a, b)\u003cbr\u003e- get_current_time()\"]\n        Resources[\"Resources\u003cbr\u003e- greeting://{name}\"]\n        MCP --- Tools\n        MCP --- Resources\n    end\n\n    subgraph \"LLM Providers\"\n        OAI_LLM[\"OpenAI Models\"]\n        GEM[\"Google Gemini Models\"]\n        OTHER[\"Other LLM Providers...\"]\n    end\n    \n    Logfire[(\"Logfire\u003cbr\u003eTracing\")]\n    \n    ADK --\u003e MCP\n    LG --\u003e MCP\n    OAI --\u003e MCP\n    PYD --\u003e MCP\n    \n    MCP --\u003e OAI_LLM\n    MCP --\u003e 
GEM\n    MCP --\u003e OTHER\n    \n    ADK --\u003e Logfire\n    LG --\u003e Logfire\n    OAI --\u003e Logfire\n    PYD --\u003e Logfire\n    \n    LLM_Response[(\"Response\")] --\u003e User\n    OAI_LLM --\u003e LLM_Response\n    GEM --\u003e LLM_Response\n    OTHER --\u003e LLM_Response\n```\n\n#### Try the Basic MCP Examples:\n\n```bash\n# Google ADK example\nuv run agents_mcp_usage/basic_mcp/basic_mcp_use/adk_mcp.py\n\n# LangGraph example\nuv run agents_mcp_usage/basic_mcp/basic_mcp_use/langgraph_mcp.py\n\n# OpenAI Agents example\nuv run agents_mcp_usage/basic_mcp/basic_mcp_use/oai-agent_mcp.py\n\n# Pydantic-AI example\nuv run agents_mcp_usage/basic_mcp/basic_mcp_use/pydantic_mcp.py\n\n# Launch ADK web UI for visual interaction\nmake adk_basic_ui\n```\n\nMore details on basic MCP implementation can be found in the [basic_mcp README](agents_mcp_usage/basic_mcp/README.md).\n\n## Multi-MCP: Advanced Usage\n\nThe `multi_mcp` directory demonstrates advanced techniques for connecting to and coordinating between multiple specialised MCP servers simultaneously. This approach offers several advantages:\n\n1. **Domain Separation**: Each MCP server can focus on a specific domain or set of capabilities\n2. **Modularity**: Add, remove, or update capabilities without disrupting the entire system\n3. **Scalability**: Distribute load across multiple servers for better performance\n4. 
**Specialisation**: Optimise each MCP server for its specific use case\n\n### Multi-MCP Architecture\n\n```mermaid\ngraph LR\n    User((User)) --\u003e |\"Run script\u003cbr\u003e(e.g., pydantic_mcp.py)\"| Agent\n\n    subgraph \"Agent Framework\"\n        Agent[\"Pydantic-AI Agent\u003cbr\u003e(pydantic_mcp.py)\"]\n    end\n\n    subgraph \"MCP Servers\"\n        PythonMCP[\"Python MCP Server\u003cbr\u003e(mcp_servers/example_server.py)\"]\n        MermaidMCP[\"Python Mermaid MCP Server\u003cbr\u003e(mcp_servers/mermaid_validator.py)\"]\n        \n        Tools[\"Tools\u003cbr\u003e- add(a, b)\u003cbr\u003e- get_current_time()\"]\n        Resources[\"Resources\u003cbr\u003e- greeting://{name}\"]\n        MermaidValidator[\"Mermaid Diagram\u003cbr\u003eValidation Tools\"]\n        \n        PythonMCP --- Tools\n        PythonMCP --- Resources\n        MermaidMCP --- MermaidValidator\n    end\n\n    subgraph \"LLM Providers\"\n        LLMs[\"PydanticAI LLM call\"]\n    end\n    \n    Logfire[(\"Logfire\u003cbr\u003eTracing\")]\n    \n    Agent --\u003e PythonMCP\n    Agent --\u003e MermaidMCP\n    \n    PythonMCP --\u003e LLMs\n    MermaidMCP --\u003e LLMs\n    \n    Agent --\u003e Logfire\n    \n    LLM_Response[(\"Response\")] --\u003e User\n    LLMs --\u003e LLM_Response\n```\n\n#### Try the Multi-MCP Examples:\n\n```bash\n# Run the Pydantic-AI multi-MCP example\nuv run agents_mcp_usage/multi_mcp/multi_mcp_use/pydantic_mcp.py\n\n# Run the multi-MCP evaluation\nuv run agents_mcp_usage/evaluations/mermaid_evals/evals_pydantic_mcp.py\n\n# Run multi-model benchmarking\nuv run agents_mcp_usage/evaluations/mermaid_evals/run_multi_evals.py --models \"gemini-2.5-pro-preview-06-05,gemini-2.0-flash\" --runs 5 --parallel\n\n# Launch the evaluation dashboard\nuv run streamlit run agents_mcp_usage/evaluations/mermaid_evals/merbench_ui.py\n```\n\nMore details on multi-MCP implementation can be found in the [multi_mcp README](agents_mcp_usage/multi_mcp/README.md).\n\n## 
Evaluation Suite \u0026 Benchmarking Dashboard\n\nThis repository includes a comprehensive evaluation system for benchmarking LLM agent performance across multiple frameworks and models. The evaluation suite tests agents on mermaid diagram correction tasks using multiple MCP servers, providing rich metrics and analysis capabilities.\n\n### Key Evaluation Features\n\n- **Multi-Level Difficulty**: Easy, medium, and hard test cases for comprehensive assessment\n- **Multi-Model Benchmarking**: Parallel or sequential evaluation across multiple LLM models\n- **Interactive Dashboard**: Streamlit-based UI for visualising results, cost analysis, and model comparison\n- **Rich Metrics Collection**: Token usage, cost analysis, success rates, and failure categorisation\n- **Robust Error Handling**: Comprehensive retry logic and detailed failure analysis\n- **Export Capabilities**: CSV results for downstream analysis and reporting\n\n### Dashboard Features\n\nThe included Streamlit dashboard (`merbench_ui.py`) provides:\n\n- **Model Leaderboards**: Performance rankings by accuracy, cost efficiency, and speed\n- **Cost Analysis**: Detailed cost breakdowns and cost-per-success metrics\n- **Failure Analysis**: Categorised failure reasons with debugging insights\n- **Performance Trends**: Visualisation of model behaviour across difficulty levels\n- **Resource Usage**: Token consumption and API call patterns\n- **Comparative Analysis**: Side-by-side model performance comparison\n\n### Quick Evaluation Commands\n\n```bash\n# Single model evaluation\nuv run agents_mcp_usage/evaluations/mermaid_evals/evals_pydantic_mcp.py\n\n# Multi-model parallel benchmarking\nuv run agents_mcp_usage/evaluations/mermaid_evals/run_multi_evals.py \\\n  --models \"gemini-2.5-pro-preview-06-05,gemini-2.0-flash,gemini-2.5-flash\" \\\n  --runs 5 \\\n  --parallel \\\n  --output-dir ./results\n\n# Launch interactive dashboard\nuv run streamlit run 
agents_mcp_usage/evaluations/mermaid_evals/merbench_ui.py\n```\n\nThe evaluation system enables robust, repeatable benchmarking across LLM models and agent frameworks, supporting both research and production model selection decisions.\n\n## What is MCP?\n\nThe Model Context Protocol allows applications to provide context for LLMs in a standardised way, separating the concerns of providing context from the actual LLM interaction.\n\nLearn more: https://modelcontextprotocol.io/introduction\n\n## Why MCP\n\nBy defining clear specifications for components like resources (data exposure), prompts (reusable templates), tools (actions), and sampling (completions), MCP simplifies the development process and fosters consistency.\n\nA key advantage highlighted is flexibility; MCP allows developers to more easily switch between different LLM providers without needing to completely overhaul their tool and data integrations. It provides a structured approach, potentially reducing the complexity often associated with custom tool implementations for different models. While frameworks like Google Agent Development Kit, LangGraph, OpenAI Agents, or libraries like PydanticAI facilitate agent building, MCP focuses specifically on standardising the interface between the agent's reasoning (the LLM) and its capabilities (tools and data), aiming to create a more interoperable ecosystem.\n\n## Setup Instructions\n\n1. Clone this repository\n2. Install required packages:\n   ```bash\n   make install\n   ```\n\n   To use the ADK web UI, run:\n   ```bash\n   make adk_basic_ui\n   ```\n3. Set up your environment variables in a `.env` file:\n   ```\n   LOGFIRE_TOKEN=your_logfire_token\n   GEMINI_API_KEY=your_gemini_api_key\n   OPENAI_API_KEY=your_openai_api_key\n   ```\n4. 
Run any of the sample scripts as shown in the examples above\n\n## About Logfire\n\n[Logfire](https://github.com/pydantic/logfire) is an observability platform from the team behind Pydantic that makes monitoring AI applications straightforward. Features include:\n\n- Simple yet powerful dashboard\n- Python-centric insights, including rich display of Python objects\n- SQL-based querying of your application data\n- OpenTelemetry support for leveraging existing tooling\n- Pydantic integration for analytics on validations\n\nLogfire gives you visibility into how your code is running, which is especially valuable for LLM applications where understanding model behaviour is critical.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewginns%2Fagents-mcp-usage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandrewginns%2Fagents-mcp-usage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandrewginns%2Fagents-mcp-usage/lists"}