{"id":51151350,"url":"https://github.com/soapbucket/mcptest","last_synced_at":"2026-06-26T06:04:50.609Z","repository":{"id":363486793,"uuid":"1237292604","full_name":"soapbucket/mcptest","owner":"soapbucket","description":"The test suite your MCP server is missing. Tool, resource, agent-loop, schema-drift, compliance, and security tests against any Model Context Protocol server, in CI on every commit. One YAML suite, deterministic cassette replay, multi-model comparison, and LLM-judge evals. Rust, Apache-2.0.","archived":false,"fork":false,"pushed_at":"2026-06-24T10:22:56.000Z","size":6833,"stargazers_count":3,"open_issues_count":2,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-24T12:12:30.589Z","etag":null,"topics":["agent-testing","ai-agents","anthropic","compliance","conformance","developer-tools","evals","llm","llm-as-a-judge","llm-as-a-jury","llm-evaluation","llm-testing","mcp","mcp-testing","openai","security","test-automation","testing"],"latest_commit_sha":null,"homepage":"https://mcptest.sh","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/soapbucket.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-13T03:44:52.000Z","updated_at":"2026-06-23T16:24:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/soapbucket/mcptest","commit_stats":null,"previous_names":["soapbucket/mcptest"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/soapbucket/mcptest","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soapbucket%2Fmcptest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soapbucket%2Fmcptest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soapbucket%2Fmcptest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soapbucket%2Fmcptest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/soapbucket","download_url":"https://codeload.github.com/soapbucket/mcptest/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/soapbucket%2Fmcptest/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34805127,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-26T02:00:06.560Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-testing","ai-agents","anthropic","compliance","conformance","developer-tools","evals","llm","llm-as-a-judge","llm-as-a-jury","llm-evaluation","llm-testing","mcp","mcp-testing","openai","security","test-automation","testing"],"created_at":"2026-06-26T06:04:49.698Z","updated_at":"2026-06-26T06:04:50.571Z","avatar_url":"https://github.com/soapbucket.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://mcptest.sh\"\u003e\u003cimg src=\"./docs/assets/mcptest-logo.svg\" alt=\"mcptest\" width=\"220\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003emcptest\u003c/h1\u003e\n\n\u003cp align=\"center\"\u003e\u003cstrong\u003eTest your MCP server like you test the rest of your code.\u003c/strong\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://mcptest.sh\"\u003e\u003cstrong\u003eWebsite\u003c/strong\u003e\u003c/a\u003e ·\n  \u003ca href=\"./docs/index.md\"\u003eDocumentation\u003c/a\u003e ·\n  \u003ca href=\"./examples/\"\u003eExamples\u003c/a\u003e ·\n  \u003ca href=\"https://github.com/soapbucket/mcptest-examples\"\u003eExample servers\u003c/a\u003e\n\u003c/p\u003e\n\n[![CI](https://github.com/soapbucket/mcptest/actions/workflows/ci.yml/badge.svg)](https://github.com/soapbucket/mcptest/actions/workflows/ci.yml)\n[![Crates.io](https://img.shields.io/crates/v/mcptest.svg)](https://crates.io/crates/mcptest)\n[![License](https://img.shields.io/badge/license-Apache--2.0-blue.svg)](./LICENSE)\n[![Rust](https://img.shields.io/badge/rust-1.85%2B-orange.svg)](https://www.rust-lang.org)\n[![GitHub stars](https://img.shields.io/github/stars/soapbucket/mcptest?style=flat)](https://github.com/soapbucket/mcptest/stargazers)\n[![Docs](https://img.shields.io/badge/docs-mcptest.sh-blue.svg)](https://mcptest.sh)\n\nmcptest is an open-source CLI for testing [Model Context Protocol](https://modelcontextprotocol.io)\nservers. You write checks as YAML, point them at any MCP server, and run\nthem from your terminal, your CI, or your coding agent.\n\nA passing unit-test suite tells you your handler returns the right value.\nIt tells you nothing about what your server puts on the wire: whether the\n`initialize` handshake completes, whether the tool catalog still says what\nyou think it says, whether a `tools/call` over stdio or HTTP returns the\nresponse a client will actually see. mcptest speaks MCP end to end and\nchecks exactly that. You get a deterministic pass or fail, and when\nsomething breaks, a structured failure that names the assertion, the\npayload the server sent, and a one-line repro.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./docs/assets/flagship.gif\" alt=\"A terminal session: a scaffolded YAML test case, then mcptest run passing it offline against the built-in mock server, then mcptest conformance run with every probed MCP spec requirement passing\" width=\"820\"\u003e\n\u003c/p\u003e\n\n## Try it in three commands\n\n```sh\ncurl -fsSL https://download.mcptest.sh/install.sh | sh   # or: brew install soapbucket/tap/mcptest\nmcptest init   # writes a starter suite under tests/\nmcptest run    # deterministic verdicts, structured failures\n```\n\n`mcptest init` scaffolds a suite that targets a built-in mock server\n(`mcptest mock`), so the first `mcptest run` passes offline with no real\nserver and no network. Swap the `command:` for your own server when you\nare ready. If an MCP client on your machine already knows your server,\n`mcptest init --from-discovered \u003cname\u003e` scaffolds against it instead.\n\n## How it works\n\nYou describe the contract in YAML: a server, a call, and what the\nresponse should look like.\n\n```yaml\n# mcptest.yml\n# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json\nservers:\n  api:\n    command: [\"./your-server\"]      # or: url: https://example.com/mcp\ntools:\n  - name: search returns a result for a known query\n    server: api\n    tool: search\n    args: { query: \"anthropic\" }\n    expect:\n      - target: result.content[0].text\n        matcher: { contains: \"results for\" }\n```\n\n`mcptest run` starts your server (or connects to its URL), performs the\nMCP `initialize` handshake over stdio, streamable HTTP, or legacy SSE,\nmakes the call a real client would make, and checks the response against\nyour assertions. It prints one line per check and exits `0` when\neverything holds, `1` when something breaks.\n\n```\npush  -\u003e  mcptest run  -\u003e  assert on the wire  -\u003e  exit 0 / exit 1\n```\n\nWhen the server drifts, the same suite catches it. The failure names the\nassertion, shows what it expected against what the server sent, and exits\nnon-zero so CI can gate on it.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./docs/assets/gate-flips.gif\" alt=\"A passing mcptest suite, then the server's search response drifts to an upstream error, then the same suite fails with the assertion named, expected against actual, and exit 1\" width=\"820\"\u003e\n\u003c/p\u003e\n\n## One binary, the whole surface\n\nThe same engine covers the things teams otherwise test with one-off\nscripts. Each is a YAML block or a subcommand, and each exits with a code\nCI understands.\n\n- **Tools, resources, prompts.** Assert on real responses; catch catalog\n  and input-schema drift.\n- **The agent loop.** Drive a real model across one or more servers and\n  assert on the trace it produces (tool choice, arguments, tokens, cost).\n- **Spec conformance.** Grade a server against a pinned MCP protocol\n  version (`mcptest conformance run`).\n- **Schema drift.** Diff the tool catalog against a baseline and classify\n  each change as breaking or not (`mcptest diff`).\n- **Security.** Scan tool, prompt, and resource definitions for\n  injection, exfiltration, and shadowing, and report as SARIF\n  (`mcptest security`).\n- **Offline replay.** Record real exchanges to cassettes and replay them\n  in CI with no keys and no spend.\n\n## Test the agent loop, replay it offline\n\nPoint one YAML test at one model or a list of them. mcptest lists the\ntools on every server you name, sends the prompt to the model with that\ncatalog attached, dispatches the tool calls the model makes, and records\nthe conversation. Your assertions resolve against the trace, so the same\nsuite checks that the model picked the right tool and that the run stayed\ninside a token budget.\n\n```yaml\nagents:\n  - name: weather query routes to get_weather\n    models: [claude-sonnet-4-5, gpt-5, gemini-2.5-pro]\n    servers: [weather]\n    prompt: What is the weather in Sacramento?\n    expect:\n      - target: tool_calls[0].name\n        matcher: { exact: get_weather }\n      - target: conversation.tokens.total\n        matcher: { regex: \"^[0-9]+$\" }\n```\n\nRecord once with your provider keys, and each `(test, model)` pair gets\nits own cassette. After that a plain `mcptest run` replays them in CI,\ndeterministically, without spending a cent. Add a model identifier to\n`models:`, re-record, and the report tells you which assertion broke for\nwhich model.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./docs/assets/agent-loop.gif\" alt=\"A mcptest agent test asserting that a model calls the greet tool exactly once, then mcptest run replaying a recorded Claude run offline from a committed cassette and passing\" width=\"820\"\u003e\n\u003c/p\u003e\n\nProviders covered today: Anthropic, OpenAI (including the o-series),\nGoogle Gemini, Mistral, plus any OpenAI-compatible endpoint (Azure,\nOpenRouter, vLLM, llama.cpp, LiteLLM, Together, Groq, Bedrock-fronted\nAnthropic) through a named `providers:` block. Sweep a whole suite across\nmodels with `mcptest run --models a,b,c` and get a test-by-model grid.\nBackground is in [docs/models.md](./docs/models.md).\n\n## Use mcptest from your coding agent\n\nmcptest ships an MCP server of its own, so Claude Code, Cursor, or any\nMCP-capable agent can run the full testing loop. Two commands hand it the\nkeys:\n\n```sh\nmcptest mcp-server --install --enable-writes   # the agent-facing verbs\nmcptest skill --install                        # the packaged skill\n```\n\nThe agent scaffolds a validated suite from the server's real catalog,\nsharpens the generic checks against observed responses, runs the suite,\nand reads back a failure that already carries the assertion, the actual\nvalue, and a one-line repro. The agent brings the judgment; mcptest\nsupplies the deterministic ground truth it cannot invent, and the YAML it\nleaves behind is the diffable audit trail a human reviews. See\n[the agent interface](./docs/agent-interface.md) for the verb reference\nand the model-facing `--reporter agent` output.\n\n## Run it in CI\n\nThe suite is a diffable YAML file you run on every commit. Reporters cover\nthe formats CI already understands, and a single run writes\nmachine-readable artifacts CI can store.\n\n```yaml\n# .github/workflows/mcptest.yml\n- name: Install mcptest\n  run: curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=v1.1.0 sh\n- name: Run mcptest\n  run: mcptest run --reporter junit --output mcptest.junit.xml\n```\n\nPick the reporter with `--reporter`: `pretty` (default), `minimal`,\n`json`, `junit`, `md`, `html`, `sarif`, `gitlab`, `ndjson`, `tap`,\n`matrix`, or `quiet`. Capture the JSON envelope once, then re-render it\ninto any other format with `mcptest report --format`, with no second run\nand no second API call.\n\n## Why mcptest\n\nInspectors and one-off scripts tell you a server looked right once. A\ngeneral eval framework grades the model that calls your tools, not the\nserver on the other side of the call. mcptest is the part you can commit:\none binary that checks the protocol contract, the behavior, the agent\nloop, schema drift, and tool-definition security, and turns each into a\nstable exit code. It is a single static binary with no telemetry and no\nauto-update, it is Apache-2.0, and it bakes a CycloneDX Software Bill of\nMaterials into the binary so you can read the dependency list from the\ncopy you already have.\n\n```sh\nmcptest sbom            # the embedded CycloneDX SBOM\nmcptest sbom --verify   # re-hash it to catch tampering\n```\n\nEvery release is Sigstore-signed and carries SLSA L3 build provenance.\nThe full verification walkthrough lives at\n[mcptest.sh/trust](https://mcptest.sh/trust).\n\n## Install\n\nHomebrew (macOS, Linux):\n\n```sh\nbrew install soapbucket/tap/mcptest\n```\n\ncurl installer (macOS, Linux, Apple Silicon and arm64 included):\n\n```sh\ncurl -fsSL https://download.mcptest.sh/install.sh | sh\n```\n\nThe installer detects your platform, downloads the signed release tarball\nfrom `download.mcptest.sh`, verifies its sha256 against the sums file, and\ndrops `mcptest` into `~/.local/bin` (or `/usr/local/bin` under sudo).\nInspect it first with `curl -fsSL https://download.mcptest.sh/install.sh | less`.\n\nDocker:\n\n```sh\ndocker run --rm -v \"$PWD\":/work -w /work soapbucket/mcptest:latest run\n```\n\n## Documentation\n\nFull documentation lives under [`docs/`](./docs/index.md). Start here:\n\n- [Getting started](./docs/getting-started.md): install to first passing\n  test in about five minutes.\n- [What is mcptest](./docs/what-is-mcptest.md): the one-page definition.\n- [Concepts](./docs/concepts.md): the mental model.\n- [YAML reference](./docs/yaml-reference.md): every field, every matcher.\n- [CLI reference](./docs/cli-reference.md): every subcommand, every flag.\n- [Examples](./examples/README.md): runnable suites across the whole\n  surface, plus [mcptest-examples](https://github.com/soapbucket/mcptest-examples)\n  for complete end-to-end suites against ten popular servers.\n\nSDKs drive mcptest from your own test runner: Python (pytest), TypeScript\n(vitest, jest, mocha, `node:test`), Go, Rust (proc-macro), .NET (xUnit),\nand JVM (JUnit 5). See [docs/sdks.md](./docs/sdks.md).\n\n## Build from source\n\n```sh\ncargo build --release\n./target/release/mcptest --help\n./scripts/check.sh        # the full gate: fmt + clippy + doc + build + test\n```\n\n## License\n\nApache-2.0. See [LICENSE](./LICENSE) and [NOTICE](./NOTICE).\n\nCopyright 2026 Soap Bucket LLC and the mcptest contributors. Soap Bucket\nLLC at [soapbucket.com](https://soapbucket.com).\n\n## Links\n\n- Documentation: [`docs/index.md`](./docs/index.md)\n- Releases: [github.com/soapbucket/mcptest/releases](https://github.com/soapbucket/mcptest/releases)\n- Issues and roadmap: [github.com/soapbucket/mcptest/issues](https://github.com/soapbucket/mcptest/issues)\n- X (Twitter): [@soapbucket](https://x.com/soapbucket)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoapbucket%2Fmcptest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsoapbucket%2Fmcptest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsoapbucket%2Fmcptest/lists"}