{"id":47664520,"url":"https://github.com/us-all/datadog-mcp-server","last_synced_at":"2026-05-15T04:01:56.659Z","repository":{"id":341521756,"uuid":"1167390150","full_name":"us-all/datadog-mcp-server","owner":"us-all","description":"Datadog MCP server — 159 tools for metrics, monitors, logs, APM, RUM, synthetics, incidents, fleet, status pages, and more. Read-only by default.","archived":false,"fork":false,"pushed_at":"2026-05-03T01:46:53.000Z","size":379,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-03T03:36:38.023Z","etag":null,"topics":["apm","claude","claude-code","datadog","mcp","model-context-protocol","monitoring","observability","rum"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/@us-all/datadog-mcp","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/us-all.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-26T08:41:19.000Z","updated_at":"2026-05-03T01:46:56.000Z","dependencies_parsed_at":"2026-04-10T09:02:37.958Z","dependency_job_id":null,"html_url":"https://github.com/us-all/datadog-mcp-server","commit_stats":null,"previous_names":["us-all/datadog-mcp-server"],"tags_count":29,"template":false,"template_full_name":null,"purl":"pkg:github/us-all/datadog-mcp-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/us-all%2Fdatadog-mcp-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/us-all%2Fdatadog-mcp-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/us-all%2Fdatadog-mcp-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/us-all%2Fdatadog-mcp-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/us-all","download_url":"https://codeload.github.com/us-all/datadog-mcp-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/us-all%2Fdatadog-mcp-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33053144,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-15T02:00:06.351Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apm","claude","claude-code","datadog","mcp","model-context-protocol","monitoring","observability","rum"],"created_at":"2026-04-02T11:51:39.750Z","updated_at":"2026-05-15T04:01:56.653Z","avatar_url":"https://github.com/us-all.png","language":"TypeScript","funding_links":[],"categories":["📡 Monitoring \u0026 Observability","📊 Monitoring"],"sub_categories":[],"readme":"# Datadog MCP Server\n\n\u003e **The Datadog MCP that answers _\"why is this happening?\"_ — not just _\"what's the value?\"_**\n\u003e\n\u003e Aggregation tools that fold 5–7 sequential API calls into one structured response. Full SLO CRUD. Fleet automation. The widest Datadog API coverage in any MCP — **165 tools** built on the [@us-all MCP standard](https://github.com/us-all/mcp-toolkit/blob/main/STANDARD.md).\n\n[![npm](https://img.shields.io/npm/v/@us-all/datadog-mcp)](https://www.npmjs.com/package/@us-all/datadog-mcp)\n[![downloads](https://img.shields.io/npm/dm/@us-all/datadog-mcp)](https://www.npmjs.com/package/@us-all/datadog-mcp)\n[![tools](https://img.shields.io/badge/tools-165-blue)](#full-tool-reference)\n[![@us-all standard](https://img.shields.io/badge/built%20to-%40us--all%20MCP%20standard-blue)](https://github.com/us-all/mcp-toolkit/blob/main/STANDARD.md)\n[![Glama MCP server](https://glama.ai/mcp/servers/us-all/datadog-mcp-server/badges/score.svg)](https://glama.ai/mcp/servers/us-all/datadog-mcp-server)\n\n## What it does that others don't\n\n- **Aggregation tools** — `analyze-monitor-state` and `slo-compliance-snapshot` collapse 5–7 sequential API calls into one structured response with a `caveats` array for partial failures. No other Datadog MCP ships this pattern.\n- **Full SLO CRUD** — create, update, delete SLOs (and their corrections). The official Bits AI MCP and community alternatives are read-only on SLOs.\n- **Fleet Automation** — 17 tools across deployments, schedules, and instrumented pods. Only this server.\n- **Status Pages** — 21 tools for full status-page lifecycle (components, degradations, maintenances). Only this server.\n- **Token-efficient by design** — `extractFields` projection, `DD_TOOLS`/`DD_DISABLE` 16-category toggles, and a `search-tools` meta-tool keep LLM context low across 165 tools.\n- **Apps SDK card** — `slo-compliance-snapshot` renders as a visual card on ChatGPT clients via `_meta[\"openai/outputTemplate\"]`. Claude clients receive the same JSON content (non-breaking).\n- **stdio + Streamable HTTP** — defaults to stdio (Claude Desktop / Code). Set `MCP_TRANSPORT=http` for ChatGPT Apps SDK or remote clients (Bearer auth via `MCP_HTTP_TOKEN`).\n\n## Try this — 5 prompts\n\nConnect the server to Claude Desktop or Claude Code, then paste any of these:\n\n1. **SLO health** — *\"List my SLOs and their error budget remaining this month. Group by status: compliant, at-risk, breached.\"*\n2. **Incident triage** — *\"There's an active incident on `checkout-service`. Pull the linked monitors, the recent error spikes from APM, and which deployments touched the service in the last 24h.\"*\n3. **Monitor noise audit** — *\"Find monitors that alerted more than 10 times in the last 7 days but had MTTR under 5 minutes — these are probably flapping.\"*\n4. **RUM error spike** — *\"RUM error rate jumped on the checkout funnel between 14:00 and 14:30 today. Show me the top error groups, affected sessions, and the user actions before the errors.\"*\n5. **Fleet rollout** — *\"Schedule the `datadog-agent` 7.55.0 rollout to the `staging` cluster, weekends only, starting next Saturday.\"*\n\n## When to use this vs Datadog's official MCP\n\nDatadog's official MCP (Bits AI MCP, GA 2026-03-09) is **complementary**, not a replacement:\n\n| | Official Datadog MCP | `@us-all/datadog-mcp` (this) |\n|--|----------------------|------------------------------|\n| Tool count | 16+ core toolsets | **165 tools** across full API surface |\n| Deployment | Remote (managed by Datadog) | **Self-host** stdio (npx / Docker / npm) |\n| Auth | Datadog SSO | API + APP key |\n| Sites | Public Datadog sites | **Any site, incl. internal/sovereign**; US5 default |\n| SLO writes | ❌ | ✅ create/update/delete SLOs + corrections |\n| Fleet automation | ❌ | ✅ 17 tools |\n| Status pages | ❌ | ✅ 21 tools |\n| Aggregation tools | ❌ | ✅ `analyze-monitor-state`, `slo-compliance-snapshot` |\n| MCP Prompts | ❌ | ✅ 4 (`triage-incident`, `audit-monitor-noise`, `analyze-rum-error-spike`, `investigate-slow-trace`) |\n| MCP Resources | ❌ | ✅ `dd://service/{serviceName}`, `dd://team/{teamId}`, `dd://synthetics/{testId}`, etc. |\n\nUse the official Bits AI MCP for fast managed onboarding and SSO. Use this when you need full API coverage, SLO/fleet/status-page write parity, or self-hosting (internal sites, isolated networks, dev/CI sandboxes).\n\n## Install\n\n### Claude Desktop\n\nAdd to `~/Library/Application Support/Claude/claude_desktop_config.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"datadog\": {\n      \"command\": \"npx\",\n      \"args\": [\"-y\", \"@us-all/datadog-mcp\"],\n      \"env\": {\n        \"DD_API_KEY\": \"\u003cyour-api-key\u003e\",\n        \"DD_APP_KEY\": \"\u003cyour-app-key\u003e\",\n        \"DD_SITE\": \"datadoghq.com\"\n      }\n    }\n  }\n}\n```\n\n### Claude Code\n\n```bash\nclaude mcp add datadog -s user \\\n  -e DD_API_KEY=\u003cyour-api-key\u003e -e DD_APP_KEY=\u003cyour-app-key\u003e -e DD_SITE=datadoghq.com \\\n  -- npx -y @us-all/datadog-mcp\n```\n\n### Docker\n\n```bash\ndocker run -e DD_API_KEY=... -e DD_APP_KEY=... -e DD_SITE=datadoghq.com \\\n  ghcr.io/us-all/datadog-mcp-server:latest\n```\n\n### Build from source\n\n```bash\ngit clone https://github.com/us-all/datadog-mcp-server.git\ncd datadog-mcp-server \u0026\u0026 pnpm install \u0026\u0026 pnpm build\nnode dist/index.js\n```\n\n## Configuration\n\n| Variable | Required | Default | Description |\n|----------|----------|---------|-------------|\n| `DD_API_KEY` | ✅ | — | Datadog API key |\n| `DD_APP_KEY` | ✅ | — | Datadog Application key |\n| `DD_SITE` | ❌ | `us5.datadoghq.com` | Datadog site (see table below) |\n| `DD_ALLOW_WRITE` | ❌ | `false` | Set `true` to enable mutations (create/update/delete) |\n| `DD_TOOLS` | ❌ | — | Comma-sep allowlist of categories. Only these load — biggest token saver. |\n| `DD_DISABLE` | ❌ | — | Comma-sep denylist. Ignored when `DD_TOOLS` is set. |\n| `MCP_TRANSPORT` | ❌ | `stdio` | `http` to enable Streamable HTTP transport |\n| `MCP_HTTP_TOKEN` | conditional | — | Bearer token. Required when `MCP_TRANSPORT=http` |\n| `MCP_HTTP_PORT` | ❌ | `3000` | HTTP listen port |\n| `MCP_HTTP_HOST` | ❌ | `127.0.0.1` | HTTP bind host (DNS rebinding protection auto-enabled for localhost) |\n| `MCP_HTTP_SKIP_AUTH` | ❌ | `false` | Skip Bearer auth — e.g. behind a reverse proxy that handles it |\n\n**Categories** (16): `metrics`, `monitors`, `dashboards`, `logs`, `apm`, `rum`, `incidents`, `security`, `synthetics`, `ci`, `infra`, `fleet`, `status-pages`, `oncall`, `teams`, `account`.\n\nWhen `MCP_TRANSPORT=http`: `POST /mcp` (Bearer-auth JSON-RPC) + `GET /health` (public liveness).\n\n**Sites**:\n\n| Site | Value | Region |\n|------|-------|--------|\n| US1 | `datadoghq.com` | US (Virginia) |\n| US3 | `us3.datadoghq.com` | US (Virginia) |\n| US5 | `us5.datadoghq.com` | US (Oregon) |\n| EU1 | `datadoghq.eu` | EU (Frankfurt) |\n| AP1 | `ap1.datadoghq.com` | Asia-Pacific (Tokyo) |\n\n### Token efficiency\n\nNaive setup loads ~25K tokens of tool schema before any conversation. Three knobs mitigate:\n\n| Scenario | Tools | Schema tokens | vs default |\n|----------|------:|--------------:|-----------:|\n| default (all categories) | 165 | 25,200 | — |\n| typical (`DD_TOOLS=metrics,monitors,logs,apm,dashboards`) | 55 | 9,300 | −63% |\n| narrow (`DD_TOOLS=metrics,monitors`) | 24 | **3,800** | **−85%** |\n\n1. **Category toggles** — `DD_TOOLS=metrics,monitors,logs,apm` (biggest win).\n2. **`extractFields` response projection** — `get-dashboard { dashboardId: \"abc\", extractFields: \"id,title,widgets.*.definition.type\" }`.\n3. **`search-tools` meta-tool** — always enabled; lets the LLM discover tools at runtime instead of preloading all schemas.\n\n### Read-only mode\n\nBy default, all writes are blocked to prevent accidental mutations by AI agents. The following require `DD_ALLOW_WRITE=true`:\n\n`create-monitor`, `update-monitor`, `delete-monitor`, `mute-monitor`, `create-dashboard`, `update-dashboard`, `delete-dashboard`, `send-logs`, `post-event`, `trigger-synthetics`, `create-synthetics-test`, `update-synthetics-test`, `delete-synthetics-test`, `create-downtime`, `cancel-downtime`, `create-case`, `update-case-status`, `send-dora-deployment`, `send-dora-incident`, `create-slo`, `update-slo`, `delete-slo`, plus all fleet/status-page/security writes.\n\n## MCP Prompts (4)\n\nWorkflow templates the model can invoke directly:\n\n- `triage-incident` — given an incident ID, walks linked monitors, recent error spikes, and recent deploys.\n- `audit-monitor-noise` — flag flapping monitors via alert frequency × MTTR.\n- `analyze-rum-error-spike` — diff RUM error rates across two windows, attribute to top error groups.\n- `investigate-slow-trace` — given a slow trace ID, traverse the span tree and surface bottleneck spans.\n\n## MCP Resources\n\nRead-only entities by URI: `dd://monitor/{id}`, `dd://dashboard/{id}`, `dd://slo/{id}`, `dd://incident/{id}`, `dd://service/{serviceName}`, `dd://team/{teamId}` (team + members), `dd://synthetics/{testId}`, `dd://host/{name}`.\n\n## Tool reference\n\n165 tools across 16 categories. Use the `search-tools` meta-tool to discover at runtime; the full list is collapsed below.\n\n| Domain | Tools |\n|--------|------:|\n| Status Pages | 21 |\n| RUM (events + apps + metrics + retention) | 27 |\n| Metrics, Hosts, SLOs, Downtimes, Containers, Processes | 19 |\n| Fleet Automation | 17 |\n| Synthetics, Logs/Spans Metrics, SLO Corrections | 16 |\n| Monitors, Dashboards, Notebooks, Events | 16 |\n| Incidents, Cases, Error Tracking, Audit | 13 |\n| OnCall, Teams, Users, Services, Bots | 11 |\n| Security signals + rules + suppressions | 9 |\n| APM, CI Visibility, DORA, Network Devices | 9 |\n| **+ aggregations** | `analyze-monitor-state`, `slo-compliance-snapshot` |\n| **+ meta** | `search-tools` |\n\n\u003cdetails\u003e\n\u003csummary\u003eFull tool list (click to expand)\u003c/summary\u003e\n\n### Metrics (5)\n`query-metrics`, `get-metrics`, `get-metric-metadata`, `list-active-metrics`, `list-metric-tags`\n\n### Monitors (7)\n`get-monitors`, `get-monitor`, `create-monitor`, `update-monitor`, `delete-monitor`, `mute-monitor`, `validate-monitor`, `analyze-monitor-state` *(aggregation)*\n\n### Dashboards (5)\n`get-dashboards`, `get-dashboard`, `create-dashboard`, `update-dashboard`, `delete-dashboard`\n\n### Logs (3)\n`search-logs`, `aggregate-logs`, `send-logs`\n\n### Events (2)\n`get-events`, `post-event`\n\n### Incidents (6)\n`get-incidents`, `get-incident`, `search-incidents`, `create-incident`, `update-incident`, `delete-incident`\n\n### APM (1)\n`search-spans`\n\n### RUM (17)\n`search-rum-events`, `aggregate-rum`, `list-rum-applications`, `get-rum-application`, `create-rum-application`, `update-rum-application`, `delete-rum-application`, `list-rum-metrics`, `get-rum-metric`, `create-rum-metric`, `update-rum-metric`, `delete-rum-metric`, `list-rum-retention-filters`, `get-rum-retention-filter`, `create-rum-retention-filter`, `update-rum-retention-filter`, `delete-rum-retention-filter`\n\n### SLOs (6)\n`list-slos`, `get-slo`, `get-slo-history`, `create-slo`, `update-slo`, `delete-slo`, `slo-compliance-snapshot` *(aggregation)*, plus 5 SLO-correction tools\n\n### Synthetics (6)\n`list-synthetics`, `get-synthetics-result`, `trigger-synthetics`, `create-synthetics-test`, `update-synthetics-test`, `delete-synthetics-test`\n\n### Hosts / Containers / Processes (4)\n`list-hosts`, `get-host-totals`, `list-containers`, `list-processes`\n\n### Downtimes (3)\n`list-downtimes`, `create-downtime`, `cancel-downtime`\n\n### Security (9)\n`search-security-signals`, `get-security-signal`, `list-security-rules`, `get-security-rule`, `delete-security-rule`, `list-security-suppressions`, `get-security-suppression`, `create-security-suppression`, `delete-security-suppression`\n\n### CI Visibility (4)\n`search-ci-pipelines`, `aggregate-ci-pipelines`, `search-ci-tests`, `aggregate-ci-tests`\n\n### Cases (4)\n`list-cases`, `get-case`, `create-case`, `update-case-status`\n\n### Error Tracking (2)\n`list-error-tracking-issues`, `get-error-tracking-issue`\n\n### DORA (2)\n`send-dora-deployment`, `send-dora-incident`\n\n### Network Devices (2)\n`list-network-devices`, `get-network-device`\n\n### Notebooks (2)\n`list-notebooks`, `get-notebook`\n\n### OnCall (2)\n`get-team-oncall`, `get-oncall-schedule`\n\n### Services \u0026 Software Catalog (2)\n`list-services`, `get-service-definition`\n\n### Teams (6)\n`list-teams`, `get-team`, `create-team`, `update-team`, `delete-team`, `get-team-members`\n\n### Account \u0026 Users (2)\n`get-usage-summary`, `list-users`\n\n### Logs/Spans/APM Retention metrics (15)\n5 each for `logs-metrics`, `spans-metrics`, `apm-retention-filters` (list/get/create/update/delete)\n\n### Status Pages (21)\nFull lifecycle: pages, components, degradations, maintenances. See `src/tools/status-pages.ts`.\n\n### Fleet Automation (17)\nAgents, deployments, schedules, instrumented pods. See `src/tools/fleet.ts`.\n\n### Audit (1)\n`search-audit-logs`\n\n### Meta (1)\n`search-tools` — query other tools by keyword; always enabled regardless of `DD_TOOLS`.\n\n\u003c/details\u003e\n\n## Architecture\n\n```\nClaude → MCP stdio → index.ts → tools/*.ts → @datadog/datadog-api-client → Datadog API\n```\n\nBuilt on [`@us-all/mcp-toolkit`](https://github.com/us-all/mcp-toolkit):\n- `extractFields` — token-efficient response projections\n- `aggregate(fetchers, caveats)` — fan-out helper for aggregation tools\n- `createWrapToolHandler` — domain-specific redaction (DD_API_KEY/DD_APP_KEY) + Datadog `ApiException` error extraction\n- `search-tools` meta-tool\n\n## Tech stack\n\nNode.js 22+ • TypeScript strict ESM • pnpm • `@modelcontextprotocol/sdk` • `@datadog/datadog-api-client` (official) • zod • dotenv • vitest + dd-trace.\n\n## Contributing\n\nSee [CONTRIBUTING.md](./CONTRIBUTING.md). New shared patterns belong in [`@us-all/mcp-toolkit`](https://github.com/us-all/mcp-toolkit) — single source of truth for the 7-server suite.\n\n## License\n\n[MIT](./LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fus-all%2Fdatadog-mcp-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fus-all%2Fdatadog-mcp-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fus-all%2Fdatadog-mcp-server/lists"}