{"id":40950756,"url":"https://github.com/txn2/mcp-data-platform","last_synced_at":"2026-05-31T03:01:15.839Z","repository":{"id":333772383,"uuid":"1137195822","full_name":"txn2/mcp-data-platform","owner":"txn2","description":"A semantic data platform MCP server that composes multiple data tools with bidirectional cross-injection - tool responses automatically include critical context from other services.","archived":false,"fork":false,"pushed_at":"2026-05-30T19:55:39.000Z","size":11615,"stargazers_count":6,"open_issues_count":8,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-30T21:20:07.533Z","etag":null,"topics":["data-analysis","data-lake","data-warehouse","golang","golang-library","mcp","mcp-server"],"latest_commit_sha":null,"homepage":"http://mcp-data-platform.txn2.com/","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/txn2.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":"docs/support/troubleshooting.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["cjimti"]}},"created_at":"2026-01-19T03:40:44.000Z","updated_at":"2026-05-30T19:54:49.000Z","dependencies_parsed_at":"2026-05-16T01:04:41.507Z","dependency_job_id":null,"html_url":"https://github.com/txn2/mcp-data-platform","commit_stats":null,"previous_names":["txn2/mcp-data-platform"],"tags_count":210,"template":false,"template_full_name":null,"purl":"pkg:github/txn2/mcp-data-platform","repository
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/txn2%2Fmcp-data-platform","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/txn2%2Fmcp-data-platform/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/txn2%2Fmcp-data-platform/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/txn2%2Fmcp-data-platform/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/txn2","download_url":"https://codeload.github.com/txn2/mcp-data-platform/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/txn2%2Fmcp-data-platform/sbom","scorecard":{"id":1242065,"data":{"date":"2026-01-21T19:11:12Z","repo":{"name":"github.com/txn2/mcp-data-platform","commit":"f72d635ce3ecdd36a65b112f3dcce37e727b2a98"},"scorecard":{"version":"v5.3.0","commit":"c22063e786c11f9dd714d777a687ff7c4599b600"},"score":7.9,"checks":[{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: SECURITY.md:1","Info: Found linked content: SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: SECURITY.md:1","Info: Found text in security policy: SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#security-policy"}},{"name":"Dependency-Update-Tool","score":10,"reason":"update tool detected","details":["Info: detected update tool: Dependabot: .github/dependabot.yml:1"],"documentation":{"short":"Determines if the project uses a dependency update tool.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#dependency-update-tool"}},{"name":"Maintained","score":0,"reason":"project was created within the last 90 days. 
Please review its contents carefully","details":["Warn: Repository was created within the last 90 days."],"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#binary-artifacts"}},{"name":"Token-Permissions","score":10,"reason":"GitHub workflow tokens follow principle of least privilege","details":["Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:16","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:37","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:73","Info: jobLevel 'contents' permission set to 'read': .github/workflows/ci.yml:94","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:19","Info: topLevel permissions set to 'read-all': .github/workflows/ci.yml:9","Info: topLevel permissions set to 'read-all': .github/workflows/codeql.yml:11","Info: topLevel 'contents' permission set to 'read': .github/workflows/docs.yml:14","Info: found token with 'none' permissions: .github/workflows/release.yml:1","Info: topLevel permissions set to 'read-all': .github/workflows/scorecard.yml:10","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":0,"reason":"Found 0/27 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines 
if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#code-review"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":9,"reason":"dependency not pinned by hash detected -- score normalized to 9","details":["Warn: pipCommand not pinned by hash: .github/workflows/docs.yml:36","Info:  22 out of  22 GitHub-owned GitHubAction dependencies pinned","Info:  10 out of  10 third-party GitHubAction dependencies pinned","Info:   0 out of   1 pipCommand dependencies pinned","Info:   1 out of   1 containerImage dependencies pinned","Info:   1 out of   1 goCommand dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#pinned-dependencies"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#license"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices 
Badge.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#cii-best-practices"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#packaging"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#vulnerabilities"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":8,"reason":"branch protection is not maximal on development and all release branches","details":["Info: 'allow deletion' disabled on branch 'main'","Info: 'force pushes' disabled on branch 'main'","Warn: 'branch protection settings apply to administrators' is disabled on branch 'main'","Info: 'stale review dismissal' is required to merge on branch 'main'","Warn: required approving review count is 1 on branch 'main'","Warn: codeowners review is not required on branch 'main'","Info: 'last push approval' is required to merge on branch 'main'","Info: 'up-to-date branches' is required to merge on branch 'main'","Info: status check found to merge onto on branch 'main'","Info: PRs are required in order to make changes on branch 'main'"],"documentation":{"short":"Determines if the default and release branches 
are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#branch-protection"}},{"name":"SAST","score":10,"reason":"SAST tool is run on all commits","details":["Info: SAST configuration detected: CodeQL","Info: all commits (5) are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#sast"}},{"name":"Fuzzing","score":10,"reason":"project is fuzzed","details":["Info: GoBuiltInFuzzer integration found: pkg/auth/fuzz_test.go:11","Info: GoBuiltInFuzzer integration found: pkg/auth/fuzz_test.go:42","Info: GoBuiltInFuzzer integration found: pkg/auth/fuzz_test.go:76","Info: GoBuiltInFuzzer integration found: pkg/auth/fuzz_test.go:101","Info: GoBuiltInFuzzer integration found: pkg/middleware/fuzz_test.go:10","Info: GoBuiltInFuzzer integration found: pkg/middleware/fuzz_test.go:32","Info: GoBuiltInFuzzer integration found: pkg/middleware/fuzz_test.go:49","Info: GoBuiltInFuzzer integration found: pkg/oauth/fuzz_test.go:58","Info: GoBuiltInFuzzer integration found: pkg/oauth/fuzz_test.go:101","Info: GoBuiltInFuzzer integration found: pkg/oauth/fuzz_test.go:122","Info: GoBuiltInFuzzer integration found: pkg/platform/fuzz_test.go:10","Info: GoBuiltInFuzzer integration found: pkg/platform/fuzz_test.go:78","Info: GoBuiltInFuzzer integration found: pkg/platform/fuzz_test.go:96"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#fuzzing"}},{"name":"Contributors","score":10,"reason":"project has 4 contributing companies or organizations","details":["Info: found contributions from: DeasilCognitive, apk8s, deasilworks, txn2"],"documentation":{"short":"Determines if the project has a set of contributors from multiple 
organizations (e.g., companies).","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#contributors"}},{"name":"CI-Tests","score":10,"reason":"5 out of 5 merged PRs checked by a CI test -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project runs tests before pull requests are merged.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#ci-tests"}}]},"last_synced_at":"2026-01-21T19:37:39.785Z","repository_id":333772383,"created_at":"2026-01-21T19:37:39.785Z","updated_at":"2026-01-21T19:37:39.785Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33717419,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","data-lake","data-warehouse","golang","golang-library","mcp","mcp-server"],"created_at":"2026-01-22T05:11:57.646Z","updated_at":"2026-05-31T03:01:15.832Z","avatar_url":"https://github.com/txn2.png","language":"Go","funding_links":["https://github.com/sponsors/cjimti"],"categories":[],"sub_categories":[],"readme":"[![txn2/mcp-data-platform](docs/images/MCP-data-platform-logo-banner.svg)](https://mcp-data-platform.txn2.com)\n\n[![GitHub 
license](https://img.shields.io/github/license/txn2/mcp-data-platform.svg)](https://github.com/txn2/mcp-data-platform/blob/main/LICENSE)\n[![Go Reference](https://pkg.go.dev/badge/github.com/txn2/mcp-data-platform.svg)](https://pkg.go.dev/github.com/txn2/mcp-data-platform)\n[![codecov](https://codecov.io/gh/txn2/mcp-data-platform/graph/badge.svg)](https://codecov.io/gh/txn2/mcp-data-platform)\n[![Go Report Card](https://goreportcard.com/badge/github.com/txn2/mcp-data-platform?v2)](https://goreportcard.com/report/github.com/txn2/mcp-data-platform)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/txn2/mcp-data-platform/badge)](https://scorecard.dev/viewer/?uri=github.com/txn2/mcp-data-platform)\n[![SLSA 3](https://slsa.dev/images/gh-badge-level3.svg)](https://slsa.dev)\n\n\n\n**[Documentation](https://mcp-data-platform.txn2.com/)** | **[Installation](https://mcp-data-platform.txn2.com/server/overview/)** | **[Library Docs](https://mcp-data-platform.txn2.com/library/overview/)**\n\n**Your AI assistant can run SQL. But it doesn't know that `cust_id` contains PII, that the table was deprecated last month, or who to ask when something breaks.**\n\nmcp-data-platform fixes that. It connects AI assistants to your data infrastructure and adds business context from your semantic layer. Query a table and get its meaning, owners, quality scores, and deprecation warnings in the same response.\n\nThe only requirement is [DataHub](https://datahubproject.io/) as your semantic layer. Add [Trino](https://trino.io/) for SQL queries and [S3](https://aws.amazon.com/s3/) for object storage when you're ready. [Learn why this stack →](https://mcp-data-platform.txn2.com/concepts/components/)\n\n## MCP Data Platform Ecosystem\n\nmcp-data-platform is the orchestration layer for a broader suite of open-source MCP servers designed to work together as a composable data platform. 
Each component can run standalone or be combined through mcp-data-platform for unified access with cross-enrichment, authentication, and personas.\n\n- [txn2/mcp-datahub](https://github.com/txn2/mcp-datahub/) — DataHub metadata catalog: search, lineage, glossary terms, domains, tags, and ownership\n- [txn2/mcp-s3](https://github.com/txn2/mcp-s3/) — S3 object storage: list buckets, browse prefixes, read objects, generate presigned URLs\n- [txn2/mcp-trino](https://github.com/txn2/mcp-trino/) — Trino distributed SQL: query any data source Trino connects to with configurable timeouts and row limits\n\nThe platform also includes a **[gateway toolkit](https://mcp-data-platform.txn2.com/server/gateway/)** that re-exposes any well-behaved third-party MCP server through the platform's auth, persona, and audit pipeline. Operators add connections through the admin portal (DB-backed, encrypted credentials); tools surface as `\u003cconnection\u003e__\u003cremote_tool\u003e`. Optional declarative cross-enrichment rules join proxied responses with Trino queries or DataHub lookups, so a vendor MCP can return its own data plus warehouse context in a single call.\n\nFor REST/HTTP APIs that aren't MCP servers, the **[API gateway toolkit](https://mcp-data-platform.txn2.com/server/api-gateway/)** (`kind: api`) proxies upstreams like Salesforce, Google APIs, GitHub, and Stripe through the same pipeline. Four tools (`api_invoke_endpoint`, `api_list_endpoints`, `api_list_specs`, `api_get_endpoint_schema`) cover every operation on every upstream; `api_list_specs` lets the model browse a multi-spec catalog's sections before drilling into one. Auth modes cover bearer, API key, HTTP Basic (RFC 7617, for legacy APIs like Jenkins or on-prem Jira), OAuth 2.1 client_credentials, and OAuth 2.1 authorization_code with browser sign-in. 
`static_headers` adds operator-supplied per-call headers alongside the auth header, so APIs that require both an OAuth bearer AND a project/subscription header (Google's `x-goog-user-project` for quota billing, vendor subscription keys) work without code changes. A REST shim at `POST /api/v1/gateway/{connection}/invoke` exposes `api_invoke_endpoint` to non-MCP HTTP clients (Apache NiFi, Airflow `HttpOperator`, `curl`) using the same `Authorization`/`X-API-Key` headers, persona allowlists, route-policy gates, and audit pipeline.\n\nOpenAPI specs are stored in versioned **[API catalogs](https://mcp-data-platform.txn2.com/server/api-catalogs/)**: globally-owned bundles of component specs that any number of connections can reference. One Google Workspace catalog (drive.yaml, calendar.yaml, gmail.yaml) backs every Google Workspace connection in the deployment; an organization running a Salesforce sandbox and a Salesforce production org points both connections at one Salesforce catalog. Specs ingest via paste, file upload, or HTTPS URL (with strict SSRF guards); mutations fan out to live connections without a process restart.\n\n---\n\n## Why mcp-data-platform?\n\n**The Problem**: AI assistants are powerful at querying data, but they're working blind. When Claude asks \"What's in the orders table?\", it gets column names and types. It doesn't know:\n\n- The `customer_id` column contains PII requiring special handling\n- The table is deprecated in favor of `orders_v2`\n- The data quality score dropped last week\n- Who to contact when something looks wrong\n\n**The Solution**: mcp-data-platform injects semantic context at the protocol level. 
Your AI assistant gets business meaning automatically—before it even asks.\n\n### Without vs With\n\n```\n# Without mcp-data-platform\n─────────────────────────────────────────────────────────────────────\nUser:      \"Describe the orders table\"\nAI:        Queries Trino → gets columns and types\nUser:      \"Who owns this data?\"\nAI:        Queries DataHub → finds owners\nUser:      \"Is this table still active?\"\nAI:        Queries DataHub again → finds deprecation status\nUser:      \"What does customer_id actually mean?\"\nAI:        Queries DataHub again → finds column descriptions\n─────────────────────────────────────────────────────────────────────\n4 round trips. Context scattered across conversations. Easy to miss warnings.\n```\n\n```\n# With mcp-data-platform\n─────────────────────────────────────────────────────────────────────\nUser:      \"Describe the orders table\"\nAI:        Gets everything in one response:\n           → Schema: columns and types\n           → ⚠️ DEPRECATED: Use orders_v2 instead\n           → Owners: Data Platform Team\n           → Tags: pii, financial\n           → Quality Score: 87%\n           → Column meanings and business definitions\n─────────────────────────────────────────────────────────────────────\n1 call. Complete context. Warnings front and center.\n```\n\n---\n\n## How It Works\n\n```mermaid\nsequenceDiagram\n    participant AI as AI Assistant\n    participant P as mcp-data-platform\n    participant T as Trino\n    participant D as DataHub\n\n    AI-\u003e\u003eP: trino_describe_table \"orders\"\n    P-\u003e\u003eT: DESCRIBE orders\n    T--\u003e\u003eP: columns, types\n    P-\u003e\u003eD: Get semantic context\n    D--\u003e\u003eP: description, owners, tags, quality, deprecation\n    P--\u003e\u003eAI: Schema + Full Business Context\n```\n\nThe platform intercepts tool responses and enriches them with semantic metadata. 
This **cross-enrichment** pattern means:\n\n- **Trino → DataHub**: Query results include owners, tags, glossary terms, deprecation warnings, quality scores\n- **DataHub → Trino**: Search results include query availability (can this dataset be queried? what's the SQL?)\n- **S3 → DataHub**: Object listings include matching dataset metadata\n- **DataHub → S3**: Dataset searches show storage availability\n\n---\n\n## Features\n\n### Semantic-First Data Access\nEvery data query includes business context from DataHub. Table descriptions, column meanings, data quality scores, and ownership information flow automatically. Your AI assistant understands what data means, not just what it contains.\n\n### Bidirectional Cross-Enrichment\nContext flows between services automatically. Trino results come enriched with DataHub metadata. DataHub searches show which datasets are queryable in Trino. No manual lookups or separate API calls needed.\n\n### Workflow Gating\nLLM agents tend to skip DataHub discovery and jump straight to SQL. Session-aware workflow gating detects this and annotates query results with warnings when no discovery has occurred. Warnings escalate after repeated violations. Built-in description overrides on `trino_query` and `trino_execute` also guide agents to call `datahub_search` first. See the [Middleware Reference](https://mcp-data-platform.txn2.com/reference/middleware/) for details.\n\n### Enterprise Security\nBuilt with a **fail-closed** security model. Missing credentials deny access—never bypass. TLS enforcement for HTTP transport, prompt injection protection, and read-only mode enforcement for sensitive environments. See [MCP Defense: A Case Study in AI Security](https://imti.co/mcp-defense/) for the security architecture rationale.\n\n### OAuth 2.1 Authentication\nNative support for OIDC providers (Keycloak, Auth0, Okta), API keys for service accounts, PKCE for public clients, and Dynamic Client Registration. 
Claude Desktop can authenticate through your existing identity provider. Outbound gateway connections send `oauth_scope` to the IdP verbatim — operators add `offline_access` (Keycloak/Auth0/Okta) or `refresh_token` (Salesforce) themselves to get refresh tokens that survive platform restarts beyond the IdP's interactive SSO session timeout.\n\n### Live Tool Inventory Updates\nWhen a gateway upstream re-authenticates or a connection is added/removed, downstream agents (Claude.ai, Claude Desktop) receive a `notifications/tools/list_changed` event over a long-lived SSE channel — no disconnect / reconnect required. Works in stateless streamable-HTTP mode (the multi-replica deployment shape) via a postgres `LISTEN/NOTIFY` broadcaster; falls back to in-memory fan-out for single-replica deployments.\n\n### Role-Based Personas\nDefine who can use which tools. Analysts get read access to queries and searches. Admins get everything. Tool filtering uses wildcard patterns (allow/deny rules) mapped from your identity provider's roles.\n\n### Comprehensive Audit Logging\nEvery tool call is logged with user identity, persona, request details, and timing. PostgreSQL-backed for querying and compliance. Know who queried what, when, and why.\n\n### Prometheus Metrics\nOpenTelemetry instrumentation exposes `/metrics` on a dedicated `:9090` listener. Phase 1 instruments two chokepoints: every MCP tool call (`mcp_tool_calls_total`, `mcp_tool_call_duration_seconds`, `mcp_inflight_tool_calls`) and every apigateway outbound HTTP call (`apigateway_outbound_total`, `apigateway_outbound_duration_seconds`), each with a small, bounded label set (`tool`, `toolkit_kind`, `persona`, `status_category`, `connection`, `http_status_class`). High-cardinality fields like user id and raw URLs are deliberately kept off labels and reserved for traces (Phase 2). Enabled by default; set `OTEL_METRICS_ENABLED=false` to disable. 
See the [Observability documentation](https://mcp-data-platform.txn2.com/server/observability/) for details.\n\n### Persistent Memory\nAgents accumulate knowledge across sessions: preferences, corrections, domain context, and institutional facts. Backed by PostgreSQL with pgvector for semantic search. The `memory_manage` tool provides CRUD operations, `memory_recall` offers multi-strategy retrieval (entity lookup, vector similarity, DataHub lineage traversal). Memories are automatically injected into toolkit responses via the cross-enrichment middleware. A staleness watcher flags memories when referenced DataHub entities change. Scoped by user and persona with full audit logging. See the [Memory Layer documentation](https://mcp-data-platform.txn2.com/memory/overview/) for details.\n\n### Knowledge Capture\nAI sessions generate valuable domain knowledge: column meanings, data quality issues, business rules. The `capture_insight` tool records these observations during sessions (now backed by the memory layer with vector embeddings), and `apply_knowledge` provides admins with a structured review workflow. Approved insights are written back to DataHub with full changeset tracking and rollback. An [Admin REST API](https://mcp-data-platform.txn2.com/knowledge/admin-api/) supports integration with existing governance tools. See the [Knowledge Capture documentation](https://mcp-data-platform.txn2.com/knowledge/overview/) for details.\n\n### Resource Templates\nBrowse platform data as parameterized MCP resources using RFC 6570 URI templates. Three built-in templates expose table schemas (`schema://catalog.schema/table`), glossary terms (`glossary://term`), and data availability (`availability://catalog.schema/table`) without making tool calls.\n\n### Managed Resources\nHuman-uploaded reference material (samples, playbooks, templates, references) surfaced directly to AI assistants via MCP `resources/list` and `resources/read`. 
Resources are scoped to three visibility levels: global (visible to all authenticated users), persona (visible to users in a specific persona), and user (visible only to the owner). Metadata is stored in PostgreSQL; file blobs are stored in S3. A REST API at `/api/v1/resources` provides CRUD operations, and the Admin Portal includes a dedicated Resources page for uploading, browsing, and managing resources. Enabled automatically when a database is available.\n\n### Progress Notifications\nLong-running Trino queries send granular progress updates to MCP clients (executing, formatting, complete). Clients that provide a `_meta.progressToken` receive real-time status. Zero overhead when disabled.\n\n### Client Logging\nServer-to-client log messages give AI agents visibility into platform decisions (enrichment applied, timing). Uses the MCP `logging/setLevel` protocol — zero overhead if the client hasn't opted in.\n\n### Extensible Middleware Architecture\nAdd custom authentication, rate limiting, or logging. Swap providers to integrate different semantic layers or query engines. The Go library exposes everything—build the platform your organization needs.\n\n---\n\n## Admin Portal\n\nA built-in web dashboard for monitoring, auditing, and managing the platform. Enable with `portal.enabled: true`.\n\n![Admin Dashboard](docs/images/screenshots/light/admin-admin-dashboard-light.webp)\n\n**Dashboard** — Real-time activity timelines, top tools/users, performance percentiles, error monitoring, knowledge insight summary, and connection health.\n\n![Admin Tools page](docs/images/screenshots/light/admin-admin-tools-overview-light.webp)\n\n**Tools** — Master-detail surface for the full tool inventory. Search and group by connection or kind; drill into any tool to see its routing, persona allow/deny matrix, 24h audit aggregate, and cross-enrichment rules. 
Edit the per-tool description override, run the tool inline with auto-generated forms, and toggle global visibility (`tools.deny`) without leaving the page.\n\nSee the [Admin Portal documentation](https://mcp-data-platform.txn2.com/server/admin-portal/) for the complete visual guide.\n\n---\n\n## Use Cases\n\n### Enterprise Data Governance\n- **Compliance-Ready Audit Trails**: Every query logged with user identity and business justification\n- **PII Protection**: Tag-based warnings ensure AI assistants acknowledge sensitive data handling requirements\n- **Access Control**: Persona system enforces who can query what, mapped from your IdP\n- **Deprecation Enforcement**: Deprecated tables surface warnings before AI assistants use stale data\n\n### Data Democratization\n- **Self-Service Analytics**: Business users explore data through AI with context they'd otherwise need to ask engineers for\n- **Cross-Team Discovery**: Search finds datasets across all systems with unified metadata\n- **Onboarding Acceleration**: New team members understand data assets immediately—meanings, owners, quality, and lineage included\n- **Glossary-Driven Exploration**: Business terms connect to actual tables and columns automatically\n\n### AI/ML Workflows\n- **Autonomous Data Exploration**: AI agents discover and understand datasets without human guidance\n- **Feature Discovery**: Find and evaluate potential ML features with quality scores and lineage\n- **Pipeline Understanding**: Trace data lineage to understand feature provenance\n- **Quality Gates**: Data quality scores help AI agents avoid problematic datasets\n\n---\n\n## Architecture\n\n```mermaid\ngraph LR\n    subgraph \"MCP Data Platform\"\n        DataHub[DataHub\u003cbr/\u003eSemantic Metadata]\n        Platform[Platform\u003cbr/\u003eBridge]\n        Trino[Trino\u003cbr/\u003eQuery Engine]\n        S3[S3\u003cbr/\u003eObject Storage]\n\n        DataHub \u003c--\u003e|\"enrichment\"| Platform\n        Platform 
\u003c--\u003e|\"enrichment\"| Trino\n        Platform \u003c--\u003e|\"enrichment\"| S3\n    end\n\n    Client([MCP Client]) --\u003e Platform\n    Platform --\u003e Client\n```\n\n---\n\n## Security\n\nmcp-data-platform implements a **fail-closed** security model designed for enterprise deployments. See [MCP Defense: A Case Study in AI Security](https://imti.co/mcp-defense/) for the security architecture rationale.\n\n| Feature | Description |\n|---------|-------------|\n| **Fail-Closed Authentication** | Missing or invalid credentials deny access (never bypass) |\n| **Required JWT Claims** | Tokens must include `sub` and `exp` claims |\n| **TLS for HTTP Transport** | Configurable TLS with warnings for plaintext connections |\n| **Prompt Injection Protection** | Metadata sanitization prevents injection attacks |\n| **Read-Only Mode** | Trino and S3 toolkits support enforced read-only access |\n| **Default-Deny Personas** | Users without explicit persona assignment have no tool access |\n| **Cryptographic Request IDs** | Request tracing uses secure random identifiers |\n\n### Transport Security\n\n| Transport | Authentication | TLS |\n|-----------|---------------|-----|\n| **stdio** | Not required (local execution) | N/A |\n| **HTTP** | Required (Bearer token or API key) | Strongly recommended |\n\n---\n\n## Installation\n\n### Go Install\n\n```bash\ngo install github.com/txn2/mcp-data-platform/cmd/mcp-data-platform@latest\n```\n\n### From Source\n\n```bash\ngit clone https://github.com/txn2/mcp-data-platform.git\ncd mcp-data-platform\ngo build -o mcp-data-platform ./cmd/mcp-data-platform\n```\n\n---\n\n## Quick Start\n\n### Standalone Server\n\n```bash\n# Run with stdio transport (default)\n./mcp-data-platform\n\n# Run with configuration file\n./mcp-data-platform --config configs/platform.yaml\n\n# Run with HTTP transport (serves both SSE and Streamable HTTP)\n./mcp-data-platform --transport http --address :8080\n```\n\n### Claude Code CLI\n\n```bash\nclaude mcp 
add mcp-data-platform -- mcp-data-platform\n```\n\n### Claude Desktop (Local)\n\nAdd to your `claude_desktop_config.json`:\n\n```json\n{\n  \"mcpServers\": {\n    \"mcp-data-platform\": {\n      \"command\": \"mcp-data-platform\",\n      \"args\": [\"--config\", \"/path/to/platform.yaml\"]\n    }\n  }\n}\n```\n\n### Claude Desktop (Remote with OAuth)\n\nFor connecting Claude Desktop to a remote MCP server with Keycloak authentication:\n\n1. **Configure the MCP server** with OAuth and upstream IdP:\n\n```yaml\nserver:\n  transport: http\n  address: \":8080\"\n\noauth:\n  enabled: true\n  issuer: \"https://mcp.example.com\"\n  clients:\n    - id: \"claude-desktop\"\n      secret: \"${CLAUDE_CLIENT_SECRET}\"\n      redirect_uris:\n        - \"http://localhost\"\n        - \"http://127.0.0.1\"\n  upstream:\n    issuer: \"https://keycloak.example.com/realms/your-realm\"\n    client_id: \"mcp-data-platform\"\n    client_secret: \"${KEYCLOAK_CLIENT_SECRET}\"\n    redirect_uri: \"https://mcp.example.com/oauth/callback\"\n```\n\n2. 
**In Claude Desktop**, add the server with OAuth credentials:\n   - **URL**: `https://mcp.example.com`\n   - **Client ID**: `claude-desktop`\n   - **Client Secret**: (the secret you configured)\n\nWhen you connect, Claude Desktop will open your browser for Keycloak login, then automatically complete the OAuth flow.\n\nSee [OAuth 2.1 Server documentation](https://mcp-data-platform.txn2.com/auth/oauth-server/) for complete setup instructions.\n\n---\n\n## Configuration\n\nCreate a `platform.yaml` configuration file:\n\n```yaml\nserver:\n  name: mcp-data-platform\n  transport: stdio\n\nauth:\n  oidc:\n    enabled: true\n    issuer: \"https://auth.example.com/realms/platform\"\n    client_id: \"mcp-data-platform\"\n  api_keys:\n    enabled: true\n    keys:\n      - key: \"${API_KEY_ADMIN}\"\n        name: \"admin\"\n        roles: [\"admin\"]\n\npersonas:\n  definitions:\n    analyst:\n      display_name: \"Data Analyst\"\n      roles: [\"analyst\"]\n      tools:\n        allow: [\"trino_*\", \"datahub_*\"]\n        deny: [\"*_delete_*\"]\n    admin:\n      display_name: \"Administrator\"\n      roles: [\"admin\"]\n      tools:\n        allow: [\"*\"]\n  default_persona: analyst\n\nsemantic:\n  provider: datahub\n  cache:\n    enabled: true\n    ttl: 5m\n\ninjection:\n  trino_semantic_enrichment: true\n  datahub_query_enrichment: true\n  column_context_filtering: true   # Only enrich columns referenced in SQL (default: true)\n\naudit:\n  enabled: true\n  log_tool_calls: true\n  retention_days: 90\n\ndatabase:\n  dsn: \"${DATABASE_URL}\"\n```\n\n### Managed Resources\n\nManaged resources (scoped file uploads) are configured under `resources.managed`:\n\n```yaml\nresources:\n  managed:\n    enabled: true             # auto-enabled when database is available\n    uri_scheme: \"mcp\"         # URI prefix (default: \"mcp\")\n    s3_connection: \"primary\"  # name of S3 toolkit instance for blob storage\n    s3_bucket: \"resources\"    # S3 bucket for uploaded files\n```\n\n### Environment Variables\n\n| Variable | Description | Default 
|\n|----------|-------------|---------|\n| `DATABASE_URL` | PostgreSQL connection string for audit logs | - |\n| `API_KEY_ADMIN` | Admin API key (if using API key auth) | - |\n\n---\n\n## Core Packages\n\n| Package | Description |\n|---------|-------------|\n| `pkg/platform` | Main platform facade and configuration |\n| `pkg/auth` | OIDC and API key authentication |\n| `pkg/oauth` | OAuth 2.1 server with DCR and PKCE |\n| `pkg/persona` | Role-based personas and tool filtering |\n| `pkg/semantic` | Semantic metadata provider abstraction |\n| `pkg/query` | Query execution provider abstraction |\n| `pkg/middleware` | Request/response middleware chain |\n| `pkg/mcpcontext` | MCP session/progress context helpers |\n| `pkg/registry` | Toolkit registration and management |\n| `pkg/audit` | Audit logging with PostgreSQL storage |\n| `pkg/tuning` | Prompts, hints, and operational rules |\n| `pkg/storage` | S3-compatible storage provider abstraction |\n| `pkg/portal` | Asset portal types, stores, and S3 client for AI-generated artifacts |\n| `pkg/resource` | Managed resources: scoped file uploads, REST API, MCP integration |\n| `pkg/toolkits` | Toolkit implementations (Trino, DataHub, S3, Knowledge, Memory, Portal, Gateway) |\n| `pkg/admin` | Admin REST API for tools, personas, config, audit, knowledge, memory, connections, gateway, OAuth, and resources |\n| `pkg/client` | Platform client utilities |\n\n---\n\n## Library Usage\n\nThe platform can be imported and used as a library:\n\n```go\npackage main\n\nimport (\n    \"context\"\n    \"log\"\n\n    \"github.com/txn2/mcp-data-platform/pkg/platform\"\n)\n\nfunc main() {\n    // Root context; swap in a cancellable context for real deployments\n    ctx := context.Background()\n\n    // Load configuration\n    cfg, err := platform.LoadConfig(\"platform.yaml\")\n    if err != nil {\n        log.Fatal(err)\n    }\n\n    // Create platform\n    p, err := platform.New(platform.WithConfig(cfg))\n    if err != nil {\n        log.Fatal(err)\n    }\n    defer p.Close()\n\n    // Start the platform\n    if err := p.Start(ctx); err != nil {\n        log.Fatal(err)\n    }\n\n    // Access the MCP server\n    mcpServer := p.MCPServer()\n    _ = mcpServer // register tools or serve requests from here\n}\n```\n\n---\n\n## 
Development\n\n```bash\n# Run tests with race detection\ngo test -race ./...\n\n# Run linter\ngolangci-lint run ./...\n\n# Run security scan\ngosec ./...\n\n# Run SAST (Semgrep + CodeQL)\nmake sast\n\n# Build\ngo build -o mcp-data-platform ./cmd/mcp-data-platform\n```\n\n---\n\n## Documentation\n\nFull documentation is available at [mcp-data-platform.txn2.com](https://mcp-data-platform.txn2.com/).\n\n- [Server Guide](https://mcp-data-platform.txn2.com/server/overview/) - Configuration and deployment\n- [Cross-Enrichment](https://mcp-data-platform.txn2.com/cross-enrichment/overview/) - How automatic enrichment works\n- [Authentication](https://mcp-data-platform.txn2.com/auth/overview/) - OIDC, API keys, and OAuth 2.1\n- [Go Library](https://mcp-data-platform.txn2.com/library/overview/) - Build custom MCP servers\n- [API Reference](https://mcp-data-platform.txn2.com/reference/tools-api/) - Complete tool documentation\n\n---\n\n## Contributing\n\nWe welcome contributions for bug fixes, tests, and documentation. Please ensure:\n\n1. All tests pass (`go test -race ./...`)\n2. Code is formatted (`gofmt`)\n3. Linter passes (`golangci-lint run ./...`)\n4. Security scan passes (`gosec ./...`)\n5. SAST passes (`make sast` — Semgrep + CodeQL)\n\n---\n\n## License\n\n[Apache License 2.0](LICENSE)\n\n---\n\nOpen source by [Craig Johnston](https://twitter.com/cjimti), sponsored by [Deasil Works, Inc.](https://deasil.works/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftxn2%2Fmcp-data-platform","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftxn2%2Fmcp-data-platform","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftxn2%2Fmcp-data-platform/lists"}