{"id":50380773,"url":"https://github.com/cyanheads/socrata-mcp-server","last_synced_at":"2026-05-30T12:00:31.623Z","repository":{"id":359910172,"uuid":"1247733590","full_name":"cyanheads/socrata-mcp-server","owner":"cyanheads","description":"Search and query government open-data portals (Socrata SODA API) via MCP. STDIO or Streamable HTTP.","archived":false,"fork":false,"pushed_at":"2026-05-24T05:00:40.000Z","size":404,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-24T05:05:14.974Z","etag":null,"topics":["ai","bun","civic-data","government-data","llm","mcp","mcp-server","model-context-protocol","open-data","socrata","soda-api","stdio","streamable-http","typescript"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/@cyanheads/socrata-mcp-server","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cyanheads.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-23T17:52:15.000Z","updated_at":"2026-05-24T05:00:42.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cyanheads/socrata-mcp-server","commit_stats":null,"previous_names":["cyanheads/socrata-mcp-server"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/cyanheads/socrata-mcp-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyanheads%2Fsocrata-mcp-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyanheads%2Fsocrata-mcp-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyanheads%2Fsocrata-mcp-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyanheads%2Fsocrata-mcp-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cyanheads","download_url":"https://codeload.github.com/cyanheads/socrata-mcp-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyanheads%2Fsocrata-mcp-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33691312,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","bun","civic-data","government-data","llm","mcp","mcp-server","model-context-protocol","open-data","socrata","soda-api","stdio","streamable-http","typescript"],"created_at":"2026-05-30T12:00:28.154Z","updated_at":"2026-05-30T12:00:31.591Z","avatar_url":"https://github.com/cyanheads.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n  \u003ch1\u003e@cyanheads/socrata-mcp-server\u003c/h1\u003e\n  \u003cp\u003e\u003cb\u003eSearch and query government open-data portals (Socrata SODA API) via MCP. STDIO or Streamable HTTP.\u003c/b\u003e\n  \u003cdiv\u003e6 Tools • 2 Resources • 1 Prompt\u003c/div\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![Version](https://img.shields.io/badge/Version-0.1.5-blue.svg?style=flat-square)](./CHANGELOG.md) [![License](https://img.shields.io/badge/License-Apache%202.0-orange.svg?style=flat-square)](./LICENSE) [![Docker](https://img.shields.io/badge/Docker-ghcr.io-2496ED?style=flat-square\u0026logo=docker\u0026logoColor=white)](https://github.com/users/cyanheads/packages/container/package/socrata-mcp-server) [![MCP SDK](https://img.shields.io/badge/MCP%20SDK-^1.29.0-green.svg?style=flat-square)](https://modelcontextprotocol.io/) [![npm](https://img.shields.io/npm/v/%40cyanheads%2Fsocrata-mcp-server?style=flat-square\u0026logo=npm\u0026logoColor=white)](https://www.npmjs.com/package/@cyanheads/socrata-mcp-server) [![TypeScript](https://img.shields.io/badge/TypeScript-^6.0.3-3178C6.svg?style=flat-square)](https://www.typescriptlang.org/) [![Bun](https://img.shields.io/badge/Bun-v1.3.0-blueviolet.svg?style=flat-square)](https://bun.sh/)\n\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![Install in Claude Desktop](https://img.shields.io/badge/Install_in-Claude_Desktop-D97757?style=for-the-badge\u0026logo=anthropic\u0026logoColor=white)](https://github.com/cyanheads/socrata-mcp-server/releases/latest/download/socrata-mcp-server.mcpb) [![Install in Cursor](https://cursor.com/deeplink/mcp-install-dark.svg)](https://cursor.com/en/install-mcp?name=socrata-mcp-server\u0026config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBjeWFuaGVhZHMvc29jcmF0YS1tY3Atc2VydmVyIl19) [![Install in VS Code](https://img.shields.io/badge/VS_Code-Install_Server-0098FF?style=for-the-badge\u0026logo=visualstudiocode\u0026logoColor=white)](https://vscode.dev/redirect?url=vscode:mcp/install?%7B%22name%22%3A%22socrata-mcp-server%22%2C%22command%22%3A%22npx%22%2C%22args%22%3A%5B%22-y%22%2C%22%40cyanheads%2Fsocrata-mcp-server%22%5D%7D)\n\n[![Framework](https://img.shields.io/badge/Built%20on-@cyanheads/mcp--ts--core-67E8F9?style=flat-square)](https://www.npmjs.com/package/@cyanheads/mcp-ts-core)\n\n**Public Hosted Server:** [https://socrata.caseyjhand.com/mcp](https://socrata.caseyjhand.com/mcp)\n\n\u003c/div\u003e\n\n---\n\n## Tools\n\nSix tools covering the full Socrata workflow — portal discovery, dataset search, schema inspection, SoQL querying, and DuckDB-powered analytical SQL over large result sets:\n\n| Tool | Description |\n|:---|:---|\n| `socrata_list_portals` | List known Socrata-powered government open-data portals with domain, organization name, and dataset count |\n| `socrata_find_datasets` | Search for datasets across all Socrata portals or scope to one portal via the Discovery API |\n| `socrata_get_dataset` | Fetch full metadata and typed column schema for a dataset by ID — required before writing SoQL queries |\n| `socrata_query_dataset` | Execute a SoQL query against any dataset: search, select, where, group, having, order, with DataCanvas spillover |\n| `socrata_dataframe_describe` | List registered tables in a DataCanvas session — schema, row count, column names |\n| `socrata_dataframe_query` | Run SELECT-only SQL against DataCanvas tables populated by `socrata_query_dataset` |\n\n### `socrata_list_portals`\n\nList known Socrata-powered government open-data portals.\n\n- Backed by the Discovery API domains catalog — hundreds of city, county, state, and federal portals\n- Client-side substring filtering on domain or organization name\n- Pagination (up to 200 per page) with offset\n- Returns domain (pass to `socrata_find_datasets`), organization name, and dataset count\n- Use this first when you don't know which portal to target\n\n---\n\n### `socrata_find_datasets`\n\nSearch for datasets across all Socrata portals or scope to a single portal.\n\n- Full-text search across dataset names and descriptions\n- Scope to a single portal with the `domain` parameter\n- Filter by category (e.g. `[\"Public Safety\", \"Transportation\"]`) and tags (e.g. `[\"covid19\"]`)\n- Asset type filtering: datasets, maps, files, calendars, stories\n- Sort by relevance, page views, created date, or updated date\n- Pagination (up to 100 per page) with offset\n- Returns dataset IDs, names, abbreviated column previews, domains, and update timestamps\n- Column names here are preview-only — call `socrata_get_dataset` for typed schema before writing queries\n- Recovery hints on empty results — echoes applied filters and suggests how to broaden\n\n---\n\n### `socrata_get_dataset`\n\nFetch full metadata and column schema for a Socrata dataset by ID.\n\n- Returns field names, Socrata data types, descriptions, row count, and licensing\n- Column `data_type` determines correct WHERE clause syntax: `Number` → bare literals (`year=2023`), `Text` → single-quoted strings (`year='2023'`)\n- Excludes computed region columns (`:@computed_region_*`) to reduce noise\n- Per-column non-null row counts when available\n- Always call this before writing a `socrata_query_dataset` query\n\n---\n\n### `socrata_query_dataset`\n\nExecute a SoQL query against any dataset on any Socrata portal.\n\n- `search` parameter for quick full-text lookup across all text columns (`$q`)\n- `select`, `where`, `group`, `having`, `order` for full analytical control\n- SoQL operators: `=`, `!=`, `\u003e`, `\u003c`, `LIKE`, `IN(...)`, `BETWEEN`, `IS NULL`, `starts_with()`, `contains()`, `AND`, `OR`, `NOT`\n- Aggregation: `count(*)`, `sum()`, `avg()`, `min()`, `max()` with `group` and `having`\n- Pagination up to 5000 rows per call with offset; `total_count` returned when result is truncated\n- `assembled_query` in the response echoes the SoQL string for learning the syntax\n- All SODA 2.1 row values are strings — geo/location columns return nested objects\n- When `CANVAS_PROVIDER_TYPE=duckdb` and result hits the limit, rows spill to a DataCanvas table for SQL-based analysis\n\n---\n\n### `socrata_dataframe_describe`\n\nList registered tables in a DataCanvas session.\n\n- Shows table name, row count, and DuckDB-inferred column types for each registered table\n- Only meaningful when `CANVAS_PROVIDER_TYPE=duckdb` is set\n- Use after `socrata_query_dataset` spills a large result set\n- Returns canvas ID for use in `socrata_dataframe_query`\n\n---\n\n### `socrata_dataframe_query`\n\nRun SELECT-only SQL against DataCanvas tables populated by `socrata_query_dataset`.\n\n- DuckDB infers types from spilled data — numeric columns that SODA returned as strings become queryable with numeric comparisons (`year \u003e 2020`, `amount \u003c 500`)\n- SELECT-only enforcement: DDL, DML, and file-reading functions (`read_csv`, `read_parquet`) are rejected\n- Up to 10,000 rows per call\n- Only works when `CANVAS_PROVIDER_TYPE=duckdb` is set\n\n## Resources and prompts\n\n| Type | Name | Description |\n|:---|:---|:---|\n| Resource | `socrata://datasets/{domain}/{datasetId}` | Fetch full metadata and column schema for a dataset by stable URI — same payload as `socrata_get_dataset` |\n| Resource | `socrata://portals` | Paginated list of known Socrata portals with organization name and dataset count |\n| Prompt | `explore_open_data` | Structured six-step civic data investigation workflow: find portal → discover datasets → inspect schema → query → aggregate → synthesize |\n\nAll resource data is also reachable via tools. Use the corresponding tool for agent workflows — resources are for clients that support URI-addressable data.\n\n## Features\n\nBuilt on [`@cyanheads/mcp-ts-core`](https://github.com/cyanheads/mcp-ts-core):\n\n- Declarative tool, resource, and prompt definitions — single file per primitive, framework handles registration and validation\n- Unified error handling — handlers throw, framework catches, classifies, and formats\n- Pluggable auth: `none`, `jwt`, `oauth`\n- Swappable storage backends: `in-memory`, `filesystem`, `Supabase`, `Cloudflare KV/R2/D1`\n- Structured logging with optional OpenTelemetry tracing\n- STDIO and Streamable HTTP transports\n- Optional DataCanvas (DuckDB) for analytical SQL over large result sets\n\nSocrata-specific:\n\n- Full Socrata SODA 2.1 API integration — SoQL query builder with select, where, group, having, order, search, limit, offset\n- Discovery API for cross-portal dataset search and portal catalog\n- App token support (`SOCRATA_APP_TOKEN`) for higher per-IP rate limits\n- Configurable default portal domain via `SOCRATA_DEFAULT_DOMAIN`\n- Computed region column filtering to reduce noise in wide datasets\n- DataCanvas spillover — large query results automatically register as DuckDB tables for SQL analysis\n\nAgent-friendly output:\n\n- Assembled SoQL string echoed in every `socrata_query_dataset` response so agents can learn and refine syntax\n- Recovery hints on empty results — echoes applied filters with specific suggestions for broadening\n- Column type context embedded in schema output with WHERE-clause quoting rules stated explicitly\n- Per-item structured error reasons (`invalid_id`, `not_found`, `soql_error`, `rate_limited`) with actionable recovery text\n\n## Getting started\n\nAdd the following to your MCP client configuration file.\n\n```json\n{\n  \"mcpServers\": {\n    \"socrata\": {\n      \"type\": \"stdio\",\n      \"command\": \"bunx\",\n      \"args\": [\"@cyanheads/socrata-mcp-server@latest\"],\n      \"env\": {\n        \"MCP_TRANSPORT_TYPE\": \"stdio\",\n        \"MCP_LOG_LEVEL\": \"info\"\n      }\n    }\n  }\n}\n```\n\nOr with npx (no Bun required):\n\n```json\n{\n  \"mcpServers\": {\n    \"socrata\": {\n      \"type\": \"stdio\",\n      \"command\": \"npx\",\n      \"args\": [\"-y\", \"@cyanheads/socrata-mcp-server@latest\"],\n      \"env\": {\n        \"MCP_TRANSPORT_TYPE\": \"stdio\",\n        \"MCP_LOG_LEVEL\": \"info\"\n      }\n    }\n  }\n}\n```\n\nOr with Docker:\n\n```json\n{\n  \"mcpServers\": {\n    \"socrata\": {\n      \"type\": \"stdio\",\n      \"command\": \"docker\",\n      \"args\": [\n        \"run\", \"-i\", \"--rm\",\n        \"-e\", \"MCP_TRANSPORT_TYPE=stdio\",\n        \"ghcr.io/cyanheads/socrata-mcp-server:latest\"\n      ]\n    }\n  }\n}\n```\n\nFor Streamable HTTP, set the transport and start the server:\n\n```sh\nMCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http\n# Server listens at http://localhost:3010/mcp\n```\n\n### Prerequisites\n\n- [Bun v1.3.0](https://bun.sh/) or higher (or Node.js v24+).\n- Optional: A Socrata app token — register for free at any portal (e.g. [data.seattle.gov](https://data.seattle.gov)) to get higher rate limits (10 req/s per token vs. shared throttled pool without one).\n\n### Installation\n\n1. **Clone the repository:**\n\n```sh\ngit clone https://github.com/cyanheads/socrata-mcp-server.git\n```\n\n2. **Navigate into the directory:**\n\n```sh\ncd socrata-mcp-server\n```\n\n3. **Install dependencies:**\n\n```sh\nbun install\n```\n\n4. **Configure environment:**\n\n```sh\ncp .env.example .env\n# edit .env and set SOCRATA_APP_TOKEN if you have one\n```\n\n## Configuration\n\nAll configuration is validated at startup via Zod schemas in `src/config/server-config.ts`. Key environment variables:\n\n| Variable | Description | Default |\n|:---|:---|:---|\n| `SOCRATA_APP_TOKEN` | Socrata app token (X-App-Token header). Without a token, requests share a throttled pool per source IP. | — |\n| `SOCRATA_DEFAULT_DOMAIN` | Default portal domain when `domain` is omitted from tool calls. | `data.seattle.gov` |\n| `MCP_TRANSPORT_TYPE` | Transport: `stdio` or `http`. | `stdio` |\n| `MCP_HTTP_PORT` | Port for HTTP server. | `3010` |\n| `MCP_AUTH_MODE` | Auth mode: `none`, `jwt`, or `oauth`. | `none` |\n| `MCP_LOG_LEVEL` | Log level (RFC 5424): `debug`, `info`, `notice`, `warning`, `error`. | `info` |\n| `CANVAS_PROVIDER_TYPE` | Set to `duckdb` to enable DataCanvas spillover for large result sets. | — |\n| `LOGS_DIR` | Directory for log files (Node.js only). | `\u003cproject-root\u003e/logs` |\n| `STORAGE_PROVIDER_TYPE` | Storage backend: `in-memory`, `filesystem`, `supabase`, `cloudflare-kv/r2/d1`. | `in-memory` |\n| `OTEL_ENABLED` | Enable [OpenTelemetry instrumentation](https://github.com/cyanheads/mcp-ts-core/tree/main/docs/telemetry). | `false` |\n\nSee [`.env.example`](./.env.example) for the full list of optional overrides.\n\n## Running the server\n\n### Local development\n\n- **Build and run:**\n\n  ```sh\n  # One-time build\n  bun run rebuild\n\n  # Run the built server\n  bun run start:stdio\n  # or\n  bun run start:http\n  ```\n\n- **Run checks and tests:**\n\n  ```sh\n  bun run devcheck   # Lint, format, typecheck, security audit\n  bun run test       # Vitest test suite\n  ```\n\n### Docker\n\n```sh\ndocker build -t socrata-mcp-server .\ndocker run --rm -e MCP_TRANSPORT_TYPE=http -p 3010:3010 socrata-mcp-server\n```\n\nThe Dockerfile defaults to HTTP transport, stateless session mode, and logs to `/var/log/socrata-mcp-server`. OpenTelemetry peer dependencies are installed by default — build with `--build-arg OTEL_ENABLED=false` to omit them.\n\n## Project structure\n\n| Directory | Purpose |\n|:---|:---|\n| `src/index.ts` | `createApp()` entry point — registers tools, resources, prompts, and inits the Socrata service. |\n| `src/config` | Server-specific environment variable parsing and validation with Zod. |\n| `src/mcp-server/tools` | Tool definitions (`*.tool.ts`). Six tools covering portal listing, dataset search, schema fetch, SoQL query, and DataCanvas SQL. |\n| `src/mcp-server/resources` | Resource definitions (`*.resource.ts`). Dataset metadata and portal catalog resources. |\n| `src/mcp-server/prompts` | Prompt definitions (`*.prompt.ts`). Civic data investigation workflow prompt. |\n| `src/services/socrata` | Socrata service layer — SODA 2.1 API client, Discovery API, query builder, type normalization. |\n| `tests/` | Unit and integration tests mirroring `src/`. |\n\n## Development guide\n\nSee [`CLAUDE.md`](./CLAUDE.md) for development guidelines and architectural rules. The short version:\n\n- Handlers throw, framework catches — no `try/catch` in tool logic\n- Use `ctx.log` for request-scoped logging, `ctx.state` for tenant-scoped storage\n- Call `socrata_get_dataset` before writing WHERE clauses — column `data_type` determines quoting\n- Wrap external API calls: validate raw → normalize to domain type → return output schema; never fabricate missing fields\n\n## Contributing\n\nIssues and pull requests are welcome. Run checks and tests before submitting:\n\n```sh\nbun run devcheck\nbun run test\n```\n\n## License\n\nApache-2.0 — see [LICENSE](LICENSE) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyanheads%2Fsocrata-mcp-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyanheads%2Fsocrata-mcp-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyanheads%2Fsocrata-mcp-server/lists"}