{"id":50948822,"url":"https://github.com/oduerr/stan-mcp-server","last_synced_at":"2026-06-17T23:04:32.482Z","repository":{"id":350624669,"uuid":"1207637748","full_name":"oduerr/stan-mcp-server","owner":"oduerr","description":null,"archived":false,"fork":false,"pushed_at":"2026-04-23T11:44:01.000Z","size":43,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-23T13:27:12.288Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oduerr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-11T07:34:21.000Z","updated_at":"2026-04-23T11:44:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/oduerr/stan-mcp-server","commit_stats":null,"previous_names":["oduerr/stan-mcp-server"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/oduerr/stan-mcp-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oduerr%2Fstan-mcp-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oduerr%2Fstan-mcp-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oduerr%2Fstan-mcp-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oduerr%2Fstan-mcp-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/owners/oduerr","download_url":"https://codeload.github.com/oduerr/stan-mcp-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oduerr%2Fstan-mcp-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34468785,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-17T02:00:05.408Z","response_time":127,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-17T23:04:31.823Z","updated_at":"2026-06-17T23:04:32.476Z","avatar_url":"https://github.com/oduerr.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Stan MCP Server\n\nA standalone MCP server that gives an LLM agent structured access to\nCmdStan/CmdStanPy over HTTP.  The agent receives compact JSON — never raw\nsampler output.\n\nLarge datasets are uploaded directly from the client to the server's HTTP\nupload endpoint, so CSV content never passes through LLM context.\n\n## Quick start\n\n```bash\n# 1. Install (requires uv — https://docs.astral.sh/uv/getting-started/installation/)\nuv pip install -e .\n\n# 2. Install CmdStan (once)\npython -c \"import cmdstanpy; cmdstanpy.install_cmdstan()\"\n\n# 3. 
Start the server\nstan-mcp-server \\\n  --datasets-dir datasets \\\n  --results-dir  results\n```\n\nThe MCP server listens at `http://127.0.0.1:8765/mcp` and the HTTP upload\nendpoint is at `http://127.0.0.1:8766/dataset/{name}` by default.\n\n## Prerequisites\n\n- Python ≥ 3.10\n- [uv](https://docs.astral.sh/uv/getting-started/installation/) (`curl -LsSf https://astral.sh/uv/install.sh | sh`)\n- CmdStan (installed via the step above)\n\n## Installation\n\n```bash\n# Recommended: uv (fast, isolated)\nuv pip install -e .\n\n# Alternative: plain pip\npip install -e .\n```\n\n## Running the server\n\n```bash\n# The values shown for --host, --port (MCP endpoint), and --upload-port\n# (HTTP upload endpoint; 0 to disable) are the defaults.\nstan-mcp-server \\\n  --datasets-dir /path/to/datasets \\\n  --results-dir  /path/to/results \\\n  --host 127.0.0.1 \\\n  --port 8765 \\\n  --upload-port 8766\n```\n\nThe server can also be started directly with Python (after activating the `uv` virtual environment with `source .venv/bin/activate`):\n\n```bash\npython stan_mcp_server/server.py \\\n  --datasets-dir datasets \\\n  --results-dir results\n```\n\n\n## Tools\n\n| Tool | Purpose |\n|------|---------|\n| `get_capabilities` | Query available tools, server configuration, and upload URL |\n| `list_datasets` | List pre-staged and uploaded datasets |\n| `get_data_summary` | Compact EDA for a named dataset (includes `tier` and `has_test`) |\n| `check_model` | Compile-only check (syntax + `log_lik` presence) |\n| `fit_and_evaluate` | Sample + compute NLPD on held-out test; pre-staged datasets only |\n| `sample` | Sample; returns scalar diagnostics + run asset paths |\n| `get_upload_instructions` | Return HTTP upload URL and field names for datasets |\n| `get_run_history` | Return the logged NLPD history for a dataset |\n\n**Recommended call order:**\n`get_capabilities` → `list_datasets` → `get_data_summary` → `check_model` →\n- **Pre-staged dataset** (`tier: staged`): `fit_and_evaluate` → `get_run_history`\n- **Uploaded dataset** (`tier: uploaded`): 
`sample` → compute PSIS-LOO yourself\n\n## Run assets — logs and posterior draws\n\nEvery `sample` and `fit_and_evaluate` call persists results under a short\n`run_id` and returns only scalar diagnostics plus filesystem paths.  Bulk data\n**never enters LLM context**.\n\n```json\n{\n  \"run_id\":        \"3a7f9c1e20b4\",\n  \"nlpd\":          1.423,\n  \"r_hat_max\":     1.003,\n  \"n_divergences\": 0,\n  \"ess_bulk_min\":  2841,\n  \"runtime_sec\":   4.2,\n  \"logs_path\":     \"/path/to/results/_runs/3a7f9c1e20b4/logs.txt\",\n  \"samples_path\":  \"/path/to/results/_runs/3a7f9c1e20b4\"\n}\n```\n\n`samples_path` is a directory containing one Stan CSV per chain.  Load them\ndirectly (requires `arviz`):\n\n```python\nimport glob, arviz as az\ncsvs = sorted(glob.glob(\"/path/to/results/_runs/3a7f9c1e20b4/samples/*.csv\"))\nidata = az.from_cmdstan(csvs)\n```\n\nRun assets are stored under `\u003cresults-dir\u003e/_runs/\u003crun_id\u003e/` and are never\nautomatically deleted.\n\n## Uploading datasets at runtime\n\nThe HTTP upload endpoint accepts **training data only**.  Test data must be\nplaced manually by the server operator — it never passes through the agent or\nHTTP layer.  
This is a deliberate security boundary: the agent cannot see\nheld-out labels even in principle.\n\nThe LLM calls `get_upload_instructions()` to retrieve the URL and field names,\nthen passes them to the user or an automated client.\n\n### Two-tier dataset system\n\n| Tier | How created | `fit_and_evaluate` | Suggested evaluation |\n|------|-------------|---------------------|----------------------|\n| **staged** | Server operator places `train.csv` + `protected/test.csv` | ✅ real held-out NLPD | `fit_and_evaluate` |\n| **uploaded** | Agent/user uploads via HTTP (train only) | ❌ blocked | `sample` + PSIS-LOO |\n\n`get_data_summary` returns `tier` and `has_test` so the agent knows which\npath to follow before writing any Stan code.\n\n### HTTP upload endpoint\n\n```bash\ncurl -X POST http://127.0.0.1:8766/dataset/my_experiment \\\n     -F train=@train.csv \\\n     -F dataset_md=@dataset.md   # optional\n```\n\nOr from Python:\n\n```python\nimport requests\n\n# Open in binary mode so requests sends the file bytes unchanged.\nwith open(\"train.csv\", \"rb\") as tr:\n    r = requests.post(\n        \"http://127.0.0.1:8766/dataset/my_experiment\",\n        files={\"train\": tr},\n    )\nr.raise_for_status()\nprint(r.json())   # {\"status\": \"ok\", \"tier\": \"uploaded\", \"dataset\": \"_uploaded/my_experiment\", ...}\n```\n\nAfter a successful upload, pass `_uploaded/my_experiment` to `sample` /\n`get_data_summary`.  To enable `fit_and_evaluate`, place test data at\n`\u003cdatasets-dir\u003e/_uploaded/my_experiment/protected/test.csv` manually.\n\nTo disable the HTTP endpoint entirely:\n\n```bash\nstan-mcp-server --datasets-dir ... --results-dir ... 
--upload-port 0\n```\n\nUploaded datasets are stored under `\u003cdatasets-dir\u003e/_uploaded/` on the server.\nDataset names may only contain letters, digits, underscores, and hyphens.\n\n## Dataset layout\n\nDatasets live under `--datasets-dir` in two areas:\n\n```\ndatasets/\n  benchmarks/             ← pre-staged benchmark datasets (operator-managed)\n    regression_1d/\n      train.csv           ← training features + response\n      dataset.md          ← description + ## Data Interface block\n      protected/\n        test.csv          ← held-out test features + response (operator-placed)\n  _uploaded/              ← agent-uploaded, train-only datasets\n    my_experiment/\n      train.csv\n      dataset.md          ← optional\n```\n\nThe dataset name passed to tools is the path relative to `--datasets-dir`,\ne.g. `benchmarks/regression_1d` or `_uploaded/my_experiment`.\n\nThe `protected/test.csv` file is what makes a dataset \"staged\" and enables\n`fit_and_evaluate`.  Uploaded datasets lack this file and are limited to\n`sample` + PSIS-LOO.\n\nThe LLM only needs to pass the dataset name — the server loads data\nautomatically:\n\n```python\nfit_and_evaluate(stan_code=..., dataset=\"benchmarks/regression_1d\", notes=\"...\", rationale=\"...\")\n```\n\n`N_train` and `N_test` are injected automatically from the CSV row counts.\nOnly pass the `data` parameter when you need to override them or supply\nadditional scalars the CSV does not provide.\n\n### dataset.md convention\n\nThe `## Data Interface` section must contain a Stan-style code block\ndeclaring all `_train` variables.  
Stan base names must match CSV column\nnames exactly (the `_train` / `_test` suffix is appended automatically):\n\n```stan\nint\u003clower=0\u003e N_train;\nint\u003clower=0\u003e N_test;\nvector[N_train] x_train;\nvector[N_train] y_train;\nvector[N_test]  x_test;\nvector[N_test]  y_test;\n```\n\nFor datasets with a grouping variable (`J`), declare it as:\n\n```stan\nint\u003clower=0\u003e J;\narray[N_train] int\u003clower=1,upper=J\u003e group_train;\n```\n\nThe last CSV column is assumed to be the response unless `response_col: \u003cname\u003e`\nappears anywhere in `dataset.md`.\n\n## Model contract\n\nEvery Stan model used with `fit_and_evaluate` must output a `log_lik`\nvector of length `N_test` in `generated quantities`:\n\n```stan\ngenerated quantities {\n    vector[N_test] log_lik;\n    for (i in 1:N_test)\n        log_lik[i] = normal_lpdf(y_test[i] | mu[i], sigma);\n}\n```\n\n## Compilation cache\n\nCompiled Stan binaries are stored in a temp directory keyed by the\nSHA-256 of the model source.  Identical model code is never recompiled.\n\n## Remote deployment\n\nThe recommended pattern for running the server on a remote machine (e.g. a\nGPU workstation or cloud VM accessible via VPN):\n\n### 1. Start the server on the remote machine\n\n```bash\n# --host 127.0.0.1 keeps the MCP port local; the SSH tunnel handles access.\nstan-mcp-server \\\n  --host 127.0.0.1 \\\n  --datasets-dir /data/datasets \\\n  --results-dir  /data/results \\\n  --token $(openssl rand -hex 32)   # save this token\n```\n\n### 2. Tunnel the MCP port via SSH\n\n```bash\nssh -N -L 8765:127.0.0.1:8765 user@remote-host\n```\n\nThe MCP endpoint is now reachable at `http://127.0.0.1:8765/mcp` on your\nlocal machine.\n\n### 3. 
Mount the results directory via SSHFS\n\n```bash\nmkdir -p ~/mnt/stan-results\nsshfs user@remote-host:/data/results ~/mnt/stan-results\n```\n\nBecause tool responses return `logs_path` / `samples_path` as absolute paths\nunder `--results-dir`, and the mount makes those paths locally accessible,\nthe agent can read logs and samples directly.\n\n### 4. Connect from Claude Desktop\n\n```json\n{\n  \"mcpServers\": {\n    \"stan\": {\n      \"url\": \"http://127.0.0.1:8765/mcp\",\n      \"headers\": { \"Authorization\": \"Bearer \u003cyour-token\u003e\" }\n    }\n  }\n}\n```\n\n\u003e **Note:** The HTTP download endpoints (`GET /logs/{run_id}`, `GET /samples/{run_id}`)\n\u003e have been removed. Access run assets directly via the SSHFS-mounted `results_dir`\n\u003e using the `logs_path` / `samples_path` returned in tool responses.\n\n## Security\n\nFor remote deployments (i.e. `--host 0.0.0.0`) protect the server with a\nbearer token using the built-in `--token` flag.\n\n### 1. Generate a token\n\n```bash\nopenssl rand -hex 32\n# e.g. a3f8c2d1e4b5...\n```\n\n### 2. Start the server with the token\n\n```bash\nstan-mcp-server \\\n  --host 0.0.0.0 \\\n  --datasets-dir /path/to/datasets \\\n  --results-dir  /path/to/results \\\n  --token a3f8c2d1e4b5...\n```\n\nAlternatively, set the environment variable `STAN_MCP_TOKEN` instead of\npassing `--token` on the command line (useful for keeping secrets out of\nshell history):\n\n```bash\nexport STAN_MCP_TOKEN=a3f8c2d1e4b5...\nstan-mcp-server --host 0.0.0.0 --datasets-dir ... --results-dir ...\n```\n\nBoth the MCP endpoint (port 8765) and the HTTP upload endpoint (port 8766)\nrequire the token.  Requests without a valid `Authorization: Bearer \u003ctoken\u003e`\nheader receive `401 Unauthorized`.\n\n### 3. 
Connect from Claude Desktop\n\n```json\n{\n  \"mcpServers\": {\n    \"stan\": {\n      \"url\": \"http://\u003cserver-ip\u003e:8765/mcp\",\n      \"headers\": { \"Authorization\": \"Bearer a3f8c2d1e4b5...\" }\n    }\n  }\n}\n```\n\n### 4. Upload datasets with the token\n\n```bash\ncurl -X POST http://\u003cserver-ip\u003e:8766/dataset/my_experiment \\\n     -H \"Authorization: Bearer a3f8c2d1e4b5...\" \\\n     -F train=@train.csv\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foduerr%2Fstan-mcp-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foduerr%2Fstan-mcp-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foduerr%2Fstan-mcp-server/lists"}