{"id":47928310,"url":"https://github.com/statsclaw/statsclaw","last_synced_at":"2026-04-04T07:02:55.292Z","repository":{"id":347882760,"uuid":"1193563602","full_name":"statsclaw/statsclaw","owner":"statsclaw","description":"Paper in, package out. An agent teams framework that turns statistical papers into production-ready packages.","archived":false,"fork":false,"pushed_at":"2026-03-29T21:21:10.000Z","size":488,"stargazers_count":1,"open_issues_count":3,"forks_count":2,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-30T00:24:21.330Z","etag":null,"topics":["agent-teams","causal-inference","claude-code","econometrics","monte-carlo","paper-to-package","python-package","r-package","stata","statistics"],"latest_commit_sha":null,"homepage":"https://statsclaw.ai","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/statsclaw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-27T11:06:12.000Z","updated_at":"2026-03-28T13:19:15.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/statsclaw/statsclaw","commit_stats":null,"previous_names":["statsclaw/statsclaw"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/statsclaw/statsclaw","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statsclaw%2Fstatsclaw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statsclaw%2Fstatsclaw/t
ags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statsclaw%2Fstatsclaw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statsclaw%2Fstatsclaw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/statsclaw","download_url":"https://codeload.github.com/statsclaw/statsclaw/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/statsclaw%2Fstatsclaw/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31390695,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-04T04:26:24.776Z","status":"ssl_error","status_checked_at":"2026-04-04T04:23:34.147Z","response_time":60,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-teams","causal-inference","claude-code","econometrics","monte-carlo","paper-to-package","python-package","r-package","stata","statistics"],"created_at":"2026-04-04T07:02:54.215Z","updated_at":"2026-04-04T07:02:55.276Z","avatar_url":"https://github.com/statsclaw.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# StatsClaw\n\n**A workflow framework for statistical package development.**\n\n**An open-source tool that helps researchers build, test, and document statistical software packages with AI agent teams.**\n\n[Website](https://statsclaw.ai) · [Roadmap](ROADMAP.md) · 
[Contributing](CONTRIBUTING.md) · [Discussions](https://github.com/statsclaw/statsclaw/discussions)\n\n---\n\n## What is StatsClaw?\n\nStatsClaw is a framework for [Claude Code](https://claude.ai/code) that uses **AI agent teams** to assist with statistical package development. You describe what you need — a bug fix, a new feature, a cross-language translation — and StatsClaw coordinates multiple AI agents to help you build, test, and document the result. It works best when a domain expert stays in the loop to guide decisions.\n\n---\n\n## How It Works\n\nStatsClaw orchestrates a team of **9 specialized AI agents**, each operating under strict information isolation:\n\n| Agent | Role |\n|:------|:-----|\n| **Leader** | Orchestrates the workflow, dispatches agents, enforces isolation |\n| **Planner** | Reads your paper/formulas, executes deep comprehension protocol, produces specifications |\n| **Builder** | Writes source code from `spec.md` (never sees the test spec) |\n| **Tester** | Validates independently from `test-spec.md` (never sees the code spec) |\n| **Simulator** | Runs Monte Carlo studies from `sim-spec.md` (never sees either spec) |\n| **Scriber** | Documents architecture, generates tutorials, maintains audit trail |\n| **Distiller** | Extracts reusable knowledge for the shared brain (brain mode only) |\n| **Reviewer** | Cross-checks all pipelines, audits tolerance integrity, issues ship/no-ship verdict |\n| **Shipper** | Commits, pushes, opens PRs, handles package distribution |\n\nThe **code**, **test**, and **simulation** pipelines are fully isolated — they never see each other's specs. If all pipelines converge independently, confidence in correctness is high. 
This is **adversarial verification by design**.\n\n---\n\n## Multi-Pipeline Architecture\n\n```\n                      planner (bridge)\n                     /    |          \\\n          spec.md   / test-spec.md    \\  sim-spec.md\n                   /      |            \\\n            builder ─ ─(parallel)─ ─ simulator\n       (code pipeline)    |    (simulation pipeline)\n                   \\      |            /\n      implementation.md   |   simulation.md\n                    \\     |          /\n                     \\    v         /\n                       tester           \u003c-- sequential, after merge-back\n                    (test pipeline)\n                         |\n                      audit.md\n                         |\n                    scriber (recording)\n                         |\n                    distiller (brain mode only)\n                         |\n                    reviewer (convergence)\n                          |\n                        shipper\n```\n\n**Key properties:**\n\n- **Planner is always mandatory** — it bridges all pipelines\n- **Builder handles code, scriber handles docs, simulator handles Monte Carlo studies** — for docs-only requests, scriber replaces builder as implementer\n- **Builder and simulator run in parallel** (simulation workflows), then **tester validates the merged result** — each pipeline has its own isolated spec\n- **Pipeline isolation is enforced** — each pipeline never sees another's spec\n- **Adversarial verification** — if all pipelines converge independently, confidence is high\n\n---\n\n## Supported Languages\n\n| R | Python | Stata | TypeScript | Go | Rust | C | C++ |\n|:-:|:------:|:-----:|:----------:|:--:|:----:|:-:|:---:|\n\nMore languages coming — [Julia is next](https://github.com/statsclaw/statsclaw/issues/3)! Want another? [Let us know](https://github.com/statsclaw/statsclaw/issues/new?template=feature-request.yml).\n\n---\n\n## Quick Start\n\n### Prerequisites\n\n1. 
**Claude Code** — [Install Claude Code](https://claude.ai/code)\n2. **GitHub access** — Push access to your target repository\n3. **Workspace repo** — A GitHub repo for storing workflow artifacts (auto-created if needed)\n\n### Your First Task\n\nTell StatsClaw what you want. It auto-detects the language, selects the right workflow, and starts working:\n\n```\nwork on https://github.com/your-org/your-package resolve the issues\n```\n\nStatsClaw asks clarifying questions when it encounters ambiguity; your domain expertise guides the process. Results vary with task complexity; expect to iterate.\n\n---\n\n## Workflow\n\n```text\nCode:            leader → planner → builder → tester → scriber → [distiller]? → reviewer → shipper?\nDocs-only:       leader → planner → scriber → reviewer → shipper?\nSimulation+Code: leader → planner → [builder ∥ simulator] → tester → scriber → [distiller]? → reviewer → shipper?\nSimulation-only: leader → planner → simulator → tester → scriber → [distiller]? → reviewer → shipper?\n```\n\nStates: `CREDENTIALS_VERIFIED → NEW → PLANNED → SPEC_READY → PIPELINES_COMPLETE → DOCUMENTED → [KNOWLEDGE_EXTRACTED]? 
→ REVIEW_PASSED → READY_TO_SHIP → DONE`\n\nSignals: `HOLD` (ambiguous, ask user), `BLOCK` (validation failed), `STOP` (unsafe to ship)\n\n---\n\n## What Can StatsClaw Help With?\n\n| Task | How it helps | Limitations |\n|:-----|:-------------|:------------|\n| **Implementing methods** | Assists with translating specs into code | Requires researcher to validate mathematical correctness |\n| **Cross-language translation** | Handles R/Python idiom differences | May miss subtle numerical edge cases without careful review |\n| **Testing \u0026 validation** | Independent test pipeline catches bugs that conventional tests miss | Empirical verification, not formal proofs |\n| **Monte Carlo studies** | Automates simulation harness and reporting | Researcher must design meaningful DGPs and metrics |\n| **Paper-driven features** | Reads methodology papers to design new functionality | Extracts concepts, not full estimator implementations |\n| **Bug fixing** | Adversarial architecture helps find hidden bugs | Complex domain bugs still need human insight |\n| **Documentation** | Generates Quarto books and API docs | Needs researcher review for accuracy |\n\n---\n\n## Example Prompts\n\n```\n# Fix a specific issue\nfix issue #42 in my-package\n\n# Build from scratch\nbuild a Python package from this R code\n\n# Cross-language migration\nrewrite the Python backends in pure R and ship it\n\n# Simulation study\nrun a Monte Carlo study comparing these three estimators\n\n# Paper to package\nbuild an R package from this PDF\n\n# Paper-driven feature\nread Correia (2016) and add network visualization to panelView\n\n# Documentation\nupdate the documentation for v2.0\n\n# Contribute knowledge to the shared brain\n/contribute\n```\n\n---\n\n## Learn by Example\n\nWe provide examples from our own usage. Each is a real repository you can inspect and learn from. 
Your mileage may vary — these represent what worked for us with active researcher involvement.\n\n| Example | Repo | What it demonstrates |\n|:--------|:-----|:---------------------|\n| Iterative refactoring (1 to 2) | [`statsclaw/example-fect`](https://github.com/statsclaw/example-fect) | Multi-day, researcher-guided refactoring of an R package |\n| Python from R source (0 to 1) | [`statsclaw/example-R2PY`](https://github.com/statsclaw/example-R2PY) | Building a Python package from an R reference |\n| Paper to package + Monte Carlo | [`statsclaw/example-probit`](https://github.com/statsclaw/example-probit) | PDF manuscript to R/C++ package + simulation |\n| Paper-driven feature addition | [`statsclaw/example-panelView`](https://github.com/statsclaw/example-panelView) | Reading a methodology paper to design a new feature |\n\nSee the [workspace example](https://github.com/statsclaw/example-workspace) for the actual workflow artifacts produced during these examples.\n\n---\n\n## What You Install\n\n- `CLAUDE.md` — orchestration policy (the authoritative reference)\n- `agents/` — agent definitions (leader, planner, builder, tester, simulator, scriber, distiller, reviewer, shipper)\n- `skills/` — shared protocol skills (credential-setup, isolation, handoff, mailbox, issue-patrol, profile-detection, brain-sync, privacy-scrub)\n- `profiles/` — language-specific execution rules (R, Python, TypeScript, Stata, Go, Rust, C, C++)\n- `templates/` — runtime artifact templates and repo scaffolding (brain-repo, brain-seedbank-repo)\n\nAgent Teams is enabled at the project level through `.claude/settings.json`.\n\n---\n\n## Runtime Layout\n\nAll runtime state lives inside the workspace repo, organized per target repository:\n\n```text\n.repos/\n├── \u003ctarget-repo\u003e/                    # target repo checkout\n├── brain/                            # statsclaw/brain clone (brain mode only)\n├── brain-seedbank/                   # statsclaw/brain-seedbank clone (brain mode 
only)\n└── workspace/                        # workspace repo (GitHub)\n    └── \u003crepo-name\u003e/                  # per-target-repo runtime + logs\n        ├── context.md                # active project context\n        ├── CHANGELOG.md              # timeline index of all runs (pushed)\n        ├── HANDOFF.md                # active handoff (pushed)\n        ├── ref/                      # reference docs for future work (pushed)\n        ├── runs/\n        │   └── \u003crequest-id\u003e/         # per-run artifacts\n        │       ├── credentials.md    # push access verification\n        │       ├── request.md        # scope and acceptance criteria\n        │       ├── status.md         # state machine\n        │       ├── impact.md         # affected files and risk areas\n        │       ├── comprehension.md  # comprehension verification (from planner)\n        │       ├── spec.md           # code pipeline input (from planner)\n        │       ├── test-spec.md      # test pipeline input (from planner)\n        │       ├── sim-spec.md       # simulation pipeline input (from planner, workflows 11/12)\n        │       ├── implementation.md # code pipeline output (from builder)\n        │       ├── simulation.md     # simulation pipeline output (from simulator, workflows 11/12)\n        │       ├── audit.md          # test pipeline output (from tester)\n        │       ├── ARCHITECTURE.md   # from scriber (primary copy in target repo root)\n        │       ├── log-entry.md      # process record (from scriber; promoted to runs/\u003cdate\u003e-\u003cslug\u003e.md)\n        │       ├── docs.md           # documentation changes (from scriber)\n        │       ├── brain-contributions.md  # knowledge entries (from distiller, brain mode only)\n        │       ├── review.md         # convergence verdict (from reviewer)\n        │       ├── shipper.md        # ship actions (from shipper)\n        │       ├── mailbox.md        # inter-teammate communication\n        │  
     └── locks/            # write surface locks\n        ├── logs/                     # diagnostic logs\n        └── tmp/                      # transient data\n```\n\n---\n\n## Repository Layout\n\n```text\nStatsClaw/\n├── CLAUDE.md           # orchestration policy\n├── README.md\n├── agents/             # agent definitions (9 agents including distiller)\n├── skills/             # shared protocol skills (13 skills including brain-sync, privacy-scrub)\n├── profiles/           # language execution rules (8 languages)\n├── templates/          # runtime artifact templates + repo scaffolding (brain-repo, brain-seedbank-repo)\n└── .repos/             # target repo checkouts + workspace + brain repos (runtime state, git-ignored)\n```\n\n---\n\n## Workspace Repository\n\nWorkflow logs, process records, and handoff documents are NOT stored in target repos. Instead, they are synced to a user-specified **workspace repository** on GitHub (e.g., `[username]/workspace`):\n\n```text\nworkspace/\n├── fect/\n│   ├── CHANGELOG.md                # timeline index\n│   ├── HANDOFF.md                  # active handoff\n│   ├── ref/                        # reference docs for future work\n│   │   └── cv-comparison-table.md\n│   └── runs/                       # individual workflow logs\n│       ├── 2026-03-16-cv-unification.md\n│       └── 2026-03-17-convergence-conditioning.md\n├── panelview/\n│   ├── CHANGELOG.md\n│   ├── HANDOFF.md\n│   ├── ref/\n│   └── runs/\n│       └── 2026-03-17-add-feature.md\n└── README.md\n```\n\nThis keeps target repos clean (code + essential docs only) while preserving full traceability in one place.\n\n---\n\n## Shared Brain\n\nStatsClaw has a shared knowledge system where techniques discovered during workflows — mathematical methods, coding patterns, validation strategies, simulation designs — are extracted, privacy-scrubbed, and contributed to a collective knowledge base. 
When you enable Brain mode, your agents get smarter by reading knowledge contributed by all users.\n\n**How it works:**\n\n1. **Read** — Your agents automatically access relevant knowledge entries from [`statsclaw/brain`](https://github.com/statsclaw/brain)\n2. **Contribute** — After noteworthy workflows, the distiller agent extracts reusable knowledge. You review everything and approve or decline — nothing is shared without your explicit consent. You can also run the built-in `/contribute` command at any time to summarize what you learned — what worked, what required manual intervention, and what domain-specific patterns emerged — and submit it as a structured report\n3. **Earn badges** — Accepted contributions earn virtual badges on the [Contributors leaderboard](https://github.com/statsclaw/brain/blob/main/CONTRIBUTORS.md)\n\n**Privacy guarantee:** All contributions are automatically scrubbed of repo names, file paths, usernames, proprietary code, and any identifying information. Only generic, reusable knowledge is shared.\n\n| Repo | Purpose |\n|:-----|:--------|\n| [`statsclaw/brain`](https://github.com/statsclaw/brain) | Curated knowledge — agents read from here |\n| [`statsclaw/brain-seedbank`](https://github.com/statsclaw/brain-seedbank) | Contribution staging — users submit PRs here |\n\nBrain mode is optional — you choose at session start. 
See [Brain System Documentation](.github/BRAIN.md) for full details.\n\n---\n\n## Design Principles\n\n- **Credentials first, work second.** Verify push access before creating a run.\n- **Team Leader dispatches, never does.** Leader plans and coordinates; teammates do the work.\n- **Multi-pipeline, fully isolated.** Code, test, and simulation pipelines never see each other's specs.\n- **Planner first, always.** Every non-trivial request starts with dual-spec production.\n- **Adversarial verification by design.** Independent convergence raises confidence in correctness; it is empirical verification, not formal proof.\n- **Hard gates, not soft advice.** State transitions have preconditions; artifacts are verified.\n- **Worktree isolation for writers.** Builder, simulator, and scriber run in isolated git worktrees.\n- **Surgical scope.** Each run modifies only what the request requires.\n- **Explicit ship actions.** Nothing is pushed without user instruction or an active patrol skill.\n- **Collective knowledge, individual consent.** Brain mode lets agents learn from all users, but nothing is shared without explicit per-workflow approval.\n\n---\n\n## Citation\n\nIf you use StatsClaw in your research or software development, please cite our paper:\n\n\u003e Qin, Tianzhu and Yiqing Xu. 2026. \"[StatsClaw: An AI-Collaborative Workflow for Statistical Software Development](https://bit.ly/statsclaw).\"\n\nBibTeX:\n\n```bibtex\n@misc{qinxu2026statsclaw,\n  title={StatsClaw: An AI-Collaborative Workflow for Statistical Software Development},\n  author={Qin, Tianzhu and Xu, Yiqing},\n  year={2026},\n  howpublished = {Mimeo, Stanford University},\n  url={https://bit.ly/statsclaw}\n}\n```\n\n---\n\n## License\n\nStatsClaw is released under the [MIT License](LICENSE).\n\n---\n\n## Get Involved\n\nWe are building StatsClaw in the open. 
Everyone is welcome.\n\n- **Share an idea** — [Discussions](https://github.com/statsclaw/statsclaw/discussions/categories/ideas)\n- **Report a bug** — [Bug report](https://github.com/statsclaw/statsclaw/issues/new?template=bug-report.yml)\n- **Contribute code** — [Contributing guide](CONTRIBUTING.md)\n- **Contribute knowledge** — Enable Brain mode and your discoveries help everyone. [Learn more](.github/BRAIN.md)\n- **See what is planned** — [Roadmap](ROADMAP.md)\n\n---\n\n**[statsclaw.ai](https://statsclaw.ai)**\n\n*A tool for statisticians and econometricians. Works best with an expert in the loop.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatsclaw%2Fstatsclaw","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstatsclaw%2Fstatsclaw","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstatsclaw%2Fstatsclaw/lists"}