{"id":48963844,"url":"https://github.com/paperfoot/autoresearch-cli","last_synced_at":"2026-04-18T03:04:37.760Z","repository":{"id":346258291,"uuid":"1189067229","full_name":"paperfoot/autoresearch-cli","owner":"paperfoot","description":"Autonomous AI experiment loop CLI -- run research overnight with any coding agent","archived":false,"fork":false,"pushed_at":"2026-04-17T20:02:59.000Z","size":3851,"stargazers_count":15,"open_issues_count":0,"forks_count":2,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-04-17T22:29:22.484Z","etag":null,"topics":["ai-coding-agent","ai-experiments","ai-tools","automation","autonomous-research","autoresearch","claude-code","cli","codex","cursor","developer-tools","experiment-loop","experiment-tracking","karpathy","machine-learning","research-automation","rust-cli","windsurf"],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/paperfoot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-23T00:03:13.000Z","updated_at":"2026-04-17T20:03:03.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/paperfoot/autoresearch-cli","commit_stats":null,"previous_names":["199-biotechnologies/autoresearch-cli","paperfoot/autoresearch-cli"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/paperfoot/autoresearch-cli","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperfoot%2Fautoresearch-cli","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperfoot%2Fautoresearch-cli/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperfoot%2Fautoresearch-cli/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperfoot%2Fautoresearch-cli/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/paperfoot","download_url":"https://codeload.github.com/paperfoot/autoresearch-cli/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/paperfoot%2Fautoresearch-cli/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31954738,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-18T00:39:45.007Z","status":"online","status_checked_at":"2026-04-18T02:00:07.018Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-coding-agent","ai-experiments","ai-tools","automation","autonomous-research","autoresearch","claude-code","cli","codex","cursor","developer-tools","experiment-loop","experiment-tracking","karpathy","machine-learning","research-automation","rust-cli","windsurf"],"created_at":"2026-04-18T03:04:31.784Z","updated_at":"2026-04-18T03:04:37.751Z","avatar_url":"https://github.com/paperfoot.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# Autoresearch CLI\n\n**Run autonomous AI experiments while you sleep. Wake up to results.**\n\n\u003cbr /\u003e\n\n[![Star this repo](https://img.shields.io/github/stars/paperfoot/autoresearch-cli?style=for-the-badge\u0026logo=github\u0026label=%E2%AD%90%20Star%20this%20repo\u0026color=yellow)](https://github.com/paperfoot/autoresearch-cli/stargazers)\n\u0026nbsp;\u0026nbsp;\n[![Follow @longevityboris](https://img.shields.io/badge/Follow_%40longevityboris-000000?style=for-the-badge\u0026logo=x\u0026logoColor=white)](https://x.com/longevityboris)\n\n\u003cbr /\u003e\n\n[![Crates.io](https://img.shields.io/crates/v/autoresearch?style=for-the-badge\u0026logo=rust\u0026logoColor=white\u0026label=crates.io)](https://crates.io/crates/autoresearch)\n[![Downloads](https://img.shields.io/crates/d/autoresearch?style=for-the-badge\u0026logo=rust\u0026logoColor=white)](https://crates.io/crates/autoresearch)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue?style=for-the-badge)](LICENSE)\n[![CI](https://img.shields.io/github/actions/workflow/status/paperfoot/autoresearch-cli/release.yml?style=for-the-badge\u0026logo=github\u0026label=CI)](https://github.com/paperfoot/autoresearch-cli/actions)\n\n---\n\nA single Rust binary that turns any AI coding agent into an autonomous research machine. Define one file to modify, one metric to optimize, and one eval command. Your agent handles the rest -- running experiments, tracking results, keeping winners, reverting losers. You sleep. It works.\n\n[Install](#install) | [How It Works](#how-it-works) | [Features](#features) | [Contributing](#contributing)\n\n\u003c/div\u003e\n\n## Why Autonomous Research Matters\n\nKarpathy's [autoresearch](https://github.com/karpathy/autoresearch) ran 126 ML experiments overnight on a single GPU. Since then, people have applied the same pattern to [chess engines](https://x.com/MindCanvasx/status/2035817965614940472) (expert to grandmaster), [Bitcoin modeling](https://x.com/xmal/status/2035735100516634805) (halved prediction errors), [Sudoku solvers](https://x.com/VihariKanukollu/status/2035411680050778435) (beat the paper in 5 minutes), and [running 400B models on laptops](https://x.com/DAIEvolutionHub/status/2034899294709588419).\n\nThe pattern is simple: **one file to modify, one metric to optimize, one loop that never stops.**\n\nBut every project reimplements this from scratch -- copying `program.md`, figuring out the eval, hand-writing JSONL logs. This CLI makes the autonomous experiment loop a `cargo install` away. It works with any AI coding agent that can shell out to a command.\n\n## Install\n\n**One-liner (macOS / Linux):**\n```bash\ncurl -LsSf https://github.com/paperfoot/autoresearch-cli/releases/latest/download/autoresearch-installer.sh | sh\n```\n\n**Homebrew:**\n```bash\nbrew tap 199-biotechnologies/tap\nbrew install autoresearch\n```\n\n**Cargo (from [crates.io](https://crates.io/crates/autoresearch)):**\n```bash\ncargo install autoresearch\n```\n\nBinary size: ~1.1 MB. Startup: ~2 ms. Memory: ~3 MB. No Python, no Node, no Docker.\n\n## Quick Start\n\n```bash\n# 1. Install the research skill into your AI agent\nautoresearch install claude-code    # or: all, codex, cursor, windsurf, opencode\n\n# 2. Set up your project for experiment tracking\nautoresearch init \\\n  --target-file train.py \\\n  --eval-command \"python train.py\" \\\n  --metric-name val_bpb \\\n  --metric-direction lower \\\n  --time-budget 5m\n\n# 3. Validate everything is ready\nautoresearch doctor\n\n# 4. Tell your agent to start\n/autoresearch\n\n# 5. Go to sleep. Wake up to results.\nautoresearch status\nautoresearch best\nautoresearch report\n```\n\n## How It Works\n\n```\nYou write program.md          Your AI agent runs the loop\n     ┌──────────┐          ┌──────────────────────┐\n     │  Ideas   │          │  1. Read program.md   │\n     │  Papers  │ ───────► │  2. Modify target     │\n     │  Goals   │          │  3. Commit            │\n     └──────────┘          │  4. Eval (timeout)    │\n                           │  5. Keep or revert    │\nautoresearch.toml           │  6. autoresearch      │\n     ┌──────────┐          │     record --metric   │\n     │ target   │ ───────► │  7. Repeat forever    │\n     │ eval_cmd │          └──────────────────────┘\n     │ metric   │                    │\n     └──────────┘                    ▼\n                           .autoresearch/\n                           experiments.jsonl\n                           ┌─────────────────────┐\n                           │ run 0: 1.050 baseline│\n                           │ run 1: 1.042 kept    │\n                           │ run 2: 1.055 discard │\n                           │ run 3: 1.031 kept    │\n                           └─────────────────────┘\n```\n\nThe CLI handles everything except the loop itself:\n- **Scaffolding** -- `init` creates the config and research prompt\n- **Validation** -- `doctor` runs 14 pre-flight checks before you start\n- **State management** -- `record` handles JSONL atomically (agents never hand-write JSON)\n- **Experiment tracking** -- `log`, `best`, `diff`, `status` parse results from git + JSONL\n- **Reporting** -- `report` generates a shareable markdown summary\n\nYour agent handles the creative work -- deciding what to try, implementing changes, interpreting results.\n\n## Features\n\n### AI Coding Agent Support\n\nWorks with every major AI coding agent out of the box:\n\n| Agent | Install command | Slash command |\n|-------|----------------|:---:|\n| [Claude Code](https://github.com/anthropics/claude-code) | `autoresearch install claude-code` | `/autoresearch` |\n| [Codex CLI](https://github.com/openai/codex) | `autoresearch install codex` | `/autoresearch` |\n| [OpenCode](https://github.com/opencode-ai/opencode) | `autoresearch install opencode` | `/autoresearch` |\n| [Cursor](https://cursor.com) | `autoresearch install cursor` | auto-discovered |\n| [Windsurf](https://windsurf.com) | `autoresearch install windsurf` | auto-discovered |\n| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `autoresearch install gemini` | auto-discovered |\n| [GitHub Copilot](https://github.com/features/copilot) | `autoresearch install copilot` | auto-discovered |\n| Augment / Goose / Roo | `autoresearch install all` | auto-discovered |\n\nNo skill needed? Run `autoresearch guide` for the full methodology. The CLI coaches agents through hints in every response.\n\n### Full Command Reference\n\n| Command | What it does | Agent-facing |\n|---------|-------------|:---:|\n| `install \u003ctarget\u003e` | Install the autoresearch skill into an AI agent | |\n| `init` | Scaffold project (`autoresearch.toml` + `program.md`) | |\n| `doctor` | 14-point pre-flight check | * |\n| `record` | Record experiment result (JSONL, run numbering, deltas) | * |\n| `log` | Show experiment history with metrics and status | * |\n| `best` | Show best experiment + diff from baseline | * |\n| `diff \u003ca\u003e \u003cb\u003e` | Compare two experiments side-by-side | * |\n| `status` | Project state, best metric, loop progress | * |\n| `export` | Export as CSV, JSON, or JSONL | |\n| `fork \u003cnames...\u003e` | Branch into parallel experiment directions | |\n| `review` | Generate cross-model review prompt with pattern detection | * |\n| `watch` | Live terminal dashboard for real-time monitoring | |\n| `merge-best` | Compare fork branches and pick the winner | * |\n| `report` | Generate full markdown research report | |\n| `agent-info` | Machine-readable capability metadata | * |\n\nAll commands support `--json` for structured output. Auto-enabled when piped.\n\nAgent-facing commands (`*`) return consistent JSON envelopes with semantic exit codes (0=success, 1=runtime error, 2=config error) and actionable `suggestion` fields on errors.\n\n### Multi-Direction Experiment Exploration\n\nFork your experiments into parallel branches and let multiple agents explore different directions at the same time:\n\n```bash\n# Create parallel experiment branches\nautoresearch fork try-transformers try-convolutions try-linear\n\n# Start a separate agent on each\ngit checkout autoresearch-fork-try-transformers \u0026\u0026 /autoresearch\n\n# Compare results and pick the winner\nautoresearch merge-best\n```\n\n### Cross-Model Review for AI Experiments\n\nAfter running experiments, get a second model to review your progress:\n\n```bash\nautoresearch review --json | jq -r '.data.review_prompt'\n```\n\nGenerates a structured review prompt with session summaries, pattern detection (stuck detection, repeated failure themes), and suggested next directions. Pipe to Codex or Gemini for cross-model insights that break local minima.\n\n### Live Experiment Dashboard\n\nMonitor experiments in real time from another terminal:\n\n```bash\nautoresearch watch\n```\n\nShows a live-updating dashboard with sparkline progress, kept/discarded rates, best metric, and new experiment notifications. Refreshes every 2 seconds (configurable with `-i`).\n\n## Configuration\n\n`autoresearch.toml`:\n```toml\ntarget_file = \"train.py\"           # The single file the agent may modify\neval_command = \"python train.py\"   # Must print the metric to stdout\nmetric_name = \"val_bpb\"            # What the metric is called\nmetric_direction = \"lower\"         # \"lower\" or \"higher\"\ntime_budget = \"5m\"                 # Max time per experiment\nbranch = \"autoresearch\"            # Git branch for experiments\n```\n\n`program.md` is free-form -- tell the agent what to explore, link papers, set constraints. The agent reads it between experiments for inspiration.\n\n## What People Are Building\n\nThe autonomous research pattern works on **anything with a measurable metric**:\n\n| Domain | Metric | Result |\n|--------|--------|--------|\n| ML training | val_bpb | 126 experiments overnight, 11% improvement |\n| Chess engine | Elo rating | Expert to Grandmaster (2718 Elo) |\n| Bitcoin modeling | Prediction error | Halved error in one morning |\n| Sudoku solver | Accuracy | Beat published paper (87% to 92.2%) |\n| API latency | p99 ms | 37% reduction via KD-tree optimization |\n| Trading bots | Score | 43,000% improvement via evolutionary loop |\n\n## Contributing\n\nContributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n## Inspired By\n\n- [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) -- the original pattern\n- [uditgoenka/autoresearch](https://github.com/uditgoenka/autoresearch) -- generalized Claude Code skill\n- [ARIS](https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep) -- cross-model research pipeline\n- [ResearcherSkill](https://github.com/krzysztofdudek/ResearcherSkill) -- domain-agnostic research agent\n- [SkyPilot scaling](https://blog.skypilot.co/scaling-autoresearch/) -- multi-GPU parallel autoresearch\n\n## License\n\nMIT -- see [LICENSE](LICENSE).\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\nBuilt by [Boris Djordjevic](https://github.com/longevityboris) at [199 Biotechnologies](https://github.com/199-biotechnologies) | [Paperfoot AI](https://paperfoot.ai)\n\n\u003cbr /\u003e\n\n**If this is useful to you:**\n\n[![Star this repo](https://img.shields.io/github/stars/paperfoot/autoresearch-cli?style=for-the-badge\u0026logo=github\u0026label=%E2%AD%90%20Star%20this%20repo\u0026color=yellow)](https://github.com/paperfoot/autoresearch-cli/stargazers)\n\u0026nbsp;\u0026nbsp;\n[![Follow @longevityboris](https://img.shields.io/badge/Follow_%40longevityboris-000000?style=for-the-badge\u0026logo=x\u0026logoColor=white)](https://x.com/longevityboris)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaperfoot%2Fautoresearch-cli","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpaperfoot%2Fautoresearch-cli","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpaperfoot%2Fautoresearch-cli/lists"}