{"id":49714403,"url":"https://github.com/sebastianelvis/reaper","last_synced_at":"2026-05-08T19:05:00.162Z","repository":{"id":350625411,"uuid":"1207527984","full_name":"SebastianElvis/reaper","owner":"SebastianElvis","description":"AI-native scientific research pipeline — Claude Code plugin that takes a paper + research goal and autonomously runs multi-step academic research","archived":false,"fork":false,"pushed_at":"2026-05-02T04:35:15.000Z","size":385,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-02T05:06:41.636Z","etag":null,"topics":["ai-research","arxiv","claude-code","claude-code-plugin","literature-review","research","scientific-research"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SebastianElvis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-11T03:46:28.000Z","updated_at":"2026-05-02T04:35:20.000Z","dependencies_parsed_at":"2026-05-02T05:02:53.480Z","dependency_job_id":null,"html_url":"https://github.com/SebastianElvis/reaper","commit_stats":null,"previous_names":["sebastianelvis/reaper"],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/SebastianElvis/reaper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SebastianElvis%2Freaper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SebastianElvis%2Freaper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SebastianElvis%2Freaper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SebastianElvis%2Freaper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SebastianElvis","download_url":"https://codeload.github.com/SebastianElvis/reaper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SebastianElvis%2Freaper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32793502,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-08T08:22:46.396Z","status":"ssl_error","status_checked_at":"2026-05-08T08:22:45.650Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-research","arxiv","claude-code","claude-code-plugin","literature-review","research","scientific-research"],"created_at":"2026-05-08T19:04:53.265Z","updated_at":"2026-05-08T19:05:00.151Z","avatar_url":"https://github.com/SebastianElvis.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reaper\n\n**Reaper (REAd PapER)** — an AI-native scientific research pipeline. A composable set of [AI agent skills](https://github.com/vercel-labs/skills) that takes a research goal — optionally with a research paper — and autonomously conducts rigorous, multi-step academic research. Runs on any agent that supports the `SKILL.md` convention (Cursor, Codex CLI, Cline, Continue, Gemini CLI, Copilot, Windsurf, Claude Code, and 40+ more).\n\n[![Skills](https://img.shields.io/badge/skills-SKILL.md-brightgreen)](https://github.com/vercel-labs/skills)\n\n## What Reaper Does\n\nGive Reaper a research question — with or without a PDF. It reads the paper (if provided), searches for related work, formalizes hypotheses, investigates them in parallel, critiques its own findings, and delivers a structured research report — all without manual prompting between steps.\n\n```\n# Without a paper — pure goal-driven research\n/reaper \"explore the feasibility of post-quantum threshold signatures\"\n\n# With a paper\n/reaper \"determine if the security proof in Section 4 holds under asynchrony\" path/to/paper.pdf\n```\n\nHow you invoke a skill depends on the host agent. The `/\u003cskill\u003e` form above is the canonical display convention used throughout these docs — it works directly on slash-command hosts (e.g. Claude Code), and on auto-discovery hosts (Cursor, Codex CLI, Cline, Continue, Gemini CLI, Copilot, Windsurf, …) you simply ask the agent to \"run the `/reaper` skill on …\" by its bare name.\n\n**Key capabilities:**\n\n- **Autonomous multi-stage pipeline** — goal clarification, paper analysis, literature review, hypothesis formalization, parallel investigation, critique, and synthesis all chain automatically\n- **Parallel investigation with keep-or-discard discipline** — multiple hypotheses are investigated concurrently; only genuine progress advances the working state, while dead ends stay logged\n- **Built-in academic search** — paper search, PDF download, citation graph tracing, and venue resolution across arXiv, IACR ePrint, Semantic Scholar, DBLP, and OpenAlex\n- **Domain-agnostic design** — ships with cryptography and distributed systems references, but swap the reference files to adapt to any research domain\n- **Multi-model AI consultation** — optionally consult Codex, Gemini, DeepSeek, or local models for a second opinion at every pipeline stage\n- **Composable skills** — each pipeline stage is an independent skill you can run standalone\n- **Host-agnostic** — distributed as plain `SKILL.md` folders that work across 45+ AI coding agents\n\n## How It Works\n\nReaper executes a multi-stage pipeline where investigation runs in parallel batches and critique provides feedback from multiple sources:\n\n```\n                     ┌── /analyze-paper (if paper) ──┐\n/clarify-goal ─────\u003e │                               ├─\u003e /formalize-problem\n                     └── /review-literature ─────────┘          │\n                           │ (parallel)                         v\n                           │                    ┌──────────\u003e /brainstorm\n                           └── calls            │                │\n                               /analyze-paper   │                │\n                               per downloaded   │    ┌─ /investigate ─────────────────┐\n                               paper            │    │  plan batch                    │\n                                                │    │    ├──\u003e agent H1 ─┐            │\n                                                │    │    ├──\u003e agent H2 ─┼──\u003e merge   │\n                                                │    │    └──\u003e agent H3 ─┘     │      │\n                                                │    │          next batch or done    │\n                                                │    └────────────────────────────────┘\n                                                │                    │\n                                                │    ┌─ /critique ────────────────────┐\n                                                │    │  --self  --codex  \"feedback\"   │\n                                                │    └────┬───────────────────┬───────┘\n                                                │         │                   │\n                                                │   deepen/explore    rewrite/done ──\u003e /synthesize ──\u003e report.md\n                                                └─────────┘\n```\n\n## Skills\n\nEach skill can be used independently or composed by the orchestrator. Invoke by skill `name` using your host's native skill-loading mechanism.\n\n| Skill | What it does |\n|-------|-------------|\n| `/reaper` | Full pipeline: clarify → analyze → literature → formalize → brainstorm → investigate ↔ critique → synthesize |\n| `/clarify-goal` | Ask targeted clarifying questions to sharpen a vague research goal |\n| `/analyze-paper` | Extract structured information from a research paper |\n| `/review-literature` | Search and summarize related academic work |\n| `/formalize-problem` | Produce precise, testable hypotheses from a research question |\n| `/brainstorm` | Generate, prioritize, and refine research ideas based on current state |\n| `/investigate` | Run investigation cycles with keep-or-discard discipline |\n| `/critique` | Provide critique via human feedback, Codex consultation, or self-review (can trigger more investigation) |\n| `/synthesize` | Generate a structured research report from investigation results |\n| `/search-paper` | Find papers, download PDFs, trace citation graphs, and resolve publication venues across arXiv, IACR ePrint, Semantic Scholar, DBLP, and OpenAlex |\n\n\u003e The `/\u003cskill\u003e` form is the canonical display convention used throughout these docs. Slash-command hosts (Claude Code) invoke them directly that way (e.g. `/clarify-goal`). Auto-discovery hosts (Cursor, Codex CLI, Cline, Continue, Gemini CLI, Copilot, Windsurf, …) invoke them by the bare skill name — drop the leading `/` when asking the agent to run a skill.\n\n## Installation\n\n### Prerequisites\n\nThe search skills require Python packages:\n\n```bash\npip install arxiv requests beautifulsoup4\n```\n\n\u003e **Note**: `npx skills` only copies `SKILL.md` files, Python scripts, and reference files into your agent's skills folder. It does **not** install Python dependencies, register MCP servers, or create the `reaper-workspace/` directory. Install Python deps yourself with the command above; register the Codex MCP server separately if you want `--codex` (see [Optional: Multi-model AI consultation](#optional-multi-model-ai-consultation) below); the workspace directory is created automatically the first time the pipeline runs.\n\n### Install via `npx skills` (recommended — works on 45+ agents)\n\nReaper is distributed as standard `SKILL.md` folders. The cross-agent installer [`vercel-labs/skills`](https://github.com/vercel-labs/skills) shallow-clones this repository and copies all 10 skill directories — including Python scripts and reference files — into your agent's conventional skills folder.\n\n```bash\n# Latest from the default branch\nnpx skills add SebastianElvis/reaper\n\n# Pinned to a specific release (recommended for reproducibility)\nnpx skills add SebastianElvis/reaper#v0.4.0\n\n# Install into a specific agent (defaults to all detected)\nnpx skills add SebastianElvis/reaper --agent cursor\n```\n\nSupported targets include Cursor, OpenAI Codex CLI, Cline, Continue, Gemini CLI, GitHub Copilot, Windsurf, OpenCode, Warp, Goose, Replit, Claude Code, and a `universal` target at `.agents/skills/`. See `npx skills list-agents` for the full list.\n\n\u003e **Reminder**: `npx skills add` copies files only. Python deps and MCP server registration are separate steps — see Prerequisites above and [Optional: Multi-model AI consultation](#optional-multi-model-ai-consultation) below.\n\n### Install on Claude Code (as a plugin)\n\nClaude Code can also consume Reaper via its native plugin marketplace mechanism, which bundles the same skills with slash-command routing:\n\n```\n/plugin marketplace add SebastianElvis/reaper\n/plugin install reaper@SebastianElvis-reaper\n```\n\nOr clone and add as a local marketplace:\n\n```bash\ngit clone https://github.com/SebastianElvis/reaper.git\n/plugin marketplace add ./reaper\n/plugin install reaper@reaper\n```\n\nSee the [Claude Code plugin docs](https://code.claude.com/docs/en/discover-plugins) for more details.\n\n### Invocation across hosts\n\n- **Slash-command hosts** (Claude Code): `/reaper \"\u003cgoal\u003e\"`, `/analyze-paper \u003cpath\u003e`, etc. Each skill is available as a top-level slash command.\n- **Auto-discovery hosts** (Cursor, Codex CLI, Cline, Continue, Gemini CLI, Copilot, Windsurf, …): the agent loads `SKILL.md` files from its skills folder and invokes them by name when the task matches the skill's `description`. Ask the agent to run the skill, e.g. *\"use the reaper skill to research X\"*.\n- **Manual invocation**: any host can be pointed at a specific `SKILL.md` if its native discovery doesn't pick it up.\n\nA few skill features are host-specific:\n\n- The `--codex` flag enables external-model consultation via MCP. It currently requires a host with MCP support (Claude Code, OpenCode, etc.) and silently falls back to self-review elsewhere.\n- Frontmatter keys `user-invocable`, `argument-hint`, hooks, and `context: fork` are Claude-Code-specific. They are preserved in the SKILL.md files but no-op on other hosts.\n\n### Optional: Multi-model AI consultation\n\nPass `--codex` to enable pipeline-wide AI consultation — every skill gains a checkpoint where it can consult an external model for a second opinion. The orchestrator controls when consultations happen and routes to the best-suited model (see `skills/reaper/references/codex-consultation.md` for the protocol).\n\n**Supported backends:**\n\n| Model | Setup | Strength |\n|-------|-------|----------|\n| OpenAI Codex/o3 | Register `codex-mcp-server` in your host's MCP config | Adversarial review, stress-testing arguments |\n| Google Gemini | *(coming soon)* | Long-context review across full paper corpora |\n| DeepSeek R1 | *(coming soon)* | Proof checking, formal reasoning |\n| Local models | *(coming soon — via ollama)* | Offline/private use, cost control |\n\nExample registration on Claude Code:\n\n```bash\nclaude mcp add codex-cli -- npx -y codex-mcp-server\n```\n\nOther MCP-capable hosts use their own equivalent registration. If no model backends are configured, AI consultation is silently skipped and the pipeline continues with self-review only.\n\n## Workspace\n\nWhen Reaper runs, it creates a `reaper-workspace/` directory:\n\n```\nreaper-workspace/\n├── notes/                          # Evolving — edited inline to reflect latest state\n│   ├── clarified-goal.md           # Refined goal, scope, assumptions, Q\u0026A\n│   ├── paper-summary.md            # Structured paper extraction\n│   ├── literature.md               # Related work survey\n│   ├── problem-statement.md        # Formalized problem (model, properties, metrics)\n│   ├── ideas.md                    # Research ideas/hypotheses (edited inline on revisit)\n│   ├── current-understanding.md    # \"Branch tip\" — advances only on keep\n│   └── results.md                  # One row per hypothesis, updated inline on revisit\n├── investigations/                 # Evolving — reuse directory on revisit, edit inline\n│   └── NNN-\u003cname\u003e/                 # One directory per hypothesis\n│       ├── analysis.md             # Reasoning, attempts, dead ends, insights\n│       └── proof.md                # Formal proofs (theorems, lemmas, corollaries)\n├── feedbacks/                      # Append-only — one file per event, never modified\n│   ├── round-N.md                  # Human feedback classified by type\n│   └── codex-consultation-N.md     # Codex critique (alternates devil's advocate / inspiration)\n├── logs/                           # Append-only — one file per cycle, never modified\n│   └── cycle-NNN-\u003cslug\u003e.md         # One log per investigation cycle (snapshot at cycle end)\n└── report.md                       # Final synthesized output\n```\n\nThe workspace contract is host-agnostic — any agent that can read and write files in the working directory produces the same workspace structure.\n\n## Evaluation\n\nSkills ship with a layered evaluation system following Anthropic's [*Demystifying Evals for AI Agents*](https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents) methodology. The judge is the local `claude` CLI — no API key, just your existing subscription.\n\n| Layer | Grader | Cadence | Scope |\n|---|---|---|---|\n| L1 Structural | Code (`evals/graders/`) | Every PR (CI) | Required sections, lengths, broken refs, keep-or-discard cycle invariant |\n| L2 Skill rubric | `claude -p` with structured-output JSON schema (`evals/judge/`) | Locally / nightly | Per-skill quality dimensions: groundedness, specificity, completeness |\n| L3 End-to-end | Both | Pre-release | Full `/reaper` pipeline against canonical cases |\n\n```bash\n# L1 only (no LLM) — same thing CI runs\npython3 -m evals.run_evals --layer structural\n\n# L1 + L2 (uses your local claude CLI)\npython3 -m evals.run_evals --layer all --skill analyze-paper\n```\n\nEach fixture pairs a gold-standard reference with planted negatives — one targeting L1 (drops a required section) and one targeting L2 (fabricated theorem statements, generic content) — so a permissive grader fails CI as visibly as a missed regression. See [`evals/README.md`](evals/README.md) for the full design and how to add a fixture.\n\n## Methodology\n\nReaper's research loop follows six principles:\n\n1. **Separation of Concerns** — AI writes to workspace, human provides the goal, skill definitions are fixed\n2. **Fixed Evaluation Signal** — Clarify the goal, establish baseline via paper analysis and literature review, then formalize into precise hypotheses with trust assumptions, security properties, and impossibility screening\n3. **Structured Results Log** — Every hypothesis gets one row in `notes/results.md` with action, outcome, confidence, and keep/discard status; revisits update the row inline rather than appending duplicates\n4. **Keep-or-Discard Loop** — `current-understanding.md` only advances on genuine progress; dead ends stay logged but don't pollute working state\n5. **Never Stop** — Run all cycles without asking permission to continue; if stuck, re-read the paper, question assumptions, combine discarded results, search for more related work, or try a radically different approach\n6. **Clarity and Simplicity** — One \"ping\" per finding, refutable claims, fewer assumptions = better; write early to crystallize understanding, not just to report it\n\nSee [`dev/ROADMAP.md`](dev/ROADMAP.md) for the full methodology and development roadmap.\n\n## Development Status\n\nSee [`dev/ROADMAP.md`](dev/ROADMAP.md) for the full roadmap.\n\n- **Horizon 1 (The Pipeline)**: Core skills, orchestrator, and layered eval system (L1 structural graders + L2 Claude-CLI judges with rubrics, calibrated against planted negatives) — *complete; LaTeX report output and broader rubric coverage across all skills planned*\n- **Horizon 2 (The Library)**: arXiv/ePrint search via Python scripts + citation graph + venue resolution (Semantic Scholar / DBLP / OpenAlex) — *complete*\n- **Horizon 3 (The Committee)**: Multi-model critique via the `/critique` skill's `--codex` mode — *Codex complete, Gemini/DeepSeek/local planned*\n- **Horizon 3.5 (The Polyglot)**: Cross-agent distribution via `npx skills` and host-agnostic skill prose — *complete; per-host orchestration polish ongoing*\n- **Horizon 4 (The Academy)**: Broader topic search (Scholar/DBLP), author-centric and venue-centric search — *planned*\n- **Horizon 5 (The Apprentice)**: Evidence quality taxonomy, evidence-aware critique — *planned*\n- **Horizon 6 (The Examiner)**: Proactive reformulation trigger, claim provenance, formal verification — *planned*\n\n## Acknowledgements\n\nReaper's methodology draws from the following sources:\n\n- **[karpathy/autoresearch](https://github.com/karpathy/autoresearch)** — Loop discipline: constrain the loop so the AI can iterate autonomously with a clear signal of progress. The structured results log, keep-or-discard mechanism, and never-stop policy are direct adaptations.\n- **[Richard Hamming, \"You and Your Research\"](https://d37ugbyn3rpeym.cloudfront.net/stripe-press/TAODSAE_zine_press.pdf)** — The importance filter (\"Why are you not working on the important problems?\") and the technique of inverting blockages into insights.\n- **[Zhiyun Qian, \"How to Look for Ideas in Computer Science Research\"](https://medium.com/digital-diplomacy/how-to-look-for-ideas-in-computer-science-research-7a3fa6f4696f)** — Systematic idea generation patterns (fill-in-the-blank, start-small-then-generalize, build-a-hammer) for formulating research ideas.\n- **[Simon Peyton Jones, \"How to Write a Great Research Paper\"](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/How-to-write-a-great-research-paper.pdf)** — Writing as a primary mechanism for doing research, not just reporting it. Shapes the report structure: one clear \"ping,\" explicit refutable contributions, examples before generality.\n- **[S. Keshav, \"How to Read a Paper\" (ACM SIGCOMM CCR, 2007)](http://ccr.sigcomm.org/online/files/p83-keshavA.pdf)** — The three-pass method for reading papers: first pass for the big picture, second pass to grasp content without details, third pass to virtually re-derive the work and challenge every assumption. Structures how the literature review skill reads downloaded papers at increasing depth.\n- **[Mathew Stiller-Reeve, \"How to Write a Thorough Peer Review\" (Nature, 2018)](https://www.nature.com/articles/d41586-018-06991-0)** — Three-reading review method (aims → scientific substance → presentation) and the mirror technique. Structures the per-paper notes in the literature review skill: mirror the paper's claims, classify issues as major/minor/fatal, evaluate whether conclusions answer the introduction's questions.\n- **[`vercel-labs/skills`](https://github.com/vercel-labs/skills)** — The cross-agent skills convention and CLI installer that makes Reaper portable across 45+ AI coding agents.\n\n## License\n\nThis project is open source. Licensed under the [Apache License 2.0](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsebastianelvis%2Freaper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsebastianelvis%2Freaper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsebastianelvis%2Freaper/lists"}