https://github.com/axect/magi-researchers
Three AI models, one synthesis — Claude, Gemini & Codex cross-verify each other for rigorous multi-perspective research
- Host: GitHub
- URL: https://github.com/axect/magi-researchers
- Owner: Axect
- License: MIT
- Created: 2026-02-18T11:05:28.000Z (10 days ago)
- Default Branch: main
- Last Pushed: 2026-02-23T07:46:01.000Z (5 days ago)
- Last Synced: 2026-02-23T15:56:36.622Z (5 days ago)
- Topics: ai-research, claude-code, claude-code-plugin, codex, cross-verification, gemini, mcp, multi-model, research-automation, scientific-computing
- Size: 257 KB
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
MAGI Researchers
Three AI models. One synthesis. Zero lost progress.
Multi-model research pipeline for Claude Code — Claude, Gemini, and Codex debate, cross-verify, and synthesize publication-ready artifacts.
Get Started • Why MAGI? • Features • Usage • Roadmap • Changelog
---
> *Like the MAGI system in Evangelion — three supercomputers cross-verifying each other — this plugin orchestrates Claude, Gemini, and Codex for rigorous, multi-perspective research.*
## Get Started
**Prerequisites:** [Claude Code](https://docs.anthropic.com/en/docs/claude-code) + Python 3.11+ with [uv](https://docs.astral.sh/uv/) + [Gemini CLI](https://github.com/google-gemini/gemini-cli) + [Codex CLI](https://github.com/openai/codex)
**1. Install the plugin** (inside Claude Code):
```
/plugin marketplace add Axect/magi-researchers
/plugin install magi-researchers@magi-researchers-marketplace
```
**2. Set up MCP servers** (one-time):
```bash
claude mcp add -s user gemini-cli -- npx -y gemini-mcp-tool
claude mcp add -s user codex-cli -- npx -y @cexll/codex-mcp-server
claude mcp add -s user context7 -- npx -y @upstash/context7-mcp@latest
```
**3. Run your first research:**
```
/magi-researchers:research "your research topic" --domain physics
```
MAGI generates cross-verified hypotheses, writes implementation code, renders publication-quality plots, and synthesizes a structured report — all saved to `outputs/{topic}/`.
Alternative: Local Development
```bash
git clone https://github.com/Axect/magi-researchers.git
claude --plugin-dir /path/to/magi-researchers
uv add matplotlib SciencePlots numpy
```
## Why MAGI?
Single-model research has blind spots. One model hallucinates a citation or misses a critical constraint — and nobody catches it.
| | **Single Model** | **MAGI (3 Models)** |
|:---|:---|:---|
| **Brainstorming** | One perspective | Three independent perspectives |
| **Verification** | Self-review (unreliable) | Cross-model peer review |
| **Blind spots** | Undetected | Caught by competing models |
| **Output** | Raw text | Structured report with consensus & divergence analysis |
- **Claude** — *The Scientist.* Synthesis, planning, implementation, report generation.
- **Gemini** — *The Critic.* Creative brainstorming, cross-verification, broad knowledge.
- **Codex** — *The Builder.* Feasibility analysis, code review, implementation focus.
## Features
### Research Pipeline
| Phase | What Happens | Output |
|:---|:---|:---|
| **1. Brainstorm** | Three models generate and cross-review ideas with expert personas | `brainstorm/` |
| **2. Plan** | Concrete research plan, stress-tested by a hostile reviewer | `plan/` |
| **3. Implement** | Claude writes code with live library doc lookups | `src/` |
| **4. Test & Visualize** | Collaborative test design + publication-quality plots | `tests/` + `plots/` |
| **5. Report** | Structured report with cross-verified claim-evidence integrity | `report.md` |
### Never Lose Your Work (v0.4.0)
Long research sessions crash. Context windows fill up. Networks drop. Now you can pick up right where you left off:
- **`--resume`** — Interrupted mid-pipeline? Just pass `--resume <output-dir>` and MAGI detects your progress from existing files. No manual bookkeeping, no fragile state files — your artifacts *are* the checkpoints.
- **Artifact contracts** — Before each phase, MAGI verifies that all required upstream files actually exist and aren't empty. Catches silent failures before they cascade into garbage outputs.
- **Standalone phase gates** — Every sub-skill (`/research-implement`, `/research-test`) now generates its own quality gate, even when run independently outside the main pipeline.
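The artifacts-as-checkpoints idea can be sketched roughly as follows. This is an illustrative sketch only, not the plugin's actual logic: the phase names and per-phase file lists below are assumptions based on the output structure shown later in this README.

```python
from pathlib import Path

# Hypothetical phase -> required-artifact mapping. The real plugin's
# contracts may track different or additional files per phase.
PHASE_ARTIFACTS = {
    "brainstorm": ["brainstorm/synthesis.md"],
    "plan": ["plan/research_plan.md", "plan/phase_gate.md"],
    "implement": ["src/phase_gate.md"],
    "test": ["tests/phase_gate.md", "plots/plot_manifest.json"],
    "report": ["report.md"],
}

def detect_resume_phase(output_dir: str) -> str:
    """Return the first phase whose required artifacts are missing or empty."""
    root = Path(output_dir)
    for phase, artifacts in PHASE_ARTIFACTS.items():
        for rel in artifacts:
            f = root / rel
            # Empty files count as failures too: catches silent truncation
            # before it cascades into downstream phases.
            if not f.is_file() or f.stat().st_size == 0:
                return phase
    return "done"  # every phase's artifacts exist and are non-empty
```

Because the check is plain filesystem inspection, resuming needs no state file: re-running with `--resume` re-derives progress from whatever artifacts survived the crash.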
### Stress-Tested by Design (v0.3.0)
- **Weighted direction scoring** — Rank research ideas by novelty, feasibility, impact, rigor, and scalability with domain-tuned or custom weights
- **Dynamic persona casting** — Each model gets a topic-specific expert identity (e.g., *"Bayesian statistician with causal inference expertise"*), sharpening ideation
- **Adversarial debate** — At `--depth high`, models defend, concede, or revise on their top disagreements before synthesis
- **Murder board** — Gemini attacks the research plan as a hostile reviewer; Claude documents mitigations for every flaw found
- **Phase gates** — Automated quality checkpoints with conditional MAGI mini-review before each user approval step
- **Depth control** — `--depth low` for fast/cheap runs, `medium` (default) for standard review, `high` for full adversarial analysis
### Core Capabilities
- **Publication-quality plots** — `matplotlib` + `scienceplots` (Nature theme), PNG 300 dpi + vector PDF
- **MAGI traceability review** — All three models cross-verify the final report for orphaned claims and figures
- **Report gap detection** — Auto-generates missing visualizations from existing data
- **Domain templates** — Built-in context for Physics, AI/ML, Statistics, Mathematics, and Paper Writing
- **Journal strategy** — Venue recommendations for [Physics](docs/journal-strategies.md#particle-physics-phenomenology), [AI/ML](docs/journal-strategies.md#aiml-conferences--journals), and [Interdisciplinary](docs/journal-strategies.md#interdisciplinary-science-ml--natural-sciences) research
- **LaTeX math** — Proper inline and display equations across all outputs
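For a sense of what the plot styling above involves, here is a minimal standalone sketch using `matplotlib` + `scienceplots` (the package installed as `SciencePlots`). It is not the plugin's actual plotting code; the filenames and the figure contents are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, safe for scripts and CI
import matplotlib.pyplot as plt
import numpy as np

try:
    import scienceplots  # noqa: F401  (importing registers the styles)
    # "no-latex" skips the LaTeX text renderer; drop it if LaTeX is installed.
    plt.style.use(["science", "nature", "no-latex"])
except ImportError:
    pass  # fall back to matplotlib defaults if SciencePlots is absent

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label=r"$\sin(x)$")
ax.set_xlabel(r"$x$")
ax.set_ylabel(r"$\sin(x)$")
ax.legend()

fig.savefig("demo.png", dpi=300)  # raster for previews
fig.savefig("demo.pdf")           # vector for publication
```

The PNG/PDF pair mirrors what the pipeline writes to `plots/`: a 300 dpi raster for quick viewing plus a vector PDF for manuscripts.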
Under the hood
- **Plot manifest** — Structured `plot_manifest.json` with metadata, section hints, and captions for automated report integration
- **Gemini fallback chain** — Resilient 3-tier model fallback: `gemini-3.1-pro-preview` → `gemini-3-pro-preview` → `gemini-2.5-pro`
- **Cross-phase artifact contracts** — Each phase validates incoming artifacts before running (tool-based Glob/Read, not LLM guesswork)
- **Depth-controlled token budget** — `--depth low` skips cross-review for fast/cheap runs; `--depth high` enables full adversarial debate
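The fallback chain amounts to trying each model in order until one responds. A rough sketch, with only the model names taken from the list above (the call interface is a stand-in, not the plugin's real MCP plumbing):

```python
GEMINI_FALLBACK_CHAIN = [
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
    "gemini-2.5-pro",
]

def ask_gemini_with_fallback(prompt, call_model):
    """Try each model in the chain; return (model, response) on first success.

    `call_model(model, prompt)` is a placeholder for the actual MCP call and
    is expected to raise on failure (quota, availability, timeout, ...).
    """
    last_err = None
    for model in GEMINI_FALLBACK_CHAIN:
        try:
            return model, call_model(model, prompt)
        except Exception as err:  # real code would catch narrower error types
            last_err = err
    raise RuntimeError("all Gemini models in the fallback chain failed") from last_err
```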
## Usage
| Command | Description |
|:---|:---|
| `/magi-researchers:research "topic"` | Full pipeline (all 5 phases) |
| `/magi-researchers:research-brainstorm "topic"` | Brainstorming with cross-verification |
| `/magi-researchers:research-implement` | Implementation (needs existing plan) |
| `/magi-researchers:research-test` | Testing & visualization |
| `/magi-researchers:research-report` | Report generation |
### Flags
| Flag | Values | Default | Description |
|:---|:---|:---|:---|
| `--domain` | `physics` `ai_ml` `statistics` `mathematics` `paper` | auto-inferred | Research domain for context and weight defaults |
| `--weights` | JSON object | domain default | Custom scoring weights (keys: `novelty`, `feasibility`, `impact`, `rigor`, `scalability`) |
| `--depth` | `low` `medium` `high` | `medium` | Review thoroughness — controls cross-review and adversarial debate |
| `--resume` | `<output-dir>` | — | Resume an interrupted pipeline from the last completed phase |
```bash
# Quick brainstorm with default settings
/magi-researchers:research "neural ODE solvers for stiff systems" --domain physics
# Deep analysis with custom weights and adversarial debate
/magi-researchers:research "causal inference in observational studies" --domain statistics --depth high --weights '{"novelty":0.2,"rigor":0.4,"feasibility":0.2,"impact":0.15,"scalability":0.05}'
# Resume a crashed session — MAGI picks up where you left off
/magi-researchers:research "neural ODE solvers" --resume outputs/neural_ode_solvers_20260225_v1
# Fast ideation only (no cross-review, lowest cost)
/magi-researchers:research-brainstorm "transformer alternatives for long sequences" --domain ai_ml --depth low
```
> If MAGI saves you research time, consider leaving a [star](https://github.com/Axect/magi-researchers/stargazers) so other researchers can find it.
### Output Structure
```
outputs/{topic_YYYYMMDD_vN}/
├── brainstorm/ # Personas, ideas, cross-reviews, debate, synthesis
├── plan/ # Research plan, murder board, mitigations, phase gate
├── src/ # Implementation + phase gate
├── tests/ # Test suite + phase gate
├── plots/ # PNG + PDF + plot_manifest.json
└── report.md # Final structured report
```
Each phase produces artifacts that double as resume checkpoints — just pass `--resume` to continue from where you left off.
Full artifact tree
```
brainstorm/
├── weights.json # Scoring weights
├── personas.md # Expert personas
├── gemini_ideas.md # Gemini brainstorm
├── codex_ideas.md # Codex brainstorm
├── gemini_review_of_codex.md # Cross-review (depth ≥ medium)
├── codex_review_of_gemini.md # Cross-review (depth ≥ medium)
├── debate_round2_gemini.md # Adversarial debate (depth = high)
├── debate_round2_codex.md # Adversarial debate (depth = high)
└── synthesis.md # Weighted synthesis
plan/
├── research_plan.md # Research plan
├── murder_board.md # Plan stress-test
├── mitigations.md # Flaw mitigations
└── phase_gate.md # Plan quality gate
src/
├── *.py # Research implementation
└── phase_gate.md # Implementation quality gate
tests/
├── test_*.py # Test suite
└── phase_gate.md # Test quality gate
```
Recommended Permissions
Add to `.claude/settings.local.json`:
```json
{
"permissions": {
"allow": [
"Bash(uv:*)",
"Bash(uv run:*)",
"Bash(uv run python3:*)",
"Bash(uv add:*)",
"Bash(uv sync:*)",
"Bash(mkdir:*)",
"mcp__gemini-cli__ask-gemini",
"mcp__gemini-cli__brainstorm",
"mcp__codex-cli__ask-codex",
"mcp__codex-cli__brainstorm",
"mcp__plugin_context7_context7__resolve-library-id",
"mcp__plugin_context7_context7__query-docs"
]
}
}
```
## Roadmap
**Shipped:** Multi-model brainstorming & cross-verification • Domain & journal strategy templates • Plot manifest & gap detection • MAGI traceability review • LaTeX math & Gemini fallback chain • Weighted scoring & dynamic personas • Adversarial debate • Murder board & phase gates • Depth-controlled token budget • Session resume (`--resume`) • Artifact contract validation • Standalone phase gates for all sub-skills
**Up next:**
- [ ] Example artifact gallery — real research outputs to showcase the pipeline
- [ ] Terminal demo GIF — one-command walkthrough
- [ ] More domain & journal strategy templates
- [ ] Ubiquitous Context7 — live doc lookups during testing and report writing, not just implementation
- [ ] Conditional variance enforcement — smart error-bar policy that catches missing uncertainty without blocking exploratory runs
- [ ] Cost estimation — token budget preview before execution
## Contributing
Contributions welcome — especially new domain templates. See [CONTRIBUTING.md](CONTRIBUTING.md).
## License
[MIT](LICENSE)