https://github.com/axect/magi-researchers

Three AI models, one synthesis — Claude, Gemini & Codex cross-verify each other for rigorous multi-perspective research

ai-research claude-code claude-code-plugin codex cross-verification gemini mcp multi-model research-automation scientific-computing

Host: GitHub
URL: https://github.com/axect/magi-researchers
Owner: Axect
License: MIT
Created: 2026-02-18T11:05:28.000Z (10 days ago)
Default Branch: main
Last Pushed: 2026-02-23T07:46:01.000Z (5 days ago)
Last Synced: 2026-02-23T15:56:36.622Z (5 days ago)
Topics: ai-research, claude-code, claude-code-plugin, codex, cross-verification, gemini, mcp, multi-model, research-automation, scientific-computing
Size: 257 KB
Stars: 9
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

README

MAGI Architecture: Claude, Gemini, and Codex orchestrating cross-verified research through brainstorm, plan, implement, test, and report phases

# MAGI Researchers

Three AI models. One synthesis. Zero lost progress.

Multi-model research pipeline for Claude Code — Claude, Gemini, and Codex debate, cross-verify, and synthesize publication-ready artifacts.

Get Started • Why MAGI? • Features • Usage • Roadmap • Changelog

---

> *Like the MAGI system in Evangelion — three supercomputers cross-verifying each other — this plugin orchestrates Claude, Gemini, and Codex for rigorous, multi-perspective research.*

## Get Started

**Prerequisites:** [Claude Code](https://docs.anthropic.com/en/docs/claude-code) + Python 3.11+ with [uv](https://docs.astral.sh/uv/) + [Gemini CLI](https://github.com/google-gemini/gemini-cli) + [Codex CLI](https://github.com/openai/codex)

**1. Install the plugin** (inside Claude Code):
```
/plugin marketplace add Axect/magi-researchers
/plugin install magi-researchers@magi-researchers-marketplace
```

**2. Set up MCP servers** (one-time):
```bash
claude mcp add -s user gemini-cli -- npx -y gemini-mcp-tool
claude mcp add -s user codex-cli -- npx -y @cexll/codex-mcp-server
claude mcp add -s user context7 -- npx -y @upstash/context7-mcp@latest
```

**3. Run your first research:**
```
/magi-researchers:research "your research topic" --domain physics
```

MAGI generates cross-verified hypotheses, writes implementation code, renders publication-quality plots, and synthesizes a structured report — all saved to `outputs/{topic}/`.

**Alternative: Local Development**

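```bash
# Clone the repository
git clone https://github.com/Axect/magi-researchers.git
# Launch Claude Code with the plugin loaded from your local checkout
claude --plugin-dir /path/to/magi-researchers
# Add the plotting dependencies to your project
uv add matplotlib SciencePlots numpy
```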

## Why MAGI?

Single-model research has blind spots. One model hallucinates a citation or misses a critical constraint — and nobody catches it.

| | **Single Model** | **MAGI (3 Models)** |
|:---|:---|:---|
| **Brainstorming** | One perspective | Three independent perspectives |
| **Verification** | Self-review (unreliable) | Cross-model peer review |
| **Blind spots** | Undetected | Caught by competing models |
| **Output** | Raw text | Structured report with consensus & divergence analysis |

- **Claude** — *The Scientist.* Synthesis, planning, implementation, report generation.
- **Gemini** — *The Critic.* Creative brainstorming, cross-verification, broad knowledge.
- **Codex** — *The Builder.* Feasibility analysis, code review, implementation focus.

## Features

### Research Pipeline

| Phase | What Happens | Output |
|:---|:---|:---|
| **1. Brainstorm** | Three models generate and cross-review ideas with expert personas | `brainstorm/` |
| **2. Plan** | Concrete research plan, stress-tested by a hostile reviewer | `plan/` |
| **3. Implement** | Claude writes code with live library doc lookups | `src/` |
| **4. Test & Visualize** | Collaborative test design + publication-quality plots | `tests/` + `plots/` |
| **5. Report** | Structured report with cross-verified claim-evidence integrity | `report.md` |

### Never Lose Your Work (v0.4.0)

Long research sessions crash. Context windows expire. Networks drop. Now you can pick up right where you left off:

- **`--resume`** — Interrupted mid-pipeline? Pass `--resume` followed by the output directory of the interrupted run, and MAGI detects your progress from the existing files. No manual bookkeeping, no fragile state files — your artifacts *are* the checkpoints.
- **Artifact contracts** — Before each phase, MAGI verifies that all required upstream files actually exist and aren't empty. Catches silent failures before they cascade into garbage outputs.
- **Standalone phase gates** — Every sub-skill (`/research-implement`, `/research-test`) now generates its own quality gate, even when run independently outside the main pipeline.
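
The resume and artifact-contract ideas above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's implementation: the `PHASE_ARTIFACTS` mapping and the function names are hypothetical, and the file paths are only loosely based on the artifact tree shown later in this README.

```python
from pathlib import Path
from typing import Optional

# Hypothetical phase -> required-artifact mapping (an assumption for this
# sketch); the plugin's real contracts may list different files.
PHASE_ARTIFACTS = {
    "brainstorm": ["brainstorm/synthesis.md"],
    "plan": ["plan/research_plan.md", "plan/phase_gate.md"],
    "implement": ["src/phase_gate.md"],
    "test": ["tests/phase_gate.md"],
    "report": ["report.md"],
}


def is_valid_artifact(path: Path) -> bool:
    """An artifact counts only if it is an existing, non-empty file."""
    return path.is_file() and path.stat().st_size > 0


def next_phase(output_dir: Path) -> Optional[str]:
    """Return the first phase whose artifacts are missing or empty.

    Returns None when every phase already has all of its artifacts.
    """
    for phase, artifacts in PHASE_ARTIFACTS.items():
        if not all(is_valid_artifact(output_dir / rel) for rel in artifacts):
            return phase
    return None
```

Because the check looks only at files on disk, an empty `research_plan.md` left behind by a crashed run is treated the same as a missing one, which is the kind of silent failure the artifact contracts are meant to catch.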

### Stress-Tested by Design (v0.3.0)

- **Weighted direction scoring** — Rank research ideas by novelty, feasibility, impact, rigor, and scalability with domain-tuned or custom weights
- **Dynamic persona casting** — Each model gets a topic-specific expert identity (e.g., *"Bayesian statistician with causal inference expertise"*), sharpening ideation
- **Adversarial debate** — At `--depth high`, models defend, concede, or revise on their top disagreements before synthesis
- **Murder board** — Gemini attacks the research plan as a hostile reviewer; Claude documents mitigations for every flaw found
- **Phase gates** — Automated quality checkpoints with conditional MAGI mini-review before each user approval step
- **Depth control** — `--depth low` for fast/cheap runs, `medium` (default) for standard review, `high` for full adversarial analysis
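
To make weighted direction scoring concrete, here is a minimal sketch of a weighted average over the five criteria. It is an assumption about how such a score could be computed, not MAGI's actual formula: the function names, the per-criterion score scale, and the normalization by total weight are all hypothetical.

```python
from typing import List, Mapping

# The five criteria named in this README.
CRITERIA = ("novelty", "feasibility", "impact", "rigor", "scalability")


def weighted_score(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted average of per-criterion scores (normalization is an assumption)."""
    missing = [c for c in CRITERIA if c not in scores or c not in weights]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[c] * weights[c] for c in CRITERIA) / total_weight


def rank_directions(
    directions: Mapping[str, Mapping[str, float]],
    weights: Mapping[str, float],
) -> List[str]:
    """Return direction names ordered from highest to lowest weighted score."""
    return sorted(
        directions,
        key=lambda name: weighted_score(directions[name], weights),
        reverse=True,
    )
```

With rigor weighted at 0.4, as in the `--weights` example under Usage, a direction that scores high on rigor outranks one that is merely average everywhere, which is the behavior you would want for a statistics-heavy topic.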

### Core Capabilities

- **Publication-quality plots** — `matplotlib` + `scienceplots` (Nature theme), PNG 300 dpi + vector PDF
- **MAGI traceability review** — All three models cross-verify the final report for orphaned claims and figures
- **Report gap detection** — Auto-generates missing visualizations from existing data
- **Domain templates** — Built-in context for Physics, AI/ML, Statistics, Mathematics, and Paper Writing
- **Journal strategy** — Venue recommendations for [Physics](docs/journal-strategies.md#particle-physics-phenomenology), [AI/ML](docs/journal-strategies.md#aiml-conferences--journals), and [Interdisciplinary](docs/journal-strategies.md#interdisciplinary-science-ml--natural-sciences) research
- **LaTeX math** — Proper inline and display equations across all outputs

**Under the hood**

- **Plot manifest** — Structured `plot_manifest.json` with metadata, section hints, and captions for automated report integration
- **Gemini fallback chain** — Resilient 3-tier model fallback: `gemini-3.1-pro-preview` → `gemini-3-pro-preview` → `gemini-2.5-pro`
- **Cross-phase artifact contracts** — Each phase validates incoming artifacts before running (tool-based Glob/Read, not LLM guesswork)
- **Depth-controlled token budget** — `--depth low` skips cross-review for fast/cheap runs; `--depth high` enables full adversarial debate
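
A fallback chain like the one listed above can be sketched as a loop that tries each model in order. This is an illustrative sketch only: `ask_with_fallback` and the `call_model` callback are hypothetical names, and how the plugin actually invokes Gemini through its MCP server is not shown in this README.

```python
from typing import Callable, List, Sequence, Tuple

# Model names copied from the fallback chain in this README.
GEMINI_FALLBACK_CHAIN = (
    "gemini-3.1-pro-preview",
    "gemini-3-pro-preview",
    "gemini-2.5-pro",
)


def ask_with_fallback(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = GEMINI_FALLBACK_CHAIN,
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success.

    `call_model(model, prompt)` is a caller-supplied function (hypothetical
    here) that raises an exception when the model is unavailable.
    """
    errors: List[str] = []
    for model in models:
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Returning the model name alongside the reply keeps it visible which tier actually answered, which matters when comparing outputs across runs.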

## Usage

| Command | Description |
|:---|:---|
| `/magi-researchers:research "topic"` | Full pipeline (all 5 phases) |
| `/magi-researchers:research-brainstorm "topic"` | Brainstorming with cross-verification |
| `/magi-researchers:research-implement` | Implementation (needs existing plan) |
| `/magi-researchers:research-test` | Testing & visualization |
| `/magi-researchers:research-report` | Report generation |

### Flags

| Flag | Values | Default | Description |
|:---|:---|:---|:---|
| `--domain` | `physics` `ai_ml` `statistics` `mathematics` `paper` | auto-inferred | Research domain for context and weight defaults |
| `--weights` | JSON object | domain default | Custom scoring weights (keys: `novelty`, `feasibility`, `impact`, `rigor`, `scalability`) |
| `--depth` | `low` `medium` `high` | `medium` | Review thoroughness — controls cross-review and adversarial debate |
| `--resume` | path to an existing output directory, e.g. `outputs/neural_ode_solvers_20260225_v1` | — | Resume an interrupted pipeline from the last completed phase |

```bash
# Full pipeline with default settings
/magi-researchers:research "neural ODE solvers for stiff systems" --domain physics

# Deep analysis with custom weights and adversarial debate
/magi-researchers:research "causal inference in observational studies" --domain statistics --depth high --weights '{"novelty":0.2,"rigor":0.4,"feasibility":0.2,"impact":0.15,"scalability":0.05}'

# Resume a crashed session — MAGI picks up where you left off
/magi-researchers:research "neural ODE solvers" --resume outputs/neural_ode_solvers_20260225_v1

# Fast ideation only (no cross-review, lowest cost)
/magi-researchers:research-brainstorm "transformer alternatives for long sequences" --domain ai_ml --depth low
```

> If MAGI saves you research time, consider leaving a [star](https://github.com/Axect/magi-researchers/stargazers) so other researchers can find it.

### Output Structure

```
outputs/{topic_YYYYMMDD_vN}/
├── brainstorm/ # Personas, ideas, cross-reviews, debate, synthesis
├── plan/ # Research plan, murder board, mitigations, phase gate
├── src/ # Implementation + phase gate
├── tests/ # Test suite + phase gate
├── plots/ # PNG + PDF + plot_manifest.json
└── report.md # Final structured report
```

Each phase produces artifacts that double as resume checkpoints — just pass `--resume` to continue from where you left off.
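
The `{topic_YYYYMMDD_vN}` directory pattern can be reproduced with a small helper. This sketch is an assumption based only on the pattern above and the `neural_ode_solvers_20260225_v1` example: the slug rules and the version-increment logic are guesses, and `slugify` and `new_output_dir` are hypothetical names.

```python
import re
from datetime import date
from pathlib import Path


def slugify(topic: str) -> str:
    """Lowercase the topic and join alphanumeric runs with underscores (assumed rule)."""
    return re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")


def new_output_dir(topic: str, root: Path, today: date) -> Path:
    """Build root/{slug}_{YYYYMMDD}_vN, picking the first N that does not exist yet."""
    stem = f"{slugify(topic)}_{today:%Y%m%d}"
    version = 1
    while (root / f"{stem}_v{version}").exists():
        version += 1
    return root / f"{stem}_v{version}"
```

Under these assumptions, rerunning the same topic on the same day yields `_v2`, `_v3`, and so on, so an earlier run's artifacts are never overwritten.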

**Full artifact tree**

```
brainstorm/
├── weights.json # Scoring weights
├── personas.md # Expert personas
├── gemini_ideas.md # Gemini brainstorm
├── codex_ideas.md # Codex brainstorm
├── gemini_review_of_codex.md # Cross-review (depth ≥ medium)
├── codex_review_of_gemini.md # Cross-review (depth ≥ medium)
├── debate_round2_gemini.md # Adversarial debate (depth = high)
├── debate_round2_codex.md # Adversarial debate (depth = high)
└── synthesis.md # Weighted synthesis

plan/
├── research_plan.md # Research plan
├── murder_board.md # Plan stress-test
├── mitigations.md # Flaw mitigations
└── phase_gate.md # Plan quality gate

src/
├── *.py # Research implementation
└── phase_gate.md # Implementation quality gate

tests/
├── test_*.py # Test suite
└── phase_gate.md # Test quality gate
```
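
The depth annotations in the tree above imply which brainstorm files to expect for a given `--depth`. The sketch below encodes that as data. It assumes that files without a depth annotation are produced at every depth, which the tree does not state explicitly, and `expected_brainstorm_files` is a hypothetical helper name.

```python
from typing import List

DEPTH_ORDER = {"low": 0, "medium": 1, "high": 2}

# (filename, minimum depth) pairs taken from the artifact tree above.
# Files without an annotation are assumed to exist at every depth.
BRAINSTORM_ARTIFACTS = [
    ("weights.json", "low"),
    ("personas.md", "low"),
    ("gemini_ideas.md", "low"),
    ("codex_ideas.md", "low"),
    ("gemini_review_of_codex.md", "medium"),
    ("codex_review_of_gemini.md", "medium"),
    ("debate_round2_gemini.md", "high"),
    ("debate_round2_codex.md", "high"),
    ("synthesis.md", "low"),
]


def expected_brainstorm_files(depth: str = "medium") -> List[str]:
    """List the brainstorm/ files expected at the given depth."""
    if depth not in DEPTH_ORDER:
        raise ValueError(f"unknown depth: {depth!r}")
    level = DEPTH_ORDER[depth]
    return [name for name, minimum in BRAINSTORM_ARTIFACTS if DEPTH_ORDER[minimum] <= level]
```

A table like this is also a convenient input for an artifact-contract check, since a missing `debate_round2_*.md` is only an error when the run used `--depth high`.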

**Recommended Permissions**

Add to `.claude/settings.local.json`:

```json
{
  "permissions": {
    "allow": [
      "Bash(uv:*)",
      "Bash(uv run:*)",
      "Bash(uv run python3:*)",
      "Bash(uv add:*)",
      "Bash(uv sync:*)",
      "Bash(mkdir:*)",
      "mcp__gemini-cli__ask-gemini",
      "mcp__gemini-cli__brainstorm",
      "mcp__codex-cli__ask-codex",
      "mcp__codex-cli__brainstorm",
      "mcp__plugin_context7_context7__resolve-library-id",
      "mcp__plugin_context7_context7__query-docs"
    ]
  }
}
```

## Roadmap

**Shipped:** Multi-model brainstorming & cross-verification • Domain & journal strategy templates • Plot manifest & gap detection • MAGI traceability review • LaTeX math & Gemini fallback chain • Weighted scoring & dynamic personas • Adversarial debate • Murder board & phase gates • Depth-controlled token budget • Session resume (`--resume`) • Artifact contract validation • Standalone phase gates for all sub-skills

**Up next:**
- [ ] Example artifact gallery — real research outputs to showcase the pipeline
- [ ] Terminal demo GIF — one-command walkthrough
- [ ] More domain & journal strategy templates
- [ ] Ubiquitous Context7 — live doc lookups during testing and report writing, not just implementation
- [ ] Conditional variance enforcement — smart error-bar policy that catches missing uncertainty without blocking exploratory runs
- [ ] Cost estimation — token budget preview before execution

## Contributing

Contributions welcome — especially new domain templates. See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

[MIT](LICENSE)
