https://github.com/andre-inter-collab-llc/research-workflow-assistant

Open-source AI research assistant for VS Code + GitHub Copilot. Connects to PubMed, OpenAlex, Semantic Scholar, Europe PMC, CrossRef, and Zotero via MCP servers. Custom agents guide systematic reviews, academic writing, data analysis, and project management — all ICMJE-compliant with full audit trails.
Host: GitHub
URL: https://github.com/andre-inter-collab-llc/research-workflow-assistant
Owner: andre-inter-collab-llc
License: mit
Created: 2026-03-08T14:23:20.000Z (3 months ago)
Default Branch: master
Last Pushed: 2026-04-01T18:42:16.000Z (3 months ago)
Last Synced: 2026-04-03T05:09:38.755Z (2 months ago)
Topics: academic-writing, crossref, europe-pmc, github-copilot, icmje, literature-review, mcp-server, meta-analysis, model-context-protocol, openalex, prisma, pubmed, quarto, reference-management, reproducible-research, research-assistant, semantic-scholar, systematic-review, vscode, zotero
Language: HTML
Homepage:
Size: 5.82 MB
Stars: 8
Watchers: 0
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
Awesome Lists containing this project

- awesome-medical-ai-skills (Medical MCP Servers)
README

# Research Workflow Assistant

An open-source, modular AI research assistant that runs inside **[VS Code](https://code.visualstudio.com/) + [GitHub Copilot](https://github.com/features/copilot)**. It connects to academic databases via MCP (Model Context Protocol) servers and encodes research best practices through custom Copilot agents. Built for reproducibility, ICMJE compliance, and human-centered research.

All RWA outputs — manuscripts, protocols, reports, analysis scripts, dashboards, and progress briefs — use **[Quarto](https://quarto.org/)** (by [Posit](https://posit.co/)) as the default document format. Quarto supports R and Python code execution, multi-format rendering (HTML, PDF, Word, PowerPoint, dashboards, websites, books, slides), native Mermaid diagrams, and built-in bibliography management. See [docs/posit-quarto-guide.md](docs/posit-quarto-guide.md) for the full ecosystem guide.

> **Model note:** This project was developed and tested using **Claude Opus 4.6** and **GPT-5.3-Codex** in GitHub Copilot agent mode. You can switch between models depending on task type and preference. Other models available in Copilot ([model comparison](https://docs.github.com/en/copilot/reference/ai-models/model-comparison)) may also work, but behavior can vary by agent workflow, so validate critical outputs after switching.

> **First time here?** Start with [docs/quick-start.md](docs/quick-start.md),
> or open Copilot Chat and type `@setup` for an interactive guided setup.
> If setup is complete and something is not working, use `@troubleshooter` for diagnostics and issue resolution.
> For the full walkthrough, see [docs/getting-started.md](docs/getting-started.md).

> **Important:** Before using non-setup agents, you must accept the user disclaimer once via `@setup`.

## Philosophy

The question isn't which AI research tool to adopt. It's whether researchers have the right tech stack to build their own.

RWA is a proof of concept for what becomes possible when researchers have a capable IDE, open-source tools, and access to a capable LLM. If your researchers already have VS Code, R, Python, Quarto, Markdown, and git, they have the building blocks. Give them LLM access, and someone on the team could build a custom, organization-compliant research workflow assistant in a few weeks, with MCP servers tailored to their specific needs.

Any specific implementation is going to be opinionated. This one integrates with Zotero, enforces ICMJE authorship guidelines, and defaults to Quarto for reproducible documents. Your organization might make entirely different choices, and that's fine.

What matters is **not locking researchers into a single platform**. Tools that sound impressive but limit you to one documentation system, one output format, or one way of working impose exactly the kind of constraint that slows people down. The better path: meet researchers where they already are. Make their existing environment AI-capable. Don't replace their tools; connect them.

The stack already exists:

- **[Positron](https://positron.posit.co/) / [VS Code](https://code.visualstudio.com/)** for the IDE

- **[R](https://www.r-project.org/) and [Python](https://www.python.org/)** for analysis

- **[Quarto](https://quarto.org/) and Markdown** for reproducible documents

- **[Git](https://git-scm.com/)** for version control and collaboration

- **[MCP](https://modelcontextprotocol.io/)** for connecting LLMs to structured data sources

The missing piece for most organizations isn't a new product. It's access to a capable LLM within the tools researchers already use, and permission to experiment.

Read the [launch post on LinkedIn](https://www.linkedin.com/posts/andre-van-zyl_evidencesynthesis-systematicreview-publichealth-activity-7437474997612417024-yQ20) for the full backstory.

## Who Is This For?

Any researcher who wants AI-assisted support without surrendering intellectual ownership:

- **NGO and public sector researchers** managing evidence reviews or program evaluations

- **Government analysts** producing policy briefs backed by systematic evidence

- **Academic faculty and postdocs** running systematic reviews or multi-study projects

- **Independent researchers and consultants** needing structured, reproducible workflows

- **Research organizations** wanting standardized, auditable research processes

No PhD required. If you do research, this tool is for you.

## What It Does

| Capability | How |
|---|---|
| **Systematic literature reviews** | `@systematic-reviewer` agent guides PRISMA-compliant workflows: question refinement (PICO/PEO/SPIDER), search strategy development, database searching, screening, data extraction, risk of bias |
| **Academic database access** | MCP servers for PubMed, OpenAlex, Semantic Scholar, Europe PMC, and CrossRef |
| **Reference management** | Zotero MCP server: search library, add items by DOI, tag, organize collections, export BibTeX |
| **Data analysis** | `@data-analyst` agent generates reproducible R or Python analysis scripts in Quarto documents |
| **Academic writing** | `@academic-writer` agent scaffolds IMRaD manuscripts, manages citations, enforces ICMJE AI disclosure |
| **Research planning** | `@research-planner` agent helps with protocols, ethics applications, study design, grant writing |
| **Project management** | `@project-manager` agent tracks phases, milestones, tasks, decisions; generates progress briefs for colleagues |
| **End-to-end orchestration** | `@research-orchestrator` routes workflows across specialist agents, tracks stage progression, and provides ready-to-run handoff prompts |
| **Verification and reproducibility** | `@verification-coordinator` agent co-develops human-friendly, LLM-executable verification workbooks, tracks verifier preferences, and standardizes checkpoint evidence outputs |
| **Troubleshooting and support** | `@troubleshooter` agent diagnoses environment and MCP issues, validates API keys, and provides practical how-to help for day-to-day RWA usage |
| **Development and bug fixes** | `@developer` agent gathers requirements for bug fixes, feature requests, and codebase improvements, then directs to plan mode for implementation |
| **Chat session export** | Export Copilot Chat conversations to QMD for reproducibility via `scripts/export_chat_session.py` or `chat-exporter` MCP server |
| **ICMJE compliance** | Built into every agent: human-in-the-loop mandate, audit trail, AI disclosure generation, authorship checklist |

## Architecture

```mermaid

graph TB

    YOU["👤 You · the Researcher
All decisions · All ownership · All accountability"]

    VSCODE["VS Code + GitHub Copilot Chat"]

    YOU --> VSCODE

    subgraph AGENTS["Specialist AI Agents"]

        direction LR

        ORCH["@research-orchestrator
End-to-end
workflow routing"] ~~~ SR["@systematic-reviewer
PRISMA-compliant
evidence reviews"] ~~~ RP["@research-planner
Protocols &
study design"] ~~~ DA["@data-analyst
Reproducible R / Python
analysis scripts"] ~~~ AW["@academic-writer
Manuscript drafting
& citations"] ~~~ PM["@project-manager
Milestones, decisions
& progress briefs"] ~~~ VC["@verification-coordinator
Human + LLM
verification workbooks"] ~~~ TS["@troubleshooter
Diagnostics &
environment fixes"] ~~~ DEV["@developer
Bug fixes &
feature planning"]

    end

    VSCODE --> AGENTS

    ICMJE["🔒 ICMJE Compliance Layer
Human-in-the-loop · Audit trail · AI disclosure"]

    AGENTS --> ICMJE

    subgraph MCP["MCP Servers · Model Context Protocol"]

        direction LR

        subgraph LITERATURE["Literature Databases"]

            direction LR

            PUB["PubMed
NCBI E-utilities"] ~~~ OA["OpenAlex
REST API"] ~~~ SS["Semantic Scholar
Academic Graph"] ~~~ EPMC["Europe PMC
REST API"] ~~~ CR["CrossRef
DOI metadata"]

        end

        subgraph REFERENCE["Reference Management"]

            direction LR

            ZOT["Zotero Web
API v3"] ~~~ ZLOC["Zotero Local
PDFs & annotations"]

        end

        subgraph TRACKING["Project Tracking"]

            direction LR

            PRISMA["PRISMA Tracker
flow diagrams"] ~~~ PROJ["Project Tracker
tasks & milestones"] ~~~ CHATEX["Chat Exporter
session audit trails"]

        end

    end

    ICMJE --> MCP

    subgraph OUTPUTS["Research Outputs"]

        direction LR

        QMD["📄 Quarto Documents
Manuscripts · Protocols · Reports"] ~~~ SCRIPTS["📊 Analysis Scripts
R · Python · Reproducible"] ~~~ PFLOW["📋 PRISMA Flow
Diagrams & Checklists"] ~~~ BRIEFS["📝 Progress Briefs
Decision logs · Meeting notes"]

    end

    MCP --> OUTPUTS

    %% ── Styles ──

    classDef researcher fill:#2563eb,stroke:#1e40af,color:#fff,font-weight:bold

    classDef vscode fill:#007acc,stroke:#005a9e,color:#fff,font-weight:bold

    classDef agent fill:#7c3aed,stroke:#5b21b6,color:#fff

    classDef compliance fill:#dc2626,stroke:#991b1b,color:#fff,font-weight:bold

    classDef litdb fill:#10b981,stroke:#047857,color:#fff

    classDef refmgmt fill:#14b8a6,stroke:#0d9488,color:#fff

    classDef tracking fill:#06b6d4,stroke:#0891b2,color:#fff

    classDef output fill:#d97706,stroke:#b45309,color:#fff

    class YOU researcher

    class VSCODE vscode

    class ORCH,SR,RP,DA,AW,PM,VC,TS,DEV agent

    class ICMJE compliance

    class PUB,OA,SS,EPMC,CR litdb

    class ZOT,ZLOC refmgmt

    class PRISMA,PROJ,CHATEX tracking

    class QMD,SCRIPTS,PFLOW,BRIEFS output

```

> Also available as [SVG](docs/rwa-architecture.svg), [Quarto source](docs/architecture-diagram.qmd) (render to HTML with `quarto render docs/architecture-diagram.qmd`), or [Mermaid source](docs/rwa-architecture.mmd).

## ICMJE Compliance: You Are the Author

This tool is designed around the [ICMJE authorship criteria](https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html). AI cannot be an author. You must meet all four criteria:

1. **Substantial contributions** to conception, design, data acquisition, analysis, or interpretation

2. **Drafting or critically revising** the work for important intellectual content

3. **Final approval** of the version to be published

4. **Accountability** for all aspects of the work

The tool enforces this by:

- Requiring human decisions at every substantive step

- Tracking AI contributions in an audit trail (`ai-contributions-log.md`)

- Generating ICMJE-compliant AI disclosure statements for your manuscripts

- Refusing to finalize outputs without explicit human review

Per ICMJE Section II.A.4: AI use must be disclosed in acknowledgments (writing assistance) and methods (data analysis). This tool generates those disclosures for you.

Setup also captures a default author profile in [.rwa-user-config.yaml](.rwa-user-config.yaml), and new projects can store per-project `authors` metadata in [templates/project-config.yaml](templates/project-config.yaml) so future reports and manuscripts start with the correct author front matter.

When RWA itself is cited in a Methods or Acknowledgments section, use the `vanzyl2026rwa` BibTeX entry from [templates/rwa-citation.bib](templates/rwa-citation.bib).

## Disclaimer and Readiness Gate

RWA enforces a disclaimer/readiness gate before non-setup agent workflows.

- Source disclaimer text: [compliance/user-disclaimer.md](compliance/user-disclaimer.md)

- Acceptance state file: [.rwa-user-config.yaml](.rwa-user-config.yaml)

- Required value: `disclaimer_accepted: true` (boolean)

When accepted through `@setup`, `.rwa-user-config.yaml` should include values like:

```yaml
disclaimer_accepted: true
disclaimer_accepted_date: "YYYY-MM-DD"
setup_completed: true
setup_completed_date: "YYYY-MM-DD"
default_author:
  name: "Author Name"
  affiliation:
    name: "Organization"
```

If acceptance is missing or invalid, agents will return:

`Before using RWA, you need to review and accept the disclaimer. Run @setup to get started.`

If you see this message unexpectedly:

1. Confirm [.rwa-user-config.yaml](.rwa-user-config.yaml) exists at workspace root.

2. Confirm `disclaimer_accepted` is boolean `true` (not a quoted string).

3. Run `@setup` again to refresh config if needed.

4. Open a new Copilot Chat session after setup changes.
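The first two checks above can be approximated with a small script. This is a minimal sketch, not RWA's own gate logic: it assumes `disclaimer_accepted` is a top-level key in a flat YAML file, and it uses a line-based check so that no third-party YAML parser is needed. The function name `disclaimer_state` is hypothetical, and only `true`, `True`, and `TRUE` are treated as boolean true; a full YAML parser may accept other spellings.

```python
import re
from pathlib import Path

# Matches a top-level `disclaimer_accepted:` line and captures its raw value,
# ignoring trailing spaces and an optional trailing `# comment`.
_KEY = re.compile(r"^disclaimer_accepted:[ \t]*(.*?)[ \t]*(?:#.*)?$", re.MULTILINE)


def disclaimer_state(config_text: str) -> str:
    """Classify the raw value of `disclaimer_accepted` in the config text.

    Returns "accepted", "quoted-string", "not-accepted", or "missing".
    """
    match = _KEY.search(config_text)
    if match is None:
        return "missing"
    value = match.group(1)
    if value in ("true", "True", "TRUE"):
        return "accepted"
    # A quoted "true" is a string, not a boolean, so the gate would reject it.
    if value[:1] in ("'", '"') and value.strip("'\"").lower() == "true":
        return "quoted-string"
    return "not-accepted"


if __name__ == "__main__":
    path = Path(".rwa-user-config.yaml")
    if not path.exists():
        print("missing file: .rwa-user-config.yaml not found in this directory")
    else:
        print(disclaimer_state(path.read_text(encoding="utf-8")))
```

Run it from the workspace root. A result of `quoted-string` corresponds to check 2, and `missing` means the key is not in the file at all.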

## Quick Start

What setup includes (typically 20-30 minutes):

- Stage 1: Verify Python and VS Code prerequisites

- Stage 2: Create `.venv` and install all MCP servers

- Stage 3: Configure `.env` API keys and `PROJECTS_ROOT`

- Stage 4: Run setup validation + MCP smoke check

- Stage 5: Confirm servers in VS Code

- Stage 6: Save a default author profile for future outputs

- Stage 7: Optionally start a first project with project-specific authorship metadata

> **Prefer a guided setup?** Open Copilot Chat and type `@setup`. It will walk you through every step interactively.

### Prerequisites

| Requirement | Notes |
|---|---|
| [VS Code](https://code.visualstudio.com/) 1.99+ with [GitHub Copilot](https://github.com/features/copilot) | Agent mode must be enabled |
| [Python 3.11+](https://www.python.org/) | Required: runs the MCP servers |
| [R 4.0+](https://www.r-project.org/) | Optional: for R-based analysis templates |
| [Quarto](https://quarto.org/) | Optional: for rendering document templates |
| [Zotero](https://www.zotero.org/) | Optional: for reference management |

### Step 1 — Clone and open the repo

```bash
# Cloning your own fork? Substitute your fork's URL.
git clone https://github.com/andre-inter-collab-llc/research-workflow-assistant.git
cd research-workflow-assistant
code .
```

### Step 2 — Create a Python environment and install MCP servers

```bash
# Create a virtual environment
python -m venv .venv

# Activate it
# Windows (PowerShell):
#   & .venv\Scripts\Activate.ps1
# macOS / Linux:
source .venv/bin/activate

# Install the shared module and all 11 MCP servers in development mode.
# The trailing backslashes are bash line continuations; in PowerShell, use a
# backtick (`) instead or put the whole command on one line.
pip install -e mcp-servers/_shared \
            -e mcp-servers/pubmed-server \
            -e mcp-servers/openalex-server \
            -e mcp-servers/semantic-scholar-server \
            -e mcp-servers/europe-pmc-server \
            -e mcp-servers/crossref-server \
            -e mcp-servers/zotero-server \
            -e mcp-servers/zotero-local-server \
            -e mcp-servers/prisma-tracker \
            -e mcp-servers/project-tracker \
            -e mcp-servers/chat-exporter \
            -e mcp-servers/bibliography-manager

# Install dev tools (linting, testing)
pip install -e ".[dev]"
```

### Step 3 — Configure API keys

```bash
# Copy the example env file (macOS / Linux)
cp .env.example .env

# On Windows, run this instead:
#   copy .env.example .env
```

Open `.env` and add your credentials. At minimum:

| Key | Where to get it | Required? |
|---|---|---|
| `NCBI_API_KEY` | [NCBI account settings](https://www.ncbi.nlm.nih.gov/account/settings/) | Recommended |
| `OPENALEX_API_KEY` | [OpenAlex API key settings](https://openalex.org/settings/api-key) | Recommended |
| `ZOTERO_API_KEY` | [Zotero key settings](https://www.zotero.org/settings/keys) | If using Zotero |
| `ZOTERO_USER_ID` | Numeric ID shown at the top of the [Zotero keys page](https://www.zotero.org/settings/keys) (not your username) | If using Zotero |

Full details: [docs/api-setup-guide.md](docs/api-setup-guide.md)

`PROJECTS_ROOT` should normally remain `./my_projects` unless you explicitly want projects in another folder.
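To see at a glance which of the keys above are still empty, a short script along these lines may help. It is a sketch under assumptions: it handles only simple `KEY=VALUE` lines (no multi-line values or `export` prefixes), and the helper names `parse_env` and `missing_keys` are illustrative, not part of RWA. The bundled `scripts/validate_setup.py` remains the project's own check.

```python
from pathlib import Path

# Keys named in the table above; which ones you need depends on your setup.
EXPECTED_KEYS = ("NCBI_API_KEY", "OPENALEX_API_KEY", "ZOTERO_API_KEY", "ZOTERO_USER_ID")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, expected=EXPECTED_KEYS) -> list[str]:
    """Return the expected keys that are absent or set to an empty value."""
    env = parse_env(text)
    return [key for key in expected if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if env_path.exists():
        print("Unset keys:", missing_keys(env_path.read_text(encoding="utf-8")))
    else:
        print(".env not found; copy .env.example first")
```

An unset key is not necessarily a problem: the Zotero keys only matter if you use Zotero, and the NCBI and OpenAlex keys are listed as recommended rather than required.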

### Step 4 — Verify everything works

```bash

python scripts/validate_setup.py

```

Need JSON for automation?

```bash

python scripts/validate_setup.py --json

```

Or in VS Code: **Ctrl+Shift+P** → "MCP: List Servers" — all 11 servers should appear.
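If fewer servers appear than expected, it can help to compare against what the workspace configuration declares. The sketch below lists the names in `.vscode/mcp.json`, assuming the file follows the VS Code layout with a top-level `"servers"` object and contains plain JSON. If the file includes comments, `json.loads` will raise an error. The function name `configured_servers` is hypothetical.

```python
import json
from pathlib import Path


def configured_servers(mcp_json_text: str) -> list[str]:
    """Return the sorted server names found under the top-level "servers" key."""
    config = json.loads(mcp_json_text)
    return sorted(config.get("servers", {}))


if __name__ == "__main__":
    path = Path(".vscode/mcp.json")
    if path.exists():
        names = configured_servers(path.read_text(encoding="utf-8"))
        print(f"{len(names)} servers configured:", ", ".join(names))
    else:
        print(".vscode/mcp.json not found")
```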

### Step 5 — Start using it

If you want one entry point that coordinates all phases, start with:

```

@research-orchestrator I am starting a systematic review. Orchestrate the full workflow and tell me exactly which agent prompt to run at each stage.

```

Or, if you already know the stage, open Copilot Chat and call a specialist agent directly:

```

@project-manager Initialize a new project called "my-first-review" in my_projects/my-first-review.

```

See [docs/getting-started.md](docs/getting-started.md) for the full guide, including project setup, multi-project workflows, and cross-workspace usage.

### Usage examples

```
@systematic-reviewer I want to conduct a systematic review on the effectiveness
of community health worker interventions for maternal mental health in low- and
middle-income countries.
```

```
@project-manager Initialize a new project for my systematic review. Target
completion is September 2026.
```

```
@data-analyst I have extracted data from 23 studies. Help me set up a
random-effects meta-analysis using the metafor package in R.
```
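The last prompt asks for a random-effects meta-analysis with metafor in R. For readers who want to see the arithmetic behind such a model, here is a minimal Python sketch of the DerSimonian-Laird estimator, one common random-effects method. It is not what `@data-analyst` generates and not metafor's implementation (metafor's `rma()` uses REML by default, not DerSimonian-Laird). The function name and inputs are illustrative only.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects method.

    effects:   per-study effect estimates (e.g. standardized mean differences)
    variances: per-study sampling variances (same order as effects)
    Returns (pooled_effect, standard_error, tau_squared).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c)

    # Random-effects weights add tau2 to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


if __name__ == "__main__":
    # Made-up numbers purely to show the call; not data from any study.
    print(random_effects_pool([0.2, 0.5, 0.8], [0.04, 0.09, 0.05]))
```

When the studies agree closely, `tau2` is truncated to zero and the result matches the fixed-effect estimate. Larger heterogeneity widens the standard error and evens out the study weights.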

### Sample project

The repository includes a fully worked sample project at [`sample_projects/chw-maternal-mental-health/`](sample_projects/chw-maternal-mental-health/) — a systematic review of community health worker interventions for maternal mental health in low- and middle-income countries. It demonstrates the end-to-end outputs that RWA generates:

| Output | Path | What it shows |
|---|---|---|
| Review protocol | [`protocol.qmd`](sample_projects/chw-maternal-mental-health/protocol.qmd) | PRISMA-compliant protocol with PICO framework |
| Manuscript (source) | [`manuscript.qmd`](sample_projects/chw-maternal-mental-health/manuscript.qmd) | IMRaD manuscript with citations and AI disclosure |
| Manuscript (HTML) | [`manuscript.html`](sample_projects/chw-maternal-mental-health/manuscript.html) | Rendered HTML version for browser viewing |
| Manuscript (PDF) | [`manuscript.pdf`](sample_projects/chw-maternal-mental-health/manuscript.pdf) | Rendered PDF for print/submission |
| Manuscript (Word) | [`manuscript.docx`](sample_projects/chw-maternal-mental-health/manuscript.docx) | Rendered DOCX for journal submission or collaboration |
| Search results (SQLite) | [`data/search_results.db`](sample_projects/chw-maternal-mental-health/data/) | Structured database of results from PubMed, OpenAlex, CrossRef, Semantic Scholar |
| Search results (Excel) | [`data/search_results.xlsx`](sample_projects/chw-maternal-mental-health/data/) | Filterable Excel workbook with clickable DOI/PMID hyperlinks |
| Reproducible search scripts | [`scripts/`](sample_projects/chw-maternal-mental-health/scripts/) | Thin stub scripts that reproduce each database search |
| Data extraction | [`data-extraction.qmd`](sample_projects/chw-maternal-mental-health/data-extraction.qmd) | Structured data extraction template |
| Risk of bias | [`rob2-assessments.qmd`](sample_projects/chw-maternal-mental-health/rob2-assessments.qmd) | Cochrane RoB 2 assessments |
| Evidence synthesis | [`synthesis.qmd`](sample_projects/chw-maternal-mental-health/synthesis.qmd) | Narrative and quantitative synthesis |
| PRISMA flow | [`review-tracking/`](sample_projects/chw-maternal-mental-health/review-tracking/) | PRISMA flow diagram tracking data |
| Project tracking | [`project-tracking/`](sample_projects/chw-maternal-mental-health/project-tracking/) | Milestones, tasks, and decision log |
| AI contributions log | [`ai-contributions-log.md`](sample_projects/chw-maternal-mental-health/ai-contributions-log.md) | Full audit trail of AI-assisted work |
| References | [`references.bib`](sample_projects/chw-maternal-mental-health/references.bib) | BibTeX bibliography managed via Zotero |

Browse the sample project to see what a completed RWA-assisted review looks like before starting your own.
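One way to explore the sample's SQLite search results without knowing its schema in advance is to ask the database which tables it holds and how many rows each contains. The sketch below does only that. The table layout of `search_results.db` is not documented here, so nothing about it is assumed, and the helper name `table_row_counts` is illustrative.

```python
import sqlite3


def table_row_counts(conn: sqlite3.Connection) -> dict[str, int]:
    """Map each user table name to its row count, without assuming a schema."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        # Quote the identifier; the names come from the database itself.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


if __name__ == "__main__":
    db = "sample_projects/chw-maternal-mental-health/data/search_results.db"
    try:
        # mode=ro opens the file read-only and fails if it does not exist.
        conn = sqlite3.connect(f"file:{db}?mode=ro", uri=True)
        print(table_row_counts(conn))
        conn.close()
    except sqlite3.OperationalError as exc:
        print(f"Could not open {db}: {exc}")
```

Opening the file read-only keeps the worked example intact while you look around.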

## Project Structure

```
research-workflow-assistant/
├── .github/
│   ├── copilot-instructions.md      # ICMJE + research integrity rules
│   └── agents/                      # Custom Copilot agents
├── .vscode/
│   ├── settings.json
│   └── mcp.json                     # MCP server configuration
├── mcp-servers/                     # MCP server implementations (Python)
│   ├── _shared/                     # Shared SQLite result storage module
│   ├── pubmed-server/
│   ├── openalex-server/
│   ├── semantic-scholar-server/
│   ├── europe-pmc-server/
│   ├── crossref-server/
│   ├── zotero-server/
│   ├── zotero-local-server/
│   ├── prisma-tracker/
│   ├── project-tracker/
│   ├── chat-exporter/
│   └── bibliography-manager/
├── templates/                       # Quarto templates
│   ├── systematic-review/
│   ├── manuscript/
│   ├── report/
│   └── project-management/
├── analysis-templates/              # Reusable R/Python analysis templates
├── compliance/                      # ICMJE checklists, reporting standards
├── docs/                            # User documentation
├── sample_projects/                 # Worked example project
├── scripts/                         # Setup validation and chat export scripts
└── tests/
```

## Database Access

| Database | API | Access | Auth |
|---|---|---|---|
| PubMed/MEDLINE | NCBI E-utilities | Free | API key (recommended) |
| OpenAlex | REST API | Free ($1/day budget) | API key (free) |
| Semantic Scholar | Academic Graph API | Free (rate limited) | API key (optional) |
| Europe PMC | REST API | Free | None |
| CrossRef | REST API | Free | Email (polite pool) |
| Zotero | Web API v3 | Free | API key |
| Scopus | Elsevier API | Planned (institutional) | API key |

Databases without APIs (CINAHL, PsycINFO, Web of Science, Google Scholar, Cochrane Library): the agents help you build database-specific queries, but you run the searches manually and import results.
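To make the "Email (polite pool)" entry concrete: CrossRef asks callers to identify themselves, for example with a `mailto` query parameter, instead of using an API key. The sketch below only builds such a request URL with the standard library and sends nothing. It is an illustration of the convention, not the code in `mcp-servers/crossref-server`, and the function name is hypothetical.

```python
from urllib.parse import urlencode

CROSSREF_WORKS = "https://api.crossref.org/works"


def crossref_search_url(query: str, email: str, rows: int = 5) -> str:
    """Build a CrossRef works search URL that identifies the caller by email."""
    params = {"query": query, "rows": rows, "mailto": email}
    return f"{CROSSREF_WORKS}?{urlencode(params)}"


if __name__ == "__main__":
    # Replace the placeholder address with your own before sending a request.
    print(crossref_search_url("community health workers", "you@example.org"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client, and the response is JSON.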

## Reporting Standards

The tool supports multiple systematic review reporting standards (user selects):

- **PRISMA 2020** (systematic reviews, with or without meta-analysis)

- **PRISMA-ScR** (scoping reviews)

- **MOOSE** (meta-analyses of observational studies)

- **Cochrane Handbook** methods

## Contributing

Contributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## License

[MIT License](LICENSE)

## How To Cite

If you use Research Workflow Assistant in a manuscript, report, protocol, or other cited output, cite it as:

van Zyl, A. (2026). Research Workflow Assistant [Computer software]. https://github.com/andre-inter-collab-llc/research-workflow-assistant

BibTeX:

```bibtex
@misc{vanzyl2026rwa,
  author = {{Van Zyl}, Andre},
  title = {Research Workflow Assistant},
  year = {2026},
  url = {https://github.com/andre-inter-collab-llc/research-workflow-assistant},
  note = {GitHub repository}
}
```

You can also copy the canonical entry directly from [templates/rwa-citation.bib](templates/rwa-citation.bib) into your project's `references.bib`.

## Acknowledgments

- [ICMJE](https://www.icmje.org/) for authorship and AI disclosure guidelines

- [PRISMA](http://www.prisma-statement.org/) for systematic review reporting standards

- [MCP](https://modelcontextprotocol.io/) for the Model Context Protocol specification

- [Quarto](https://quarto.org/) for scientific publishing

- [Posit](https://posit.co/) for the R ecosystem

- Built with [GitHub Copilot](https://github.com/features/copilot) using [Claude Opus 4.6](https://docs.github.com/en/copilot/reference/ai-models/model-comparison) by Anthropic and GPT-5.3-Codex