{"id":50276752,"url":"https://github.com/saadmsft/nanoresearch","last_synced_at":"2026-05-27T21:01:54.693Z","repository":{"id":357474125,"uuid":"1237131130","full_name":"saadmsft/nanoresearch","owner":"saadmsft","description":"Tri-level co-evolving multi-agent research automation — a faithful re-implementation of arXiv:2605.10813 with a ChatGPT-style web UI.","archived":false,"fork":false,"pushed_at":"2026-05-12T22:57:17.000Z","size":775,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-13T00:29:46.517Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://saadmsft.github.io/nanoresearch/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/saadmsft.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"docs/security.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-12T22:51:39.000Z","updated_at":"2026-05-12T22:57:33.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/saadmsft/nanoresearch","commit_stats":null,"previous_names":["saadmsft/nanoresearch"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/saadmsft/nanoresearch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saadmsft%2Fnanoresearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saadmsft%2Fnanoresearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/saadmsft%2Fnanoresearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saadmsft%2Fnanoresearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/saadmsft","download_url":"https://codeload.github.com/saadmsft/nanoresearch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/saadmsft%2Fnanoresearch/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33583399,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-27T02:00:06.184Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-27T21:01:53.664Z","updated_at":"2026-05-27T21:01:54.687Z","avatar_url":"https://github.com/saadmsft.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- markdownlint-disable MD033 MD041 --\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n# 🔬 NanoResearch\n\n**A tri-level co-evolving multi-agent research automation system.**\n\n_Re-implementation of [NanoResearch (arXiv:2605.10813)](https://arxiv.org/abs/2605.10813) with a ChatGPT-style web UI and field-agnostic prompts._\n\n[![License: Apache 2.0](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](LICENSE)\n[![Python 
3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/)\n[![FastAPI](https://img.shields.io/badge/FastAPI-0.115-009688?logo=fastapi)](https://fastapi.tiangolo.com/)\n[![React](https://img.shields.io/badge/React-18-61DAFB?logo=react)](https://react.dev/)\n[![Tests](https://img.shields.io/badge/tests-61_passing-brightgreen.svg)](#testing)\n[![Docs](https://img.shields.io/badge/docs-online-blue.svg)](https://saadmsft.github.io/nanoresearch/)\n\n[**📖 Documentation**](https://saadmsft.github.io/nanoresearch/) ·\n[**🏗 Architecture**](https://saadmsft.github.io/nanoresearch/architecture/) ·\n[**🚀 Quickstart**](#quickstart) ·\n[**📄 Original paper**](https://arxiv.org/abs/2605.10813)\n\n\u003c/div\u003e\n\n---\n\n\u003e _\"Automation **for whom**? Researchers operate under different resource configurations, hold different methodological preferences, and target different output formats. A system that produces uniform outputs regardless of these differences will systematically under-serve every individual user.\"_\n\u003e\n\u003e — NanoResearch (Xu et al., 2026)\n\nNanoResearch takes a one-line research idea — in **any scholarly field** — and rides it through ideation, planning, experimentation, analysis, writing, and review to produce a downloadable LaTeX paper, while **learning your preferences** so the next run feels more like _you_.\n\n## ✨ Highlights\n\n- 💬 **Single chat surface.** No buttons. Tell it your field and a topic; it narrates the pipeline back to you and pauses for feedback at every stage.\n- 🌍 **Field-agnostic.** Biology, social sciences, engineering, mathematics, computer science — prompts adapt to the field's conventions (regressions vs. proofs vs. case studies vs. 
ablations).\n- 🧠 **Tri-level co-evolution.** Per-user **Skill Bank** (procedural rules), **Memory Module** (project-specific facts), and a planner adapter trained via **SDPO** from your free-form feedback.\n- 🔬 **Real artefacts.** Generates and **runs** a Python project for empirical fields, parses results, then assembles a section-by-section LaTeX paper that compiles to PDF (when `pdflatex` is installed).\n- 🔐 **Azure AD auth.** Talks to your private GPT-5.1 deployment via `DefaultAzureCredential` — no API keys in `.env`.\n\n\u003cdiv align=\"center\"\u003e\n\n### Pipeline at a glance\n\n```mermaid\nflowchart LR\n  T[Topic] --\u003e O((Orchestrator))\n  O --\u003e|retrieve| SB[(Skill Bank 𝒮)]\n  O --\u003e|retrieve| MM[(Memory Module ℳ)]\n  O --\u003e I[Ideation]:::s1\n  I --\u003e P[Planning]:::s1\n  P --\u003e C[Coding +\u003cbr/\u003eExecution]:::s2\n  C --\u003e A[Analysis]:::s2\n  A --\u003e W[Writing]:::s3\n  W --\u003e R[Review]:::s3\n  R --\u003e Paper[paper.pdf]\n  W -.-\u003e|distil| SB\n  W -.-\u003e|distil| MM\n  I -.-\u003e|narrations| U[Chat]\n  P -.-\u003e U\n  C -.-\u003e U\n  A -.-\u003e U\n  W -.-\u003e U\n  U -.-\u003e|feedback ℱ| O\n  classDef s1 fill:#1e3a8a,stroke:#3b82f6,color:#fff\n  classDef s2 fill:#92400e,stroke:#f59e0b,color:#fff\n  classDef s3 fill:#065f46,stroke:#10b981,color:#fff\n```\n\n\u003c/div\u003e\n\n## 🗂 What's in the box\n\n| Component | Folder | Purpose |\n|---|---|---|\n| 🧠 **Backend (Python)** | [`src/nanoresearch/`](src/nanoresearch/) | Multi-agent pipeline, FastAPI server, Skill/Memory stores, SDPO trainer |\n| 💬 **Frontend (React + Vite)** | [`ui/`](ui/) | Chat-first UI with assistant-ui–style bubbles and live SSE narrations |\n| 📚 **Documentation site** | [`docs/`](docs/) | Jekyll-friendly markdown; deployed to GitHub Pages |\n| 🖼 **Diagrams** | [`docs/assets/diagrams/`](docs/assets/diagrams/) | Mermaid sources + rendered PNGs |\n| 🧪 **Tests** | [`tests/`](tests/) | 61 unit + integration tests, offline-runnable 
|\n\n## 🚀 Quickstart\n\n### Prerequisites\n\n- Python **3.11+** (3.12 tested)\n- Node **18+** (Vite + assistant-ui)\n- An **Azure OpenAI / Foundry** deployment of GPT-5.1 (or a compatible reasoning model)\n- `az login` performed locally; your account needs the **Cognitive Services OpenAI User** role\n- _(optional)_ `pdflatex` or `tectonic` for PDF compilation — otherwise the paper ships as `.tex`\n- _(optional)_ Apple-Silicon Mac with 32 GB+ unified RAM for the local Qwen planner (SDPO)\n\n### Setup\n\n```bash\n# 1. Clone + venv\ngit clone https://github.com/saadmsft/nanoresearch.git\ncd nanoresearch\npython3.12 -m venv .venv\nsource .venv/bin/activate\npip install -e \".[dev]\"\n\n# 2. Configure Azure (AAD auth — no API keys)\ncp .env.example .env\n# edit AZURE_OPENAI_ENDPOINT + AZURE_OPENAI_DEPLOYMENT\naz login\n\n# 3. Backend\nnanoresearch serve            # http://127.0.0.1:8000\n\n# 4. Frontend (separate terminal)\ncd ui \u0026\u0026 npm install \u0026\u0026 npm run dev   # http://localhost:5173\n```\n\nOpen \u003chttp://localhost:5173\u003e and say hi.\n\n### Optional: local SDPO planner\n\n```bash\npip install -e \".[local]\"     # torch, transformers, peft, accelerate\nhuggingface-cli download Qwen/Qwen2.5-7B-Instruct \\\n  --local-dir data/models/Qwen2.5-7B-Instruct\n```\n\n## 💬 What it looks like\n\n```text\nYou ▸ I'm Mia, an ecologist. I prefer field studies, 6-month timeline.\n      Start a run on canopy cover and breeding-bird richness in city parks.\n\nNanoResearch ▸ Nice to meet you, Mia. Starting on canopy cover + bird richness.\nNanoResearch ▸ 🔎 Searching scholarly databases…\nNanoResearch ▸ 📚 Done. 12 papers.\nNanoResearch ▸ 💡 Drafted hypotheses (n=6). 
Checking novelty next.\nNanoResearch ▸ 🎯 Going with: Canopy × heterogeneity interaction predicts richness.\nNanoResearch ▸ ⏸ Paused at ideation — what should I emphasise or change?\n\nYou ▸ Keep the design simple and proceed.\n\nNanoResearch ▸ 📐 Drafting an experiment blueprint…\nNanoResearch ▸ 👀 Running an internal peer review of the blueprint…\nNanoResearch ▸ 🧪 Writing a small experiment project to test the plan…\nNanoResearch ▸ ▶️ Running the experiment…\nNanoResearch ▸ 📈 Run finished (ok=True exit=0 dur=3.2s).\nNanoResearch ▸ 📊 Analysing results…\nNanoResearch ▸ ✍️ Drafting the introduction / method / experiments / … sections.\nNanoResearch ▸ 👓 Reviewing the paper draft.\nNanoResearch ▸ 📄 Paper compiled. [Download PDF] — or the [LaTeX source].\n```\n\n## 🏛 Architecture\n\nNanoResearch is a **stage pipeline** orchestrated around two persistent stores and one trainable planner.\n\n```mermaid\nflowchart TB\n  subgraph User[\"👤 User\"]\n    Chat[\"💬 Chat UI\"]\n  end\n\n  subgraph API[\"⚡ FastAPI\"]\n    Intent[\"/api/intent\u003cbr/\u003eNL → action\"]\n    RunMgr[\"RunManager\u003cbr/\u003ebackground thread\"]\n    Narr[\"Narrator\u003cbr/\u003eevent → English\"]\n    SSE[\"SSE /stream\u003cbr/\u003e+ narration\"]\n    Files[\"paper.pdf\u003cbr/\u003epaper.tex\"]\n  end\n\n  subgraph Pipe[\"🔬 Pipeline (Orchestrator)\"]\n    direction LR\n    I[Ideation] --\u003e P[Planning]\n    P --\u003e C[Coding]\n    C --\u003e An[Analysis]\n    An --\u003e W[Writing]\n  end\n\n  subgraph Stores[\"💾 Per-User Stores\"]\n    Profile[(Profile)]\n    Skill[(Skill Bank 𝒮)]\n    Mem[(Memory ℳ)]\n    LoRA[(LoRA adapter)]\n  end\n\n  subgraph LLMs[\"🤖 Models\"]\n    Azure[Azure OpenAI\u003cbr/\u003eGPT-5.1]\n    Qwen[Qwen2.5-7B\u003cbr/\u003elocal · planner only]\n  end\n\n  Chat \u003c--\u003e|HTTP| Intent\n  Chat \u003c--\u003e|EventSource| SSE\n  Chat --\u003e|download| Files\n  Intent --\u003e RunMgr\n  RunMgr --\u003e Pipe\n  Pipe --\u003e Narr --\u003e SSE\n  Pipe 
\u003c--\u003e|retrieve / distil| Stores\n  Pipe --\u003e|complete| Azure\n  Pipe --\u003e|plan| Qwen\n  Qwen \u003c--\u003e|SDPO LoRA| LoRA\n```\n\n📖 **Full architecture deep-dive:** [docs/architecture.md](docs/architecture.md) · [paper §3 mapping](docs/paper-mapping.md) · [SDPO math](docs/sdpo.md)\n\n## 🧪 Testing\n\n```bash\npytest -m \"not azure and not local_model\"   # 61 offline tests\npytest -m azure                              # AAD smoke\npytest -m local_model                         # Qwen MPS smoke\n```\n\n| Suite | Tests |\n|---|---|\n| Config + manifest + router | 9 |\n| Stores (schemas, retrieval, distill) | 18 |\n| Orchestrator | 8 |\n| Stage I (literature + ideation + planning) | 9 |\n| Stage II + III (sandbox, narrator, TeX, schemas) | 9 |\n| HTTP API | 7 |\n| SDPO (gradient + LoRA) | 3 (opt-in) |\n| Azure / local smoke | 2 (opt-in) |\n\n## 📂 Repository layout\n\n```text\nnanoresearch/\n├── src/nanoresearch/\n│   ├── agents/           # Stage I-III controllers + prompts + artefacts\n│   ├── api/              # FastAPI app, RunManager, intent classifier, narrator\n│   ├── cli/              # `nanoresearch serve`, `health`, `settings`\n│   ├── config/           # pydantic-settings\n│   ├── literature/       # OpenAlex client + evidence extraction\n│   ├── llm/              # Azure (AAD) + local Qwen backends, agent-role router\n│   ├── logging/          # structlog + per-run JSONL manifest\n│   ├── orchestrator/     # Retrieve → Plan → Dispatch → Reflect → Update\n│   ├── planner/          # Qwen wrapper + LoRA + SDPO trainer (Eq. 
14-15)\n│   ├── schemas/          # Profile / Skill / Memory pydantic models\n│   └── stores/           # SkillBank + MemoryStore + Profile (JSON-backed)\n├── ui/                   # React + TypeScript + Tailwind chat\n├── docs/                 # Jekyll site (GitHub Pages)\n├── tests/                # Pytest suite\n├── runs/                 # ← created at runtime (event logs + papers/\u003crun\u003e/paper.tex)\n└── data/users/\u003cid\u003e/      # ← created at runtime (profile, skills, memories, lora)\n```\n\n## 🗺 Roadmap\n\n- [x] Phase 0–4 — bootstrap, stores, planner+SDPO, orchestrator, Stage I (Ideation + Planning)\n- [x] Phase 5 — Stage II (Coding + sandboxed exec + debug loop) + Analysis\n- [x] Phase 6 — Stage III (Writing + Reviewer + LaTeX/PDF)\n- [x] FastAPI + React/Vite UI with live SSE narrations\n- [ ] Phase 7 — Compliance/Novelty/Writing judges (paper §8–10) + 20-topic benchmark harness\n- [ ] Phase 8 — CLI ergonomics (`nanoresearch run`, `nanoresearch eval`)\n- [ ] Docker sandbox upgrade for Stage II\n- [ ] Per-section figure generation + bibliography auto-fill\n\n## 📜 Citation\n\nIf this implementation is useful in your research, please cite the original paper:\n\n```bibtex\n@misc{xu2026nanoresearch,\n  title  = {NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation},\n  author = {Xu, Jinhang and Zhu, Qiyuan and Wu, Yujun and Wang, Zirui and Zhang, Dongxu and others},\n  year   = {2026},\n  eprint = {2605.10813},\n  archivePrefix = {arXiv},\n  primaryClass  = {cs.AI},\n  url    = {https://arxiv.org/abs/2605.10813}\n}\n```\n\n## 📄 License\n\nApache 2.0 — see [LICENSE](LICENSE).\n\nOriginal NanoResearch paper © Xu et al., 2026.  
\nThis implementation is independent and not affiliated with the original authors.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaadmsft%2Fnanoresearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsaadmsft%2Fnanoresearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsaadmsft%2Fnanoresearch/lists"}