{"id":48565415,"url":"https://github.com/JesseRWeigel/toryo","last_synced_at":"2026-04-24T05:00:28.731Z","repository":{"id":345620603,"uuid":"1186705891","full_name":"JesseRWeigel/toryo","owner":"JesseRWeigel","description":"棟梁 The intelligent agent orchestrator — chains AI coding agents with trust-based delegation, quality ratcheting, and self-improving loops","archived":false,"fork":false,"pushed_at":"2026-04-17T18:14:10.000Z","size":580,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-17T20:23:58.307Z","etag":null,"topics":["agent","ai","aider","automation","claude","cli","llm","ollama","orchestrator","typescript"],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/JesseRWeigel.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-19T22:53:03.000Z","updated_at":"2026-04-17T18:14:14.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/JesseRWeigel/toryo","commit_stats":null,"previous_names":["jesserweigel/toryo"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/JesseRWeigel/toryo","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JesseRWeigel%2Ftoryo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JesseRWeigel%2Ftoryo/tags","releases_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JesseRWeigel%2Ftoryo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JesseRWeigel%2Ftoryo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/JesseRWeigel","download_url":"https://codeload.github.com/JesseRWeigel/toryo/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/JesseRWeigel%2Ftoryo/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32209895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-24T03:15:14.334Z","status":"ssl_error","status_checked_at":"2026-04-24T03:15:11.608Z","response_time":64,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","ai","aider","automation","claude","cli","llm","ollama","orchestrator","typescript"],"created_at":"2026-04-08T13:00:22.272Z","updated_at":"2026-04-24T05:00:28.708Z","avatar_url":"https://github.com/JesseRWeigel.png","language":"TypeScript","funding_links":[],"categories":["Autonomous Loop Runners"],"sub_categories":[],"readme":"# 棟梁 
Toryo\n\n[![CI](https://github.com/JesseRWeigel/toryo/actions/workflows/ci.yml/badge.svg)](https://github.com/JesseRWeigel/toryo/actions/workflows/ci.yml)\n[![npm](https://img.shields.io/npm/v/toryo-core)](https://www.npmjs.com/package/toryo-core)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n[![TypeScript](https://img.shields.io/badge/TypeScript-5.7+-blue)](https://www.typescriptlang.org/)\n\n**The intelligent agent orchestrator.** Not just parallel agents — the full self-improving development loop.\n\n\u003e **棟梁 (toryo)** — Japanese for \"master builder\" or \"foreman.\" The toryo is the person who oversees the entire construction crew, assigns specialists to the right tasks, and ensures every piece meets quality standards before it stays in the structure.\n\nToryo chains multiple AI coding agents (Claude Code, Aider, Gemini CLI, Codex, Ollama) with spec-driven workflows, trust-based delegation, quality ratcheting, and a real-time dashboard.\n\n```bash\nnpx @jweigel/toryo init    # scaffold config + task specs\nnpx @jweigel/toryo run     # start orchestration\n```\n\n**Try instantly:** `npx @jweigel/toryo demo` (no AI tools needed)\n\n**[Documentation](docs/)** | **[Getting Started](docs/getting-started.md)** | **[Configuration](docs/configuration.md)** | **[Bus Pattern](docs/bus-pattern.md)** | **[Contributing](CONTRIBUTING.md)**\n\n![Toryo Run — Full cycle with Ralph Loop retry](docs/images/toryo-run.png)\n\n---\n\n## Why Toryo?\n\nRunning AI agents in parallel is easy. Making them **work together intelligently** is hard.\n\n| Problem | How Toryo solves it |\n|---------|-------------------|\n| Agents produce low-quality output | **Quality ratcheting** — only git-commit results that pass QA (score ≥ threshold). Auto-revert everything else. |\n| No way to retry failures | **Ralph Loop** — failed attempts get QA feedback routed back to the agent for a retry before discarding. |\n| Which agent should do what? 
| **Trust-based delegation** — agents earn autonomy through consistent scores. New agents start supervised. |\n| Context gets lost between steps | **Smart truncation** — strips boilerplate, preserves substance, feeds optimal context to each phase. |\n| Agent output disappears | **Auto-extraction** — code blocks and skills are automatically saved to disk as the agent produces them. |\n| No visibility into what's happening | **Real-time dashboard** — live event feed, agent status cards, results table, metrics. |\n| One agent isn't enough | **Pluggable adapters** — mix and match Claude Code, Aider, Gemini CLI, Ollama, or any CLI tool. |\n\n## How It Works\n\nToryo runs **cycles**. Each cycle has 4 phases:\n\n```\n📋 Plan → 🔍 Research → ⚡ Execute → ✅ Review\n```\n\n1. **Plan** — An agent reads the task spec and creates a brief\n2. **Research** — An agent gathers context and information\n3. **Execute** — An agent writes code, tests, or documentation\n4. **Review** — An agent scores the output (1-10) and provides feedback\n\nAfter review:\n- **Score ≥ threshold** → `git commit` (keep)\n- **Score \u003c threshold** → `git revert` → Ralph Loop retry → keep or discard\n- **Infrastructure failure** → log as crash, skip scoring\n\nThis is the **ratcheting pattern** from [Karpathy's autoresearch](https://github.com/karpathy/autoresearch): only forward progress gets committed. Bad results are automatically reverted.\n\n## Quick Start\n\n### 1. Initialize\n\n```bash\nnpx @jweigel/toryo init\n```\n\nCreates:\n- `toryo.config.json` — agent definitions, quality gates, delegation rules\n- `specs/` — task specifications (YAML frontmatter + markdown)\n\n### 2. 
Configure Agents\n\nEdit `toryo.config.json`:\n\n```json\n{\n  \"agents\": {\n    \"researcher\": {\n      \"adapter\": \"claude-code\",\n      \"strengths\": [\"research\", \"analysis\"],\n      \"timeout\": 900\n    },\n    \"coder\": {\n      \"adapter\": \"ollama\",\n      \"model\": \"qwen3.5:27b\",\n      \"strengths\": [\"code\", \"architecture\"],\n      \"timeout\": 900\n    },\n    \"reviewer\": {\n      \"adapter\": \"claude-code\",\n      \"strengths\": [\"review\", \"scoring\"],\n      \"timeout\": 600\n    }\n  }\n}\n```\n\n### 3. Write Task Specs\n\nCreate markdown files in `specs/`:\n\n```markdown\n---\nname: Write Unit Tests\ndifficulty: 0.5\ntags: [testing]\nphases:\n  plan: auto\n  research: auto\n  execute: coder\n  review: reviewer\n---\n\nWrite tests for uncovered modules. Focus on edge cases.\n\n## Acceptance Criteria\n- [ ] Tests cover at least one untested module\n- [ ] All tests pass\n- [ ] Edge cases are covered\n```\n\n### 4. Run\n\n```bash\nnpx @jweigel/toryo run              # run indefinitely\nnpx @jweigel/toryo run -n 10        # run 10 cycles\nnpx @jweigel/toryo run --dry-run    # preview without executing\ntoryo check                # validate config + tools\ntoryo status               # check metrics + agent trust\ntoryo dashboard            # open web dashboard\n```\n\n## Adapters\n\nToryo ships with first-class adapters for 5 tools + a generic adapter for anything else:\n\n| Adapter | Tool | How it works |\n|---------|------|-------------|\n| `claude-code` | [Claude Code](https://claude.ai/code) | `claude --print` (non-interactive) |\n| `aider` | [Aider](https://aider.chat) | `aider --message` |\n| `gemini-cli` | [Gemini CLI](https://github.com/google-gemini/gemini-cli) | `gemini --prompt` |\n| `ollama` | [Ollama](https://ollama.ai) | Direct HTTP API (no CLI needed) |\n| `codex` | [Codex CLI](https://github.com/openai/codex) | `codex --prompt` |\n| `custom` | Any CLI tool | Configurable command + args |\n\nMix and match — use 
Claude Code for research, Ollama for local code generation, and Gemini for review:\n\n```json\n{\n  \"agents\": {\n    \"researcher\": { \"adapter\": \"claude-code\" },\n    \"coder\": { \"adapter\": \"ollama\", \"model\": \"qwen3.5:27b\" },\n    \"reviewer\": { \"adapter\": \"gemini-cli\" }\n  }\n}\n```\n\n## Trust-Based Delegation\n\nAgents start at **supervised** autonomy and earn trust through consistent high scores:\n\n| Level | Trust | Behavior |\n|-------|-------|----------|\n| 🔴 Supervised | \u003c 0.6 | Strict instruction following, precise format |\n| 🟡 Guided | 0.6–0.8 | Follow spec but suggest improvements |\n| 🟢 Autonomous | ≥ 0.8 | Take initiative, be creative, report after |\n\nTrust is calculated from the rolling average of an agent's review scores (the last `delegation.scoreWindow` results, 50 by default). An agent that consistently scores 8+/10 earns autonomous mode. An agent whose trust drops below 0.6 is demoted back to supervised.\n\n```\nTrust = min(avg_score / 10, 1.0)\n```\n\nWhen a task comes in, Toryo matches it to the best agent based on the task's profile (research-heavy? code-heavy? review?) 
and each agent's strengths + current trust level.\n\n## Quality Ratcheting\n\nInspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) pattern:\n\n```\nScore ≥ 6.0 → git commit ✓\nScore \u003c 6.0 → git revert → Ralph Loop retry\n                              ↓\n                      Retry passes → git commit ✓\n                      Retry fails  → discard, move on\n```\n\nEvery result is logged to `results.tsv` (Karpathy format):\n\n```\ntimestamp               cycle  task         agent   score  status   description\n2026-03-19T10:15:00Z    42     write-tests  coder   8.2    keep     QA approved: PASS\n2026-03-19T10:45:00Z    43     refactor     coder   4.1    discard  QA rejected: FAIL\n2026-03-19T11:15:00Z    44     security     coder   7.5    keep     QA approved after retry 1: PASS\n```\n\nConfigure thresholds in `toryo.config.json`:\n\n```json\n{\n  \"ratchet\": {\n    \"threshold\": 6.0,\n    \"maxRetries\": 1,\n    \"gitStrategy\": \"commit-revert\"\n  }\n}\n```\n\n## Dashboard\n\nReal-time web dashboard showing agent status, results, and live events:\n\n```bash\nnpx toryo dashboard\n# Opens http://localhost:3456\n```\n\n![Toryo Dashboard](docs/images/dashboard.png)\n\nFeatures:\n- Agent status cards with trust scores and autonomy levels\n- Results table with sortable columns and color-coded status\n- Live event feed via WebSocket\n- Metrics summary (cycles, success rate, avg scores)\n\n## Notifications\n\nGet notified on breakthroughs, failures, and periodic status:\n\n```json\n{\n  \"notifications\": {\n    \"provider\": \"ntfy\",\n    \"target\": \"my-project-toryo\",\n    \"events\": [\"breakthrough\", \"failure\", \"status\"]\n  }\n}\n```\n\nSupported providers: `ntfy`, `slack`, `discord`, `webhook`\n\n## Architecture\n\nToryo is a composable TypeScript monorepo. 
Use the full orchestrator or individual pieces:\n\n```\n@toryo/core         — Engine: orchestrator, delegation, ratchet, metrics, extraction\n@toryo/adapters     — Agent adapters: claude-code, aider, gemini-cli, ollama, codex, custom\ntoryo               — CLI: init, run, check, status, dashboard, demo\n```\n\nEach subsystem is exposed as a standalone factory function:\n\n```typescript\nimport { createDelegation, createRatchet, createMetrics } from '@toryo/core';\n\n// `agentState` and `review` below are supplied by the caller\n\n// Use just the delegation system\nconst delegation = createDelegation({ initialTrust: 0.5 });\nconst level = delegation.getAutonomyLevel(agentState);\n\n// Use just the ratchet for git-based quality gates\nconst ratchet = createRatchet({ threshold: 7.0 }, process.cwd());\nif (!ratchet.shouldKeep(review)) await ratchet.revert();\n\n// Use just the metrics for experiment tracking\nconst metrics = createMetrics('.toryo');\nawait metrics.appendResult({ cycle: 1, score: 8.5, status: 'keep', /* remaining fields omitted */ });\n```\n\n## Compared to Other Tools\n\nMost multi-agent tools do **one thing** — run agents in parallel (Composio, AMUX) or define specs (Spec Kit). 
Toryo is the **full loop**: spec → delegate → execute → review → ratchet → improve.\n\n| Feature | Toryo | Composio | AMUX | CrewAI | Spec Kit |\n|---------|-------|----------|------|--------|----------|\n| Multi-agent orchestration | ✅ | ✅ | ✅ | ✅ | ❌ |\n| Heterogeneous CLIs | ✅ 5+ adapters | ✅ 8 slots | ❌ Claude only | ❌ API only | ❌ |\n| Spec-driven workflows | ✅ | ❌ | ❌ | ❌ | ✅ |\n| Trust-based delegation | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Quality ratcheting | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Ralph Loop retries | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Auto-extraction | ✅ | ❌ | ❌ | ❌ | ❌ |\n| results.tsv tracking | ✅ | ❌ | ❌ | ❌ | ❌ |\n| Local model first | ✅ Ollama native | ❌ | ❌ | ❌ | ❌ |\n| Real-time dashboard | ✅ | ✅ | ✅ | ❌ | ❌ |\n\n## Configuration Reference\n\nSee [examples/toryo.config.json](examples/toryo.config.json) for a complete example.\n\n| Field | Type | Default | Description |\n|-------|------|---------|-------------|\n| `name` | string | — | Project name |\n| `agents` | Record | — | Agent definitions (adapter, model, strengths, timeout) |\n| `tasks` | string \\| TaskSpec[] | — | Path to specs dir or inline tasks |\n| `ratchet.threshold` | number | 6.0 | Minimum QA score to keep |\n| `ratchet.maxRetries` | number | 1 | Ralph Loop max retries |\n| `ratchet.gitStrategy` | string | \"commit-revert\" | \"commit-revert\", \"branch-per-task\", or \"none\" |\n| `delegation.initialTrust` | number | 0.5 | Starting trust for new agents |\n| `delegation.scoreWindow` | number | 50 | Rolling window for score averaging |\n| `outputDir` | string | \".toryo\" | Where to store results, metrics, artifacts |\n| `notifications.provider` | string | \"none\" | \"ntfy\", \"slack\", \"discord\", \"webhook\", \"none\" |\n\n## License\n\nMIT\n\n## Credits\n\nBuilt on patterns from:\n- [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) — ratcheting, results.tsv, NEVER STOP\n- [Ralph Loop](https://github.com/vercel-labs/ralph-loop-agent) — verify-then-retry pattern\n- [Intelligent AI 
Delegation](https://arxiv.org/abs/2602.11865) — trust scoring, capability matching\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJesseRWeigel%2Ftoryo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FJesseRWeigel%2Ftoryo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FJesseRWeigel%2Ftoryo/lists"}