https://github.com/sipyourdrink-ltd/bernstein

Deterministic orchestrator for 30+ CLI AI coding agents. Git worktree isolation, HMAC audit trail, MCP server mode.
a2a agent-orchestration agentic-ai agentic-engineering agentic-workflow ai-agents ai-orchestration ai-orchestrator autonomous-agents claude-code coding-agents developer-tools llm-orchestration mcp-server model-context-protocol multi-agent multi-agent-systems open-plugins self-evolving vibe-coding

Host: GitHub
URL: https://github.com/sipyourdrink-ltd/bernstein
Owner: sipyourdrink-ltd
License: apache-2.0
Created: 2026-03-22T14:52:26.000Z (4 months ago)
Default Branch: main
Last Pushed: 2026-04-25T19:31:35.000Z (3 months ago)
Last Synced: 2026-04-25T20:24:04.159Z (3 months ago)
Topics: a2a, agent-orchestration, agentic-ai, agentic-engineering, agentic-workflow, ai-agents, ai-orchestration, ai-orchestrator, autonomous-agents, claude-code, coding-agents, developer-tools, llm-orchestration, mcp-server, model-context-protocol, multi-agent, multi-agent-systems, open-plugins, self-evolving, vibe-coding
Language: Python
Homepage: https://bernstein.run
Size: 33.1 MB
Stars: 192
Watchers: 5
Forks: 24
Open Issues: 20
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md
- Agents: AGENTS.md

Awesome Lists containing this project

awesome-ai-agents-2026 - Bernstein - 🆕 A Python orchestrator that brings 40+ CLI coding agents (Claude Code, Codex, Gemini CLI, Cursor, Aider, and others) together under one tool. The LLM is used only once, for up-front planning; scheduling, git worktree isolation, quality gates, and the HMAC-chained audit are deterministic. Apache 2.0. ![GitHub stars](https://img.shields.io/badge/dynamic/json?label=Stars&query=%24.stargazers_count&url=https%3A%2F%2Fapi.github.com%2Frepos%2Fsipyourdrink-ltd%2Fbernstein&color=yellow&logo=github&logoColor=white&style=flat&cacheSeconds=300) (🏗️ Agent Frameworks / Other Standards)
awesome-ai-tools - Bernstein - agent orchestrator for CLI coding agents (Claude Code, Codex CLI, Gemini CLI, and 34 more). Deterministic Python scheduler, first-class MCP server, file-based state, quality gates, cost tracking. Apache-2.0. (Other AI Agents / Multi-Agent / Orchestration Frameworks)
awesome-mcp-zh - Bernstein
fucking-awesome-python - bernstein - A deterministic Python orchestrator for CLI coding agents (Claude Code, Codex, Gemini CLI, and 40+ more) with parallel git worktrees and an HMAC-signed audit chain. (Projects / AI and Agents)
awesome-python - bernstein - A deterministic Python orchestrator for CLI coding agents (Claude Code, Codex, Gemini CLI, and 40+ more) with parallel git worktrees and an HMAC-signed audit chain. (Projects / AI and Agents)
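Awesome-independent-tools - Bernstein - (Apache 2.0 open source / self-hosted) Multi-agent orchestrator that coordinates 37 CLI coding agents, including Claude Code, Codex CLI, Gemini CLI, OpenHands, Cursor, and Aider, working in parallel git worktrees. Deterministic Python scheduler (zero LLM tokens spent on orchestration), file-based state, MCP server, quality gates, cost tracking. (Tool List / AI Resources)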
awesome-codex-workflows - sipyourdrink-ltd/bernstein - Deterministic Python orchestrator for many CLI coding agents that isolates runs in Git worktrees, enforces janitor verification and quality gates before merge, and includes a Codex CLI adapter. (Cross-Agent References)
awesome-claude-code-toolkit - Bernstein - driven, deterministic. Apache-2.0 | (Companion Apps & GUIs / GateGuard — Fact-Forcing PreToolUse Gate)
awesome-llmops - Bernstein - class MCP server, quality gates, cost tracking with budgets. | ![GitHub Badge](https://img.shields.io/github/stars/sipyourdrink-ltd/bernstein.svg?style=flat-square) | (Code AI / Vector search)
awesome-agents - Bernstein - Python orchestrator that drives 40+ CLI coding agents (Claude Code, Codex, Gemini CLI, Cursor, Aider) in parallel git worktrees with deterministic scheduling, quality gates, and an HMAC-chained audit log. (Frameworks)
awesome-devops-mcp-servers - sipyourdrink-ltd/bernstein - Multi-agent CLI coding orchestrator with worktree isolation, model routing, quality gates, and audit logging. (CI/CD & DevOps Pipelines / 🤖 Coding Agents)
awesome-ChatGPT-repositories - bernstein - Deterministic orchestrator for 30+ CLI AI coding agents. Git worktree isolation, HMAC audit trail, MCP server mode. (CLIs)
awesome-harness-engineering - bernstein - signed audit chain, signed agent cards, and per-artefact lineage. The zero-LLM coordination loop and tamper-evident audit trail make it the only open-source orchestrator designed for compliance-sensitive agent fleets. ![Stars](https://img.shields.io/github/stars/sipyourdrink-ltd/bernstein?style=flat-square&label=★&color=yellow) (Design Primitives / Task Runners & Orchestration)
awesome-mcp-servers - **sipyourdrink-ltd/bernstein** - Multi-agent orchestrator with first-class MCP server; coordinates 37 CLI coding agents (Claude Code, Codex, Gemini CLI, OpenHands, Aider, and 32 more) through MCP tool calls. Apache 2.0. `http` `ai` `git` `github` `logging` (🤖 AI/ML)
awesome-agents - Bernstein - chained audit log of every step. Apache-2.0. ![GitHub Repo stars](https://img.shields.io/github/stars/sipyourdrink-ltd/bernstein?style=social) (Software Development)

README

> *"To achieve great things, two things are needed: a plan and not quite enough time."* — Leonard Bernstein

### orchestrate any AI coding agent. any model. one command.

[![CI](https://github.com/sipyourdrink-ltd/bernstein/actions/workflows/ci.yml/badge.svg)](https://github.com/sipyourdrink-ltd/bernstein/actions/workflows/ci.yml)

[![PyPI](https://img.shields.io/pypi/v/bernstein)](https://pypi.org/project/bernstein/)

[![Python 3.12+](https://img.shields.io/badge/python-3.12+-3776ab?logo=python&logoColor=white)](https://python.org)

[![License](https://img.shields.io/github/license/sipyourdrink-ltd/bernstein)](LICENSE)

[![MseeP.ai](https://img.shields.io/badge/MseeP.ai-verified-2496ed)](https://mseep.ai/app/chernistry-bernstein)

[![CodeTrendy](https://img.shields.io/badge/CodeTrendy-listed-FBBF24)](https://codetrendy.com/listing/bernstein)

[website](https://bernstein.run) · [docs](https://bernstein.readthedocs.io/) · [install](docs/getting-started/install.md) · [first run](docs/getting-started/first-run.md) · [enterprise eval](docs/ENTERPRISE.md) · [glossary](docs/reference/GLOSSARY.md) · [limitations](docs/reference/KNOWN_LIMITATIONS.md) · [sponsor](https://github.com/sponsors/chernistry)



---

Bernstein is a deterministic Python scheduler that runs a crew of CLI coding agents (Claude Code, Codex, Gemini CLI, and 40 more) against a single goal in parallel git worktrees, with an HMAC-signed audit chain over every step.

### why this exists

i wrote bernstein because i was paying $400/month in claude bills running three coding agents in parallel and getting nondeterministic merges.

as of 2026-05-08: 296 stars, 35 forks, ~3,769 pypi downloads/day (mostly bots; ~54k/month), apache 2.0, solo maintained, no funding. numbers will drift; the file is the source-of-truth date.

### install in 30 seconds

```bash
pipx install bernstein
bernstein init
bernstein run -g "fix the failing test in tests/test_foo.py"
```

## sponsor

if bernstein routed a model that saved you a claude bill, $25 covers a month of my coffee.

[github.com/sponsors/chernistry →](https://github.com/sponsors/chernistry)

tier ladder, escalation thresholds, and what each tier gets you live at [bernstein.run/sponsors](https://bernstein.run/sponsors).

## who this is for

specific shapes where the value lands:

- engineering teams running ≥3 cli coding agents in parallel — each agent gets its own git worktree, the merge queue serialises landings, no race conditions

- regulated or on-prem environments — every routing decision is in plain text, the audit log is hmac-signed and tamper-evident, no saas hop, no third-party data plane

- platform teams that need an audit log of agent decisions — the orchestrator writes one row per scheduling decision, you can grep it

- anyone burning more than $1k/mo on cursor/aider/claude-max who wants determinism — you can replay yesterday's plan and get yesterday's task graph

- forward-deployed engineers dropping into a client repo — credentials stay in your env, not the client's; agents you spawn are whichever cli tool the client already trusts

if you nodded at two of those bullets, this fits.

## who this is NOT for

equally specific. these are the cases where you should pick something else:

- "i want one pair-programmer to chat with about my code" — claude code or cursor alone. bernstein adds orchestration overhead you don't need

- prototypes where merge gates are overkill — the lint/types/tests/cross-model-review pipeline is value when the cost of a bad merge is real, friction when you're throwing the repo away on friday

- non-coding tasks (research, writing, data analysis pipelines) — bernstein wraps cli coding agents specifically, not generic llm workflows. crewai or autogen are the right shape there

- anyone who wants a saas wrapper with a credit card form — bernstein is on-prem only by design. if you want managed, this is the wrong project, not the wrong fit

- teams that need a vendor with a support sla and a contract — solo open-source project. github issues are how support happens

- research-shape "let the agents collaborate emergently" use cases — the deterministic scheduler is a hard wall there

### how it compares

| Feature                                  | Bernstein   | Archon   | LangGraph |
|------------------------------------------|-------------|----------|-----------|
| Deterministic scheduler (no LLM in loop) | yes         | no       | no        |
| Multi-agent crew (parallel adapters)     | yes         | one      | yes       |
| Signed lineage / audit chain             | yes         | no       | no        |
| Air-gap / sovereign deploy               | yes         | partial  | no        |
| Visual workflow YAML                     | yes [^yaml] | yes      | no        |
| Hosted dashboard / SaaS                  | no          | partial  | no        |

[^yaml]: Workflow YAML support lands with [PR #1108](https://github.com/sipyourdrink-ltd/bernstein/pull/1108) (in this batch). Until then, plans are authored as Python or via `bernstein run plan.yaml` against the legacy schema.

A longer feature matrix against CrewAI, AutoGen, LangGraph, and the four CLI-agent orchestrators that share Bernstein's category lives in the [Detailed comparison](#detailed-comparison) section below.

---

### what is this, in one paragraph

You tell Bernstein what you want built. It splits the work across several AI coding agents, runs them in parallel inside isolated git worktrees, records every handoff in an HMAC-chained audit log, runs the tests, and merges the code that actually passes. You come back to a green PR.

Forward-deployed engineering, on a swarm. Drop Bernstein into a client repo and you get a multi-agent crew with file-based state, per-agent credential scoping, and a signed audit trail running on whichever CLI agents the client already trusts.

### other install methods

```bash
curl -fsSL https://bernstein.run/install.sh | sh        # macOS / Linux one-liner
irm https://bernstein.run/install.ps1 | iex             # Windows PowerShell
pip install bernstein                                   # pip
uv tool install bernstein                               # uv
brew tap chernistry/tap && brew install bernstein       # Homebrew
```

See the full [install matrix](#install) for `dnf copr`, `npx`, optional extras, and the wheelhouse path for air-gapped sites.

### why the scheduler is plain Python

Most agent orchestrators use an LLM to decide who does what. That is non-deterministic and burns tokens on scheduling instead of code. Bernstein does one LLM call to break down your goal, then the rest (running agents in parallel, isolating their git branches, running tests, routing retries) is plain Python. Every run is reproducible. Every step is logged and replayable.

No framework to learn. No vendor lock-in. Swap any agent, any model, any provider.
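To make "plain Python" concrete, here is a minimal sketch of deterministic scheduling over a task graph. It is not Bernstein's actual scheduler: the task names and the `plan_batches` helper are hypothetical. It only illustrates the property claimed above, that the same plan yields the same batches on every run, because ready tasks are ordered by name instead of by an LLM.

```python
from graphlib import TopologicalSorter


def plan_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into parallel batches; identical input gives identical output.

    `deps` maps a task name to the names of the tasks it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan has a cycle
    batches: list[list[str]] = []
    while sorter.is_active():
        # get_ready() promises no particular order, so sort for a stable tie-break.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical task graph for a goal like "Add JWT auth".
plan = {
    "middleware": set(),
    "token-model": set(),
    "tests": {"middleware", "token-model"},
    "docs": {"tests"},
}
print(plan_batches(plan))
```

Each inner list is a batch whose tasks could run in parallel; the next batch starts only after the previous one is done.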



What you see while it runs:

```
$ bernstein -g "Add JWT auth"
[manager] decomposed into 4 tasks
[agent-1] claude-sonnet: src/auth/middleware.py  (done, 2m 14s)
[agent-2] codex:         tests/test_auth.py      (done, 1m 58s)
[verify]  all gates pass. merging to main.
```

### YAML workflow manifests (optional)

When the open-ended `bernstein run -g ""` is too coarse-grained, the `bernstein workflow` family runs a declarative DAG of agent / command / loop nodes. Manifests are plain YAML, validated up-front, and dispatched through the same `AgentSpawner` the rest of Bernstein uses. No parallel spawn path, no LLM in the scheduler.

```bash
bernstein workflow list                       # bundled + user-installed
bernstein workflow run idea-to-pr -g "Add JWT auth"
bernstein workflow init my-flow               # scaffold a starter manifest
bernstein workflow validate path/to/flow.yaml
```

Stock workflows that ship with the wheel:

| Name                  | What it does                                         |
| --------------------- | ---------------------------------------------------- |
| `idea-to-pr`          | research → plan → implement → tests → PR             |
| `refactor-with-tests` | find target → propose → implement → loop until green |
| `security-review`     | scan → triage → patch → adversary review             |
| `doc-update`          | audit → update → docs build                          |
| `dependency-bump`     | bump → install → tests-loop → smoke                  |
| `hot-fix`             | reproduce → fix → regression loop → changelog        |

Loop nodes re-fire until a bash predicate exits 0 (`pytest -x` is a typical one). `fresh_context: true` mints a new agent session per iteration. The `interactive: true` flag is reserved for the approval-gate work tracked in ticket #1110 and currently raises a clear `NotImplementedError`.
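The loop-node behaviour can be pictured with a few lines of Python. This is a sketch under assumptions, not the workflow engine's code: `run_until_green` is a hypothetical name, a shell command stands in for the agent step, and the `max_iterations` cap is added here so the example cannot spin forever.

```python
import subprocess


def run_until_green(step_cmd: str, predicate_cmd: str, max_iterations: int = 5) -> int:
    """Run `step_cmd`, then `predicate_cmd`; repeat until the predicate exits 0.

    Returns the number of iterations used. Raises RuntimeError if the
    predicate never passes within `max_iterations`.
    """
    for iteration in range(1, max_iterations + 1):
        # Stand-in for "spawn an agent for this iteration".
        subprocess.run(step_cmd, shell=True, check=False)
        # The predicate decides whether the loop is finished.
        result = subprocess.run(predicate_cmd, shell=True, check=False)
        if result.returncode == 0:
            return iteration
    raise RuntimeError(f"predicate still failing after {max_iterations} iterations")
```

With `predicate_cmd="pytest -x"`, the loop ends on the first iteration whose test run exits 0.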

## use cases

- forward-deployed engineering — drop the swarm onto a client repo when you arrive, take it with you when you leave.

- self-evolving projects — point Bernstein at its own repo and let it execute the backlog (this codebase is one).

- CI fleets — run a swarm of agents in parallel on PRs, with per-agent credential scoping and signed audit trail.

- air-gapped / regulated deployment — install from a signed wheelhouse, run with `--profile airgap` to deny outbound by default, allow-list specific destinations as needed. See [Air-gap installation](docs/installation/air-gap.md).

## supported agents

Bernstein auto-discovers installed CLI agents. Mix them in the same run. Cheap local models for boilerplate, heavier cloud models for architecture.

42 CLI agent adapters: 39 third-party wrappers, 2 leaf-node delegators (Composio, Ralphex), plus a generic wrapper for anything with `--prompt`.

| Agent | Models | Install |
|-------|--------|---------|
| [Claude Code](https://docs.anthropic.com/en/docs/claude-code) | Opus 4, Sonnet 4.6, Haiku 4.5 | `npm install -g @anthropic-ai/claude-code` |
| [Codex CLI](https://github.com/openai/codex) | GPT-5, GPT-5 mini | `npm install -g @openai/codex` |
| [OpenAI Agents SDK v2](https://openai.github.io/openai-agents-python/) | GPT-5, GPT-5 mini, o4 | `pip install 'bernstein[openai]'` |
| [GitHub Copilot CLI](https://docs.github.com/en/copilot/github-copilot-in-the-cli) | Copilot-managed (GPT-5, Sonnet 4.6) | `npm install -g @github/copilot` |
| [Gemini CLI](https://github.com/google-gemini/gemini-cli) | Gemini 2.5 Pro, Gemini Flash | `npm install -g @google/gemini-cli` |
| [Cursor](https://www.cursor.com) | Sonnet 4.6, Opus 4, GPT-5 | [Cursor app](https://www.cursor.com) |
| [Devin Terminal](https://devin.ai) (Cognition) | Devin-managed | `curl -fsSL https://cli.devin.ai/install.sh \| bash` then `devin auth login` |
| [Aider](https://aider.chat) | Any OpenAI/Anthropic-compatible | `pip install aider-chat` |
| [Amp](https://ampcode.com) | Amp-managed | `npm install -g @sourcegraph/amp` |
| [CLM gateway](docs/adapters/clm.md) (sovereign / on-prem LLM) | Any OpenAI-compatible CLM endpoint | `pip install aider-chat`, then set `CLM_ENDPOINT` / `CLM_TOKEN` |
| [Cody](https://sourcegraph.com/cody) | Sourcegraph-hosted | `npm install -g @sourcegraph/cody` |
| [Continue](https://continue.dev) | Any OpenAI/Anthropic-compatible | `npm install -g @continuedev/cli` (binary: `cn`) |
| [Goose](https://block.github.io/goose/) | Any provider Goose supports | See [Goose docs](https://block.github.io/goose/) |
| [IaC](https://www.terraform.io/) (Terraform/Pulumi) | Any provider the base agent uses | Built-in |
| [Junie](https://junie.jetbrains.com) | BYOK (Anthropic, OpenAI, Google, xAI, OpenRouter, Copilot) | `curl -fsSL https://junie.jetbrains.com/install.sh \| bash` |
| [Kilo](https://kilo.dev) | Kilo-hosted | See [Kilo docs](https://kilo.dev) |
| [Kiro](https://kiro.dev) | Kiro-hosted | See [Kiro docs](https://kiro.dev) |
| [AWS Q Developer](https://docs.aws.amazon.com/amazonq/latest/qdeveloper-ug/command-line.html) | Amazon Q-managed (Claude-backed) | `brew install --cask amazon-q` then `q login` |
| [Ollama](https://ollama.ai) + Aider | Local models (offline) | `brew install ollama` |
| [OpenCode](https://opencode.ai) | Any provider OpenCode supports | See [OpenCode docs](https://opencode.ai) |
| [Qwen](https://github.com/QwenLM/qwen-code) | Qwen Code models | `npm install -g @qwen-code/qwen-code` |
| [Cloudflare Agents](https://developers.cloudflare.com/agents/) | Workers AI models | `bernstein cloud login` |
| [OpenHands](https://github.com/OpenHands/OpenHands) | Any LiteLLM-supported (Anthropic, OpenAI, ...) | `uv tool install openhands --python 3.12` |
| [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) | Any (LiteLLM-backed) | `pip install open-interpreter` |
| [gptme](https://github.com/gptme/gptme) | Anthropic, OpenAI, OpenRouter | `pipx install gptme` |
| [Plandex](https://github.com/plandex-ai/plandex) | Plandex Cloud or self-hosted models | `curl -sL https://plandex.ai/install.sh \| bash` |
| [AIChat](https://github.com/sigoden/aichat) | OpenAI, Anthropic, OpenRouter, Groq, Gemini | `cargo install aichat` |
| [Letta Code](https://github.com/letta-ai/letta-code) | Letta-routed (Anthropic, OpenAI) | `npm install -g @letta-ai/letta-code` |
| **Generic** | Any CLI with `--prompt` | Built-in |

#### orchestrator delegation (leaf-node)

A separate, smaller class of adapters that wrap **other CLI orchestrators** as if they were single agents. Bernstein hands the wrapped tool a prompt or plan and only sees the final exit code; sub-agent costs and quality gates inside the wrapped orchestrator are not visible to Bernstein. Useful when you want to drop an existing workflow built on one of these tools into a step of a larger Bernstein plan.

| Orchestrator | Wrapped as | Install |
|--------------|------------|---------|
| [Composio Agent Orchestrator](https://github.com/ComposioHQ/agent-orchestrator) (`@aoagents/ao`) | `composio` | `npm install -g @aoagents/ao` |
| [umputun/ralphex](https://github.com/umputun/ralphex) | `ralphex` | `go install github.com/umputun/ralphex/cmd/ralphex@latest` |

Any adapter also works as the **internal scheduler LLM**, so you can run the entire stack without being tied to any specific provider:

```yaml
internal_llm_provider: gemini            # or qwen, ollama, codex, goose, ...
internal_llm_model: gemini-3.1-pro
```

> [!TIP]

> Run `bernstein --headless` for CI pipelines. No TUI, structured JSON output, non-zero exit on failure.

## quick start

```bash
cd your-project
bernstein init                    # creates .sdd/ workspace + bernstein.yaml
bernstein -g "Add rate limiting"  # agents spawn, work in parallel, verify, exit
bernstein live                    # watch progress in the TUI dashboard
bernstein stop                    # graceful shutdown with drain
```

For multi-stage projects, define a YAML plan:

```bash
bernstein run plan.yaml           # skips LLM planning, goes straight to execution
bernstein run --dry-run plan.yaml # preview tasks and estimated cost
```

## how it works

1. **Decompose**. The manager breaks your goal into tasks with roles, owned files, and completion signals.

2. **Spawn**. Agents start in isolated git worktrees, one per task. Main branch stays clean.

3. **Verify**. The janitor checks concrete signals: tests pass, files exist, lint clean, types correct.

4. **Merge**. Verified work lands in main. Failed tasks get retried or routed to a different model.

The orchestrator is a Python scheduler, not an LLM. Scheduling decisions are deterministic, auditable, and reproducible.
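Steps 2 and 3 can be sketched in a few lines. The branch naming, the `.worktrees/` layout, and both helper names are assumptions made for illustration, not Bernstein's internals; `git worktree add -b` itself is standard git.

```python
import subprocess
from pathlib import Path


def worktree_command(repo: Path, task_id: str, base: str = "main") -> list[str]:
    """Build the `git worktree add` call that gives one task its own checkout."""
    branch = f"agent/{task_id}"            # hypothetical branch naming
    path = repo / ".worktrees" / task_id   # hypothetical directory layout
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def gates_pass(worktree: Path, gate_cmds: list[list[str]]) -> bool:
    """Run each gate command inside the worktree; every one must exit 0."""
    for cmd in gate_cmds:
        if subprocess.run(cmd, cwd=worktree, check=False).returncode != 0:
            return False  # stop at the first failing gate
    return True
```

Running `subprocess.run(worktree_command(repo, "task-1"), check=True)` creates the checkout. A merge step would proceed only when `gates_pass` returns `True` for that worktree, and a failure would send the task back for a retry.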

## cloud execution (Cloudflare)

Bernstein can run agents on Cloudflare Workers instead of locally. The `bernstein cloud` CLI handles deployment and lifecycle.

- **Workers**. Agent execution on Cloudflare's edge, with Durable Workflows for multi-step tasks and automatic retry.

- **V8 sandbox isolation**. Each agent runs in its own isolate, no container overhead.

- **R2 workspace sync**. Local worktree state syncs to R2 object storage so cloud agents see the same files.

- **Workers AI** (experimental). Use Cloudflare-hosted models as the LLM provider, no external API keys required.

- **D1 analytics**. Task metrics and cost data stored in D1 for querying.

- **Browser rendering**. Headless Chrome on Workers for agents that need to inspect web output.

- **MCP remote transport**. Expose or consume MCP servers over Cloudflare's network.

```bash
bernstein cloud login      # authenticate with Bernstein Cloud
bernstein cloud deploy     # push agent workers
bernstein cloud run plan.yaml  # execute a plan on Cloudflare
```

A `bernstein cloud init` scaffold for `wrangler.toml` and bindings is planned.

## capabilities

**Core orchestration**. Parallel execution, git worktree isolation, janitor verification, quality gates (lint, types, PII scan), cross-model code review, circuit breaker for misbehaving agents, token growth monitoring with auto-intervention.

**Intelligence**. Contextual bandit router for model/effort selection. Knowledge graph for codebase impact analysis. Semantic caching saves tokens on repeated patterns. Cost anomaly detection (burn-rate alerts). Behavior anomaly detection with Z-score flagging.
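For readers unfamiliar with Z-score flagging, a minimal sketch follows. The metric (for example, tokens used per step), the threshold of 3.0, and the function name are assumptions for illustration; the source does not say which signals Bernstein scores or where it draws the line.

```python
from statistics import mean, pstdev


def z_score_flags(samples: list[float], threshold: float = 3.0) -> list[int]:
    """Return indexes of samples more than `threshold` standard deviations from the mean."""
    if len(samples) < 2:
        return []
    mu, sigma = mean(samples), pstdev(samples)
    if sigma == 0:
        return []  # all samples identical, nothing stands out
    return [i for i, x in enumerate(samples) if abs(x - mu) / sigma > threshold]
```

A single spike in an otherwise flat series is flagged; a series with no variance flags nothing.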

**Sandboxing**. Pluggable [`SandboxBackend`](docs/architecture/sandbox.md) protocol; run agents in local git worktrees (default), Docker containers, [E2B](https://e2b.dev) Firecracker microVMs, or [Modal](https://modal.com) serverless containers (with optional GPU). Plugin authors can register custom backends through the `bernstein.sandbox_backends` entry-point group. Inspect installed backends with `bernstein agents sandbox-backends`.

**Artifact storage**. `.sdd/` state can stream to pluggable [`ArtifactSink`](docs/architecture/storage.md) backends: local filesystem (default), S3, Google Cloud Storage, Azure Blob, or Cloudflare R2. `BufferedSink` keeps the WAL crash-safety contract by writing locally with fsync first and mirroring to the remote asynchronously.
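The "local fsync first, remote mirror later" ordering can be sketched as follows. This is an illustration of the idea only: the class name, the `upload` callable, and the use of threads are assumptions, and the real `BufferedSink` may differ in every detail.

```python
import os
import threading
from pathlib import Path
from typing import Callable


class BufferedSinkSketch:
    """Write locally with fsync first, then mirror to a remote in the background."""

    def __init__(self, local_dir: Path, upload: Callable[[str, bytes], None]) -> None:
        self.local_dir = local_dir
        self.upload = upload  # hypothetical remote writer, e.g. an object-store put
        self._threads: list[threading.Thread] = []

    def write(self, name: str, data: bytes) -> Path:
        path = self.local_dir / name
        with open(path, "wb") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # durable on local disk before we return
        # Mirror asynchronously so a slow remote never blocks the caller.
        worker = threading.Thread(target=self.upload, args=(name, data), daemon=True)
        worker.start()
        self._threads.append(worker)
        return path

    def flush(self) -> None:
        """Block until every pending mirror upload has finished."""
        for worker in self._threads:
            worker.join()
        self._threads.clear()
```

The design point is the ordering: if the process dies after `write` returns, the local copy is already durable, and only the remote mirror may be missing.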

**Skill packs**. Progressive-disclosure [skills](docs/architecture/skills.md) (OpenAI Agents SDK pattern): only a compact skill index ships in every spawn's system prompt, agents pull full bodies via the `load_skill` MCP tool on demand. 17 built-in role packs plus third-party `bernstein.skill_sources` entry-points.

**Controls**. HMAC-chained audit logs, policy engine, PII output gating, WAL-backed crash recovery (experimental multi-worker safety), OAuth 2.0 PKCE. SSO/SAML/OIDC support is in progress.
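What "HMAC-chained" buys you is easiest to see in code. The sketch below shows the general technique with the standard library. The entry layout, the genesis value, and the function names are assumptions, not Bernstein's on-disk format.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous MAC" for the first entry


def _mac(key: bytes, prev: str, event: dict) -> str:
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, f"{prev}|{payload}".encode(), hashlib.sha256).hexdigest()


def append_entry(log: list[dict], key: bytes, event: dict) -> dict:
    """Append `event`, signing it together with the previous entry's MAC."""
    prev = log[-1]["mac"] if log else GENESIS
    entry = {"prev": prev, "event": event, "mac": _mac(key, prev, event)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict], key: bytes) -> bool:
    """Recompute every MAC in order and compare against the stored values."""
    prev = GENESIS
    for entry in log:
        expected = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True
```

Because each MAC covers the previous one, editing, reordering, or removing an entry in the middle invalidates everything after it. Truncating the tail is not detected by this sketch; catching that needs the latest MAC stored somewhere else.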

**Observability**. Prometheus `/metrics`, OTel exporter presets, Grafana dashboards. Per-model cost tracking (`bernstein cost`). Terminal TUI and web dashboard. Agent process visibility in `ps`.

**Ecosystem**. MCP server mode, A2A protocol support, GitHub App integration, pluggy-based plugin system, multi-repo workspaces, cluster mode for distributed execution, self-evolution via `--evolve` (experimental).

Full feature matrix: [FEATURE_MATRIX.md](docs/reference/FEATURE_MATRIX.md) · Recent features: [What's New](docs/whats-new.md)

## what's new in v1.9

**ACP bridge**. `bernstein acp serve --stdio` exposes Bernstein to any editor that speaks the Agent Client Protocol (Zed, etc.). No plugin code needed on the editor side.

**Autonomous CI repair**. `bernstein autofix` watches open Bernstein PRs and, when CI turns red, spawns a fixer agent automatically. Once green, it pushes the fix and re-requests review.

**Credential vault**. `bernstein connect ` writes API keys to the OS keychain; `bernstein creds` lists and rotates them. Agents inherit scoped credentials without touching environment variables.

**Preview tunnels**. `bernstein preview start` boots a sandboxed dev server and prints a public URL. Useful for sharing a running branch with a reviewer without deploying to staging.

Full changelog: [docs/whats-new.md](docs/whats-new.md)

## operator commands

Commands that eliminate the glue code most teams end up writing around their runs.

| Command | What it does |
|---------|--------------|
| `bernstein pr` | Auto-creates a GitHub PR from a completed session; body carries the janitor's gate results and token/USD cost breakdown. |
| `bernstein from-ticket ` | Imports a Linear / GitHub Issues / Jira ticket as a Bernstein task. Label-based role + scope inference. Supports `--dry-run` and `--run`. |
| `bernstein ticket import ` | Alias / group form of `from-ticket` for scripting. |
| `bernstein remote` | SSH sandbox backend. `remote test `, `remote run  `, `remote forget `. ControlMaster socket reuse for fast repeat calls. |
| `bernstein hooks` | Lifecycle hooks for `pre_task`, `post_task`, `pre_merge`, `post_merge`, `pre_spawn`, `post_spawn`; shell scripts or pluggy `@hookimpl`s. `hooks list`, `hooks run `, `hooks check`. |
| `bernstein chat serve --platform=telegram\|discord\|slack` | Drive runs from chat with `/run`, `/status`, `/approve`, `/reject`, `/switch`, `/stop`. |
| `bernstein approve-tool` / `bernstein reject-tool` | Interactive mid-run tool-call approval. `--latest`, `--id`, `--always`. |
| `bernstein tunnel start  [--provider auto\|cloudflared\|ngrok\|bore\|tailscale]` | One wrapper around four tunnel providers. Also `tunnel list`, `tunnel stop \|--all`. ControlMaster-style process reuse. |
| `bernstein daemon install [--user\|--system] [--command="..."] [--env KEY=VAL]...` | Installs a systemd (Linux) or launchd (macOS) unit for auto-start. Also `daemon start/stop/restart/status/uninstall`. |
| `bernstein connect ` / `bernstein creds` | Stores and rotates API credentials in the OS keychain. Agents inherit scoped keys per-run. |
| `bernstein autofix` | Daemon that monitors open Bernstein PRs; spawns a fixer agent when CI fails and pushes the repair automatically. |
| `bernstein preview start` | Starts a sandboxed dev server for the current branch and prints a shareable public tunnel URL. |
| `bernstein agents-md` | Generates a canonical [AAIF AGENTS.md](https://agents.md) for the repo and rewrites it into each CLI's native shape. `generate` (preview), `write` (single file), `sync` (canonical + Cursor `.cursor/rules/*.mdc` + Claude `CLAUDE.md` + Aider `CONVENTIONS.md` + Goose `.goosehints`), `verify` (CI gate), `diff` (shows drift between canonical IR and on-disk files). |

### retrieval & caching: what's actually under the hood

Bernstein deliberately uses **no neural embeddings, no vector databases, and no external embedding APIs**. There are two retrieval/caching layers, both keyword/lexical:

- **Codebase RAG** (`core/knowledge/rag.py`); SQLite FTS5 with BM25 ranking and AST-aware chunking for Python files. Built incrementally on file mtime; used to enrich agent task context within token budgets.

- **Semantic cache** (`core/knowledge/semantic_cache.py`); despite the name, fuzzy matching is done with TF (term-frequency) cosine similarity over word counts, not learned embeddings. It deduplicates near-identical LLM planning and agent-output requests so we don't re-spawn agents for the same goal.

If you need real semantic retrieval (vector DB, neural embeddings), wire it yourself via the retrieval role/skill in `templates/`; nothing in core performs vector search.
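For a sense of how little machinery TF cosine similarity needs, here is a self-contained sketch. It is not the code in `semantic_cache.py`: the tokenizer, the 0.9 threshold, and the function names are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tf_vector(text: str) -> Counter[str]:
    """Lower-case the text and count word occurrences (raw term frequency)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter[str], b: Counter[str]) -> float:
    """Cosine similarity between two word-count vectors, in [0, 1]."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_near_duplicate(goal: str, cached_goal: str, threshold: float = 0.9) -> bool:
    """Treat two goals as the same request when their word counts nearly match."""
    return cosine(tf_vector(goal), tf_vector(cached_goal)) >= threshold
```

Two goals that differ only in case or punctuation score about 1.0, while goals with no words in common score 0.0. The flip side of a purely lexical measure is that paraphrases with different vocabulary are not matched.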

## detailed comparison

| Feature | Bernstein | CrewAI | AutoGen [^autogen] | LangGraph |
|---------|-----------|--------|---------|-----------|
| Orchestrator | Deterministic code | LLM-driven (+ code Flows) | LLM-driven | Graph + LLM |
| Works with | Any CLI agent (43 adapters) | Python SDK classes | Python agents | LangChain nodes |
| Git isolation | Worktrees per agent | No | No | No |
| Pluggable sandboxes | Worktree, Docker, E2B, Modal | No | No | No |
| Verification | Janitor + quality gates | Guardrails + Pydantic output | Termination conditions | Conditional edges |
| Cost tracking | Built-in | `usage_metrics` | `RequestUsage` | Via LangSmith |
| State model | File-based (.sdd/) | In-memory + SQLite checkpoint | In-memory | Checkpointer |
| Remote artifact sinks | S3, GCS, Azure Blob, R2 | No | No | No |
| Self-evolution | Built-in (experimental) | No | No | No |
| Declarative plans (YAML) | Yes | Yes (`agents.yaml`, `tasks.yaml`) | No | Partial (`langgraph.json`) |
| Model routing per task | Yes | Per-agent LLM | Per-agent `model_client` | Per-node (manual) |
| MCP support | Yes (client + server) | Yes | Yes (client + workbench) | Yes (client + server) |
| Agent-to-agent chat | Bulletin board | Yes (Crew process) | Yes (group chat) | Yes (supervisor, swarm) |
| Web UI | TUI + web dashboard | CrewAI AMP | AutoGen Studio | LangGraph Studio + LangSmith |
| Cloud hosted option | Yes (Cloudflare) | Yes (CrewAI AMP) | No | Yes (LangGraph Cloud) |
| Built-in RAG/retrieval | Yes (codebase FTS5 + BM25) | `crewai_tools` | `autogen_ext` retrievers | Via LangChain |

*Last verified: 2026-04-19. See [full comparison pages](docs/compare/README.md) for detailed feature matrices.*

The table above compares Bernstein against LLM-orchestration frameworks (they orchestrate LLM calls). The table below covers the closer category: other tools that orchestrate **CLI coding agents**:

| Feature | Bernstein | [awslabs/cli-agent-orchestrator](https://github.com/awslabs/cli-agent-orchestrator) | [ComposioHQ/agent-orchestrator](https://github.com/ComposioHQ/agent-orchestrator) | [emdash](https://github.com/generalaction/emdash) | [umputun/ralphex](https://github.com/umputun/ralphex) |
|---------|-----------|-----------|-----------|-----------|-----------|
| Shape | Python CLI + library + MCP server | Python CLI + tmux sessions + web UI | TypeScript CLI + local dashboard | Electron desktop app | Go CLI |
| Primary language | Python | Python | TypeScript | TypeScript | Go |
| Install | `pipx install bernstein` | `uv tool install cli-agent-orchestrator` | `npm install -g @aoagents/ao` | `.dmg` / `.msi` / `.AppImage` | `go install` / single binary |
| Agent adapters | 42 | 5 (Kiro, Claude Code, Codex, Gemini, Kimi) | 3 (Claude Code, Codex, Aider) | 24 | 1 (Claude Code only) |
| Parallel multi-agent execution | Yes | Yes (tmux session per agent) | Yes | Yes | No (single sequential session) |
| Git worktree per agent | Yes | No (planned, [#100](https://github.com/awslabs/cli-agent-orchestrator/issues/100)) | Yes | Yes | Optional `--worktree` flag |
| MCP server mode (exposes self as MCP) | Yes (stdio + HTTP/SSE) | Yes (inter-agent comms) | No | No | No |
| Coordinator | Deterministic Python scheduler | Hierarchical LLM supervisor | LLM-driven | Not documented | Linear plan executor |
| HMAC-chained audit replay | Yes | No | No | No | No |
| Cross-model verifier / quality gates | Yes (multi-stage) | No | No | No | Multi-phase review (Claude only) |
| Autonomous CI-fix / PR flow | Yes (`bernstein autofix`) | No | Yes | No | No |
| Visual dashboard | TUI + web | Web UI + tmux | Web | Desktop app | Web (`--serve`) |
| Notification sinks | Telegram/Slack/Discord/Email/Webhook/Shell | — | No | No | Telegram / Email / Slack / Webhook |
| Backing | Solo OSS | AWS Labs | Funded (Composio.dev) | YC W26 | Solo OSS |
| License | Apache 2.0 | Apache 2.0 | MIT | Apache 2.0 | MIT |

Bernstein's wedge in this category: Python-native, MCP-server-first, the widest adapter coverage, true multi-agent parallelism, and a deterministic scheduler with no LLM in the coordination loop. If you want AWS-aligned tmux-session isolation with a hierarchical LLM supervisor, AWS Labs' `cao` is a closer fit. If your stack is TypeScript and you want a product with a dashboard, Composio's `@aoagents/ao` is a better fit. If you want a polished desktop ADE, pick emdash. If you only use Claude Code and want a single Go binary that walks a plan top to bottom, pick ralphex. If you want a primitive that imports into Python, exposes itself over MCP to any client, runs many agents in parallel, and covers the full agent breadth (including Qwen, Goose, Ollama, OpenAI Agents SDK, Cloudflare Agents, and more), pick Bernstein.
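To make "deterministic scheduler with no LLM in the coordination loop" concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not Bernstein's scheduler: the `schedule` function, the task names, and the batching rule are all hypothetical. The point is that the same dependency graph always yields the same batches.

```python
from graphlib import TopologicalSorter


def schedule(deps: dict[str, set[str]], max_agents: int) -> list[list[str]]:
    """Return batches of task ids; tasks inside one batch can run in parallel.

    `deps` maps each task to the set of tasks it depends on (hypothetical format).
    """
    if max_agents < 1:
        raise ValueError("max_agents must be at least 1")
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches: list[list[str]] = []
    while sorter.is_active():
        # Sort the ready set so ties always break the same way on every run.
        ready = sorted(sorter.get_ready())
        for start in range(0, len(ready), max_agents):
            batches.append(ready[start:start + max_agents])
        sorter.done(*ready)
    return batches


# Hypothetical task graph: tests and docs both wait on the implementation.
example = {"plan": set(), "impl": {"plan"}, "tests": {"impl"}, "docs": {"impl"}}
print(schedule(example, max_agents=2))
```

No model call decides ordering here; a plain topological sort plus a stable tie-break does, which is what makes a run replayable.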

## what people use it for

These are real workflow patterns from Bernstein's own docs, examples, and project surface, not invented customer quotes.

- **Parallel test generation**. Fan out across untested modules with `BERNSTEIN_MAX_AGENTS=5 bernstein -g "Generate unit tests for untested modules in src/"`.

- **CI failure repair**. Watch open PRs and dispatch scoped fixers with `bernstein autofix start --repo your-org/your-repo --foreground`.

- **PR review follow-up**. Turn review comments into tracked fix tasks with `bernstein review-responder start --repo your-org/your-repo --foreground`.

- **Codebase modernization**. Run wide refactors like `BERNSTEIN_MAX_AGENTS=8 bernstein -g "Migrate callback-based modules in src/ to async/await and update tests"`.

- **Ticket-to-run workflows**. Import GitHub, Jira, or Linear work directly with `bernstein from-ticket https://github.com/your-org/your-repo/issues/123 --run`.

- **API-change safety checks**. Catch downstream breakage before merge with `bernstein dep-impact --base main`.

See [Who Uses Bernstein](docs/use-cases.md) for the longer version with command examples and notes on when each workflow fits.
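If you drive these workflows from a Python script rather than a shell, the fan-out commands above can be assembled as follows. This is a sketch: `build_run` is a hypothetical helper, and it relies only on the `-g` flag and the `BERNSTEIN_MAX_AGENTS` variable shown in the examples above.

```python
import os
import shlex


def build_run(goal: str, max_agents: int) -> tuple[list[str], dict[str, str]]:
    """Build argv and environment for a goal-driven run (hypothetical helper)."""
    if max_agents < 1:
        raise ValueError("max_agents must be at least 1")
    argv = ["bernstein", "-g", goal]
    # Copy the current environment and cap the number of parallel agents.
    env = {**os.environ, "BERNSTEIN_MAX_AGENTS": str(max_agents)}
    return argv, env


argv, env = build_run("Generate unit tests for untested modules in src/", 5)
# Print the equivalent shell command for review before launching anything.
print(f"BERNSTEIN_MAX_AGENTS={env['BERNSTEIN_MAX_AGENTS']}", shlex.join(argv))
# To launch for real: subprocess.run(argv, env=env, check=True)
```

Passing the goal as a single list element avoids shell quoting problems when the goal text contains quotes or spaces.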

[^autogen]: AutoGen is in maintenance mode; successor is Microsoft Agent Framework 1.0.

## monitoring

```bash
bernstein live                      # TUI dashboard
bernstein dashboard                 # web dashboard
bernstein status                    # task summary
bernstein ps                        # running agents
bernstein cost                      # spend by model/task
bernstein doctor                    # pre-flight checks
bernstein recap                     # post-run summary
bernstein trace                     # agent decision trace
bernstein run-changelog --hours 48  # changelog from agent-produced diffs
bernstein explain                   # detailed help with examples
bernstein dry-run                   # preview tasks without executing
bernstein dep-impact                # API breakage + downstream caller impact
bernstein aliases                   # show command shortcuts
bernstein config-path               # show config file locations
bernstein init-wizard               # interactive project setup
bernstein debug-bundle              # collect logs, config, and state for bug reports
bernstein skills list               # discoverable skill packs (progressive disclosure)
bernstein skills show               # print a skill body with its references
```

```bash
bernstein fingerprint build --corpus-dir ~/oss-corpus  # build local similarity index
bernstein fingerprint check src/foo.py                 # check generated code against the index
```

## install

| Method | Command |
|--------|---------|
| **One-liner (macOS / Linux)** | `curl -fsSL https://bernstein.run/install.sh \| sh` |
| **One-liner (Windows)** | `irm https://bernstein.run/install.ps1 \| iex` |
| **pip** | `pip install bernstein` |
| **pipx** | `pipx install bernstein` |
| **uv** | `uv tool install bernstein` |
| **Homebrew** | `brew tap chernistry/tap && brew install bernstein` |
| **Fedora / RHEL** | `sudo dnf copr enable alexchernysh/bernstein && sudo dnf install bernstein` |
| **npm** (wrapper) | `npx bernstein-orchestrator` |

The one-liner scripts check for Python 3.12+, bootstrap pipx when it's missing, fix PATH for the current session, and install (or upgrade) `bernstein`. They handle brew-managed macOS environments and the Windows `py -3` launcher fallback. Script sources: [install.sh](scripts/install.sh) · [install.ps1](scripts/install.ps1).
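If you install manually and want to run the same kind of checks first, the sketch below approximates two of them in Python. It is not taken from `install.sh` or `install.ps1`; the function names are hypothetical, and only the Python 3.12+ floor and the pipx dependency come from the description above.

```python
import shutil
import sys

MIN_PYTHON = (3, 12)  # minimum version stated for the installer scripts


def python_ok(version=sys.version_info[:2]) -> bool:
    """True when the given (major, minor, ...) version meets the minimum."""
    return tuple(version[:2]) >= MIN_PYTHON


def preflight() -> list[str]:
    """Return a list of human-readable problems; empty means ready to install."""
    problems = []
    if not python_ok():
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.12+ required, found {found}")
    if shutil.which("pipx") is None:
        problems.append("pipx not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("preflight:", problem)
```

A missing pipx is not fatal for the one-liner scripts, which bootstrap it themselves; the check matters only if you plan to use the pipx row of the table directly.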

### optional extras

Provider SDKs are optional so the base install stays lean. Pick what you need:

| Extra | Enables |
|-------|---------|
| `bernstein[openai]` | OpenAI Agents SDK v2 adapter (`openai_agents`) |
| `bernstein[docker]` | Docker sandbox backend |
| `bernstein[e2b]` | [E2B](https://e2b.dev) microVM sandbox backend (needs `E2B_API_KEY`) |
| `bernstein[modal]` | [Modal](https://modal.com) sandbox backend, optional GPU (needs `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET`) |
| `bernstein[s3]` | S3 artifact sink (via `boto3`) |
| `bernstein[gcs]` | Google Cloud Storage artifact sink |
| `bernstein[azure]` | Azure Blob artifact sink |
| `bernstein[r2]` | Cloudflare R2 artifact sink (S3-compatible `boto3`) |
| `bernstein[grpc]` | gRPC bridge |
| `bernstein[k8s]` | Kubernetes integrations |

Combine extras with brackets, e.g. `pip install 'bernstein[openai,docker,s3]'`.
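When the set of extras is chosen by a provisioning script, the requirement string can be built programmatically. A small sketch, assuming the extras listed in the table above are the complete set; `requirement` and `KNOWN_EXTRAS` are hypothetical names, not part of Bernstein.

```python
KNOWN_EXTRAS = {
    "openai", "docker", "e2b", "modal", "s3",
    "gcs", "azure", "r2", "grpc", "k8s",
}


def requirement(extras: list[str]) -> str:
    """Build a pip requirement string such as 'bernstein[openai,docker,s3]'."""
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {', '.join(unknown)}")
    if not extras:
        return "bernstein"
    unique = list(dict.fromkeys(extras))  # drop duplicates, keep first-seen order
    return f"bernstein[{','.join(unique)}]"


print(requirement(["openai", "docker", "s3"]))
```

Pass the result to pip as a single argument (for example through `subprocess.run([sys.executable, "-m", "pip", "install", spec])`) so the brackets never reach a shell unquoted.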

Editor extensions: [VS Marketplace](https://marketplace.visualstudio.com/items?itemName=alex-chernysh.bernstein) · [Open VSX](https://open-vsx.org/extension/alex-chernysh/bernstein)

## contributing

PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for setup and code style.

## support

If Bernstein saves you time: [GitHub Sponsors](https://github.com/sponsors/chernistry)

Contact: [forte@bernstein.run](mailto:forte@bernstein.run)

## featured in

Curated lists, newsletters, and peer projects that picked up Bernstein:

- [**Python Weekly #742**](https://www.pythonweekly.com/p/python-weekly-issue-742-april-23-2026) (April 23, 2026); newsletter mention.

- [**Future Digest**](https://futuredigestnews.substack.com/p/your-claude-bill-just-hit-874-heres) (April 30, 2026); Bernstein cited as the self-host orchestrator for long-running autonomous sessions in a cost-cutting playbook.

- [**Augment Code — 9 Open-Source Agent Orchestrators for AI Coding (2026)**](https://www.augmentcode.com/tools/open-source-agent-orchestrators); editorial roundup; "the most architecturally interesting tool in this roundup."

- [**nibzard/awesome-agentic-patterns**](https://github.com/nibzard/awesome-agentic-patterns/blob/main/patterns/deterministic-zero-llm-orchestration.md); Bernstein cited as the production implementation of the "deterministic zero-LLM orchestration" pattern.

- [**Jenqyang/Awesome-AI-Agents**](https://github.com/Jenqyang/Awesome-AI-Agents)

- [**jamesmurdza/awesome-ai-devtools**](https://github.com/jamesmurdza/awesome-ai-devtools)

- [**jim-schwoebel/awesome_ai_agents**](https://github.com/jim-schwoebel/awesome_ai_agents)

- [**Piebald-AI/awesome-gemini-cli**](https://github.com/Piebald-AI/awesome-gemini-cli)

- [**ComposioHQ/awesome-codex-skills**](https://github.com/ComposioHQ/awesome-codex-skills)

- [**jxzhangjhu/Awesome-LLM-RAG**](https://github.com/jxzhangjhu/Awesome-LLM-RAG)

- [**rohitg00/awesome-claude-code-toolkit**](https://github.com/rohitg00/awesome-claude-code-toolkit)

- [**numtide/llm-agents.nix**](https://github.com/numtide/llm-agents.nix); Nix flake distribution.

More awesome lists & community curation

- [andyrewlee/awesome-agent-orchestrators](https://github.com/andyrewlee/awesome-agent-orchestrators)

- [bradAGI/awesome-cli-coding-agents](https://github.com/bradAGI/awesome-cli-coding-agents)

- [milisp/awesome-codex-cli](https://github.com/milisp/awesome-codex-cli)

- [yaolifeng0629/Awesome-independent-tools](https://github.com/yaolifeng0629/Awesome-independent-tools) (中文 + EN)

- [caramaschiHG/awesome-ai-agents-2026](https://github.com/caramaschiHG/awesome-ai-agents-2026)

- [ai-for-developers/awesome-vibe-coding](https://github.com/ai-for-developers/awesome-vibe-coding)

- [killop/anything_about_game](https://github.com/killop/anything_about_game) (`AI.md`)

- [Glama MCP Catalog](https://glama.ai/mcp/servers/sipyourdrink-ltd/bernstein); editorial MCP server listing.

- Mirrors: [icopy-site/awesome](https://github.com/icopy-site/awesome), [icopy-site/awesome-cn](https://github.com/icopy-site/awesome-cn), [trackawesomelist/trackawesomelist](https://github.com/trackawesomelist/trackawesomelist).

Cited as prior art by peer projects

- [**mkb23/overcode**](https://github.com/mkb23/overcode/blob/main/docs/design/bakeoffs/overcode-vs-bernstein.md); long-form bakeoff treating Bernstein as the reference implementation.

- [**Vintersong/NOVA-Cognition-Framework**](https://github.com/Vintersong/NOVA-Cognition-Framework); `BERNSTEIN_PATTERNS.md`, "Patterns Worth Borrowing".

- [**AJV009/drupal-contrib-workbench**](https://github.com/AJV009/drupal-contrib-workbench); research notes on the manager/janitor split.

- [**danielvaughan/codex-blog**](https://github.com/danielvaughan/codex-blog/blob/main/_posts/2026-04-09-loki-mode-autonomous-execution.md); comparison article positioning Bernstein on the deterministic end.

## license

[Apache License 2.0](LICENSE)

---

Made with love by [Alex Chernysh](https://alexchernysh.com) · [GitHub](https://github.com/chernistry) · [X](https://x.com/alex_chernysh) · [bernstein.run](https://bernstein.run)

## translations

[Español](docs/i18n/README.es.md) · [中文](docs/i18n/README.zh.md) · [العربية](docs/i18n/README.ar.md) · [Português](docs/i18n/README.pt.md) · [Bahasa Indonesia](docs/i18n/README.id.md) · [Français](docs/i18n/README.fr.md) · [日本語](docs/i18n/README.ja.md) · [Русский](docs/i18n/README.ru.md) · [Deutsch](docs/i18n/README.de.md) · [עברית](docs/i18n/README.he.md) · [יידיש](docs/i18n/README.yi.md)
