https://github.com/codethread/pandoras-box

yet another agent orchestrator

# Pandora's Box

> A local control plane where you talk to **Pandora** and she releases her Evils — Envy, Toil, Greed, War — to do the work.

## About

You chat with **Pandora** in her tmux session. She queues up work; her Evils
claim it, run in their own Claude or Pi harness sessions, and report back.
You don't run shell commands in the loop — Pandora does.

**Key features**

- **One conversation, many runs.** Pandora is the one you talk to. Her Evils
come and go as work demands.
- **Durable memory.** Every task, design, and decision is stored. Crash
recovery and audit are free, and Pandora gets sharper as context
accumulates.
- **Replaceable harnesses.** Claude Code and Pi both work today.
- **tmux-based control plane.** HITL Evils live in named tmux sessions you
can attach to.

**The Evils**

| Evil | Mode | Claims | Role |
| ------- | ---- | ------------------- | -------------------------------------------------------- |
| Pandora | HITL | `escalate` | Long-lived. Talks to you. Routes the chain. |
| Envy | AFK | `intake`, `clarify` | Classifies external signals and clarifies requirements. |
| Toil | AFK | `triage` | Decomposes incoming work; routes to design or execute. |
| Greed | HITL | `design`, `review` | Produces design briefs and explicitly requested reviews. |
| War | AFK | `execute` | Runs in a repo/worktree and changes code. |

**Built with**

- [tmux](https://github.com/tmux/tmux) — HITL session host
- [Claude Code](https://docs.claude.com/claude-code) and [Pi](https://pi.dev/) — harness runtimes

## Getting Started

### Prerequisites

- Node `24.15.0` (pinned via Volta in root `package.json`)
- `pnpm` v10+
- macOS or Linux, Git
- `tmux`, plus either `claude` (Claude Code CLI) or `pi`, on your `PATH`
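
A quick preflight can confirm the tools above are on your `PATH` before you install. This is a minimal sketch, not something the repo ships: `check_prereqs` is a hypothetical helper name, and it checks presence only, so it will not catch a Node version other than `24.15.0` or a `pnpm` older than v10.

```sh
# Print each missing prerequisite; return non-zero if anything is missing.
# check_prereqs is a hypothetical helper, not part of pdx.
check_prereqs() {
  missing=0
  for tool in node pnpm git tmux; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  # At least one harness CLI is needed: claude or pi.
  if ! command -v claude >/dev/null 2>&1 && ! command -v pi >/dev/null 2>&1; then
    echo "missing: claude or pi"
    missing=1
  fi
  return "$missing"
}
```

Run `check_prereqs` in the shell you plan to use for `pdx`; no output and a zero exit status means every tool was found.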

### Installation

```sh
git clone https://github.com/codethread/pandoras-box.git
cd pandoras-box
pnpm install
pnpm run build
```

`pnpm run build` links the public `pithos`, `pdx`, and `pandora-spawn` bins onto your global PATH. The private test-only `fagent` bin is built for repo-local integration use but is not globally linked.

If `pnpm`'s global link doesn't work on your setup (Nix, restricted PATH,
etc.), use the Makefile to symlink the bins into `~/.local/bin` directly:

```sh
make local
```

Requires `~/.local/bin` to be on your `PATH`.
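
If you are unsure whether `~/.local/bin` is already on your `PATH`, a check along these lines can tell you. It is a sketch only; `on_path` is a hypothetical helper name, and it looks for an exact entry rather than resolving symlinks.

```sh
# Succeed if directory $1 is an exact entry in the PATH-style string $2.
on_path() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if on_path "$HOME/.local/bin" "$PATH"; then
  echo "$HOME/.local/bin is on PATH"
else
  echo "$HOME/.local/bin is not on PATH; add it in your shell profile"
fi
```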

### Configuration

Repo defaults live in [`./resources/`](./resources/) and are documented in
[`./resources/README.md`](./resources/README.md).

Run `pdx init` to create the data dir and seed the bundle-owned canonical
config before Pandora starts:

- `<data-dir>/agents.toml`
- `<data-dir>/templates/`
- `<data-dir>/AGENTS.md`

`pdx init` and `pdx open` always re-seed those bundle-owned files from repo
defaults.

User customisation lives in `<user-config-dir>/`, where `<user-config-dir>` is
`$PDX_USER_DATA_DIR` or defaults to `<data-dir>/config`. That directory keeps a
scaffold-once `AGENTS.md`, `CLAUDE.md`, `artifacts.toml`, `supervisor.toml`, and `agents.toml`,
plus a re-seeded `PANDORA.md` reference, so you can `cd` into it and ask a
direct harness session to edit config safely.
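
To see which user config directory applies in your shell, the lookup can be sketched as below. Two things are assumptions rather than documented behaviour: that the data dir comes from `$PDX_DATA_DIR` (named in the integration test notes) and falls back to `~/.pdx` (the location given under Uninstall), and that the default user config dir is `config` inside it. `pdx_user_config_dir` is a hypothetical helper name.

```sh
# Resolve the user config directory: $PDX_USER_DATA_DIR wins,
# otherwise fall back to <data dir>/config.
# Assumption: the data dir is $PDX_DATA_DIR, defaulting to ~/.pdx.
pdx_user_config_dir() {
  data_dir="${PDX_DATA_DIR:-$HOME/.pdx}"
  printf '%s\n' "${PDX_USER_DATA_DIR:-$data_dir/config}"
}

pdx_user_config_dir
```

Treat the printed path as a hint and confirm it against what `pdx init` actually creates.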

Typical files:

- `<user-config-dir>/AGENTS.md` — tiny user-owned pointer to `PANDORA.md`
- `<user-config-dir>/CLAUDE.md` — same pointer for Claude direct sessions
- `<user-config-dir>/agents.toml` — scaffolded user-wide policy registry and Harness partial
- `<user-config-dir>/artifacts.toml` — user-owned Artifact Contracts scaffold (commented guidance only)
- `<user-config-dir>/supervisor.toml` — user-owned pdx launch policy scaffold; set `enforce_repo_root_trunk = false` under `[launch_preconditions]` here to disable the repo Scope default-branch guard
- `<user-config-dir>/PANDORA.md` — installed config reference, overwritten on `pdx init` / `pdx open`

Customise behaviour with named policy packs declared in
`<user-config-dir>/agents.toml` and stored in user-owned `policies/*.md` files.
Use `policy.add` / `policy.remove`, Agent-specific policy selection, and ordered
match rules for project-specific behavior. User config must choose Harness launch
settings for Agents before they can launch. Supervisor launch preconditions live
in `<user-config-dir>/supervisor.toml`, not Agent prompt policy packs: by default,
pdx blocks repo Scope launches when the repository root is not on its remote
default branch and creates a `launch_precondition` Repair Alert for Pandora to
resolve/replay. External producers can feed Envy by writing intake events to
`/intake.sock` while `pdx open` is running.

You can also ask an agent to reconfigure Pandora's Box for you:

```sh
pdx init
cd ~/.pdx/config
claude
# or your preferred harness
```

Use `PANDORA.md` in that user config directory as the main guide; `AGENTS.md`
is only the tiny direct-agent pointer. Validate changes with
`pandora-spawn preview`.

Useful reset modes:

- `pdx init` or `pdx open` — re-seed `<data-dir>/agents.toml`, `<data-dir>/templates/`, `<data-dir>/AGENTS.md`, and `<user-config-dir>/PANDORA.md`; scaffold missing `<user-config-dir>/AGENTS.md`, `<user-config-dir>/CLAUDE.md`, `<user-config-dir>/agents.toml`, `<user-config-dir>/artifacts.toml`, and `<user-config-dir>/supervisor.toml`; keep user config, DB, runs, and logs
- `pdx init --clean` or `pdx open --clean` — wipe runtime state only (DB, runs, logs); keep bundle-owned config and user config
- `pdx init --nuke` or `pdx open --nuke` — wipe pdx-owned runtime/bundled state, preserve `<user-config-dir>`, then reseed fresh canonical config

Artifact status/rejection is an alpha schema break. If an existing DB fails with an incompatible `artifacts` schema error, reset runtime state with `pdx init --clean` / `pdx open --clean`; standalone Pithos users can run `pithos init --fresh`.

### Uninstall

The supervisor writes its data directory to `~/.pdx`. Remove the bins and
that directory:

```sh
rm -rf ~/.pdx
# if installed via pnpm:
pnpm -r unlink
# if installed via `make local`:
rm ~/.local/bin/{pithos,pdx,pandora-spawn}
```

## Usage

Three commands:

```sh
pdx init # create editable config without starting Pandora
pdx open # release the Evils
pdx close # back in the jar
```

First time meeting her, get the lay of the land:

- _"Tell me about yourself."_
- _"Tell me about scopes, tasks and chains."_
- _"How do we get work done around here?"_

Then put work into the queue through her:

- _"Create a design task in the frontend repo to figure out a WebSocket
implementation."_
- _"Build out a spec with Greed for our new persistence layer, then queue a
task for Toil to break it down and delegate execution in a worktree of the
backend repo."_

Every conversation deposits durable context — scopes, tasks, artifacts,
chain edges — that survives runs. Old work stays queryable, so the next
delegation needs less re-explaining. Pandora gets sharper as you go.

Reviews are explicit work, not automatic gates: ask Pandora or Toil to queue a
`review` task when you want Greed to walk through scoped work with you.

When something goes sideways, she also drives the cleanup:

- _"Go kill Greed, she's chasing the wrong plan."_
- _"Toil's stuck — interrupt her and re-triage."_

**Repair after interruption**: killing an Evil mid-task interrupts the Run,
marks the Held task failed, and creates a Repair Alert. Pandora repairs the
Broken chain; pdx never resurrects a dead Agent as the same Run.

If Pandora herself is wedged, `pdx --help` lists the raw escape hatches.
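
Because HITL Evils live in named tmux sessions, plain tmux is another way to look in on them. The sketch below uses only standard tmux subcommands; the helper names are hypothetical, and the session names pdx chooses are not assumed here, so list them first and pick the one you want.

```sh
# List tmux session names, one per line.
# Prints nothing if no tmux server is running.
list_sessions() {
  tmux list-sessions -F '#{session_name}' 2>/dev/null
}

# Attach to the session whose name is passed as $1.
attach_session() {
  tmux attach-session -t "$1"
}
```

Call `list_sessions`, then `attach_session <name>` with one of the printed names. Detach again with your tmux prefix followed by `d`, which leaves the session running.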

## Validation and integration tests

Fast local validation stays outside containers:

```sh
pnpm test
pnpm run build
```

`pnpm test` runs the Vitest unit/package suites; it does not require real Harness credentials, Podman, or host tmux state. Use `pnpm lint` and `pnpm typecheck` for focused preflight checks.

Podman-backed integration commands exercise container-local tmux and isolated pdx/Pithos data dirs without touching the host tmux server. They require Podman:

```sh
pnpm run test:integration:tmux
pnpm run test:integration:pdx-open-fagent
```

`test:integration:tmux` builds `containers/Containerfile.integration`, mounts the current working tree into the container, sets isolated `PDX_DATA_DIR`, `PDX_USER_DATA_DIR`, `PITHOS_DB`, and `TMUX_TMPDIR`, then proves tmux can create, list, and kill a session through a container-local socket.
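
The same four variables can be pointed at a scratch directory for ad hoc experiments on the host. This is a rough sketch under stated assumptions: the variable names come from the paragraph above, but the file and directory names under `$sandbox` are illustrative, and nothing here is taken from the actual integration script.

```sh
# Create a throwaway directory tree and point pdx, Pithos, and tmux at it.
# The layout under $sandbox is an illustrative assumption.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/data" "$sandbox/user-config" "$sandbox/tmux"

export PDX_DATA_DIR="$sandbox/data"
export PDX_USER_DATA_DIR="$sandbox/user-config"
export PITHOS_DB="$sandbox/pithos.db"
export TMUX_TMPDIR="$sandbox/tmux"   # tmux creates its socket under this directory

echo "sandbox: $sandbox"
```

This is weaker isolation than the container: it only redirects what these variables control, so prefer the Podman commands when you need a guarantee that host state is untouched.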

`test:integration:pdx-open-fagent` copies the repo into the container, builds repo-local bins, configures Pandora/Toil/War with explicit `/workspace/packages/fagent/bin/fagent` argv paths, then drives `pdx open` through Toil triage, War failure, Pandora replay in the original `pdx--pandora` tmux pane, War completion, and `pdx close`. `tmux respawn-pane` is not an acceptable shortcut for this MVP path. `fagent` is test-only; normal user config should use Claude or Pi.

On failure, the script prints `pdx open fagent integration artifacts preserved at <dir>`. Inspect `<dir>/data/pdx.jsonl`, `<dir>/data/fagent-events.jsonl`, `<dir>/data/runs/*.stdout.log`, `<dir>/data/runs/*.stderr.log`, and `<dir>/user-config/`.

`pnpm verify` runs the full gate in this order: lint, typecheck, unit tests, workspace build, `test:integration:tmux`, then `test:integration:pdx-open-fagent`.
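
When one stage fails, it can help to step through the same sequence by hand. The sketch below chains the commands named in this section in the stated order and stops at the first failure. It is an approximation of the gate, not a copy of how `pnpm verify` is implemented; `run_gate` and the `PNPM` override are hypothetical, added only so the sequence can be dry-run.

```sh
# Run the gate stages in order, stopping at the first failure.
# PNPM is a hypothetical override (e.g. PNPM=echo for a dry run).
run_gate() {
  pnpm_bin="${PNPM:-pnpm}"
  "$pnpm_bin" lint &&
    "$pnpm_bin" typecheck &&
    "$pnpm_bin" test &&
    "$pnpm_bin" run build &&
    "$pnpm_bin" run test:integration:tmux &&
    "$pnpm_bin" run test:integration:pdx-open-fagent
}
```

`PNPM=echo run_gate` prints the stages without running them; plain `run_gate` executes them and needs Podman for the last two.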

## Roadmap

Pre-v1; expect breaking changes.

- [ ] First-class recipes — named, repeatable workflows the Evils follow for common shapes of work
- [ ] Promote/demote an Evil between AFK and HITL mid-session
- [ ] Interactive pickers for kill/show/transcript so you don't copy ids by hand
- [ ] Pluggable control-plane backends — swap tmux for Zellij, remote SSH, etc. (the architecture is already decoupled)
- [ ] Broader control-plane integration scenarios — extend the Podman/fagent flow beyond the current triage, execute-failure, Repair Alert replay, and completion path

See [open issues](https://github.com/codethread/pandoras-box/issues).

## Contributing

See [`CONTRIBUTING.md`](./CONTRIBUTING.md), plus:

- `UBIQUITOUS_LANGUAGE.md` — shared domain terms.
- `AGENTS.md` — engineering rules for coding agents working on this repo.
- `specs/README.md` — design specs index.
- `packages/*/README.md` — per-package docs.

## Licence

MIT — see [`LICENCE`](./LICENCE).

## Acknowledgements

> [...] after a while I realized I just wanted someone to talk to, while the system was working. And perhaps, as occasion might demand, someone to yell at.
>
> — Steve Yegge: [Gas Town: from Clown Show to v1.0](https://steve-yegge.medium.com/gas-town-from-clown-show-to-v1-0-c239d9a407ec)

- The Effect community for the patterns this codebase leans on.
- All the AI researchers building amazing LLMs.