{"id":50911071,"url":"https://github.com/xbrianh/gremlins","last_synced_at":"2026-06-16T10:02:23.943Z","repository":{"id":355678534,"uuid":"1227397486","full_name":"xbrianh/gremlins","owner":"xbrianh","description":"Background coding agents that execute, review, and land work unattended.","archived":false,"fork":false,"pushed_at":"2026-06-07T17:14:50.000Z","size":2867,"stargazers_count":2,"open_issues_count":9,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-07T17:14:56.710Z","etag":null,"topics":["agent-orchestration","ai-coding-assistant-tools","automation","background-jobs","claude-code","coding-agents","llm-agents"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xbrianh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-05-02T16:18:34.000Z","updated_at":"2026-06-07T15:23:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/xbrianh/gremlins","commit_stats":null,"previous_names":["amorphous-industries/gremlins","xbrianh/gremlins"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/xbrianh/gremlins","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xbrianh%2Fgremlins","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xbrianh%2Fgremlins/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xbrianh%2Fgremlins/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xbrianh%2Fgremlins/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xbrianh","download_url":"https://codeload.github.com/xbrianh/gremlins/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xbrianh%2Fgremlins/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34400456,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-orchestration","ai-coding-assistant-tools","automation","background-jobs","claude-code","coding-agents","llm-agents"],"created_at":"2026-06-16T10:02:23.284Z","updated_at":"2026-06-16T10:02:23.927Z","avatar_url":"https://github.com/xbrianh.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# gremlins\n\nBackground coding-agent pipelines that plan, implement, review, and land work\nend-to-end. Given a goal or GitHub issue, a gremlin runs the full\nplan → implement → review-code → address-code cycle unattended, writing\nartifacts to the per-user state directory resolved by\n`platformdirs.user_state_dir(\"gremlins\")` and optionally opening a pull\nrequest. A fleet manager tracks running, stalled, and finished gremlins and\nprovides stop / land / close operations.\n\n**Status: brand-new and a bit janky.** This is a fresh project, actively\nshaped by daily use. Expect rough edges — stream timeouts, the occasional\nmerge conflict from parallel gremlins, a few stages still finding their\nfinal shape. Bug reports, ideas, and PRs are all welcome.\n\n---\n\n## Using gremlins with a coding assistant\n\nPaste the output of `gremlins prompt-for-assistant` into a fresh Claude Code session (or any compatible assistant) to configure it as a competent gremlins collaborator.\n\nThe workflow: you discuss the work with the assistant, it captures discrete units as GitHub issues or plan files, launches gremlins in the background to implement them, and lands each finished gremlin before starting dependent work. You stay at the strategic level — deciding what to build and in what order — while gremlins handle the implementation cycle unattended. The assistant maintains a queue of running, pending, and blocked work and surfaces it on request.\n\n---\n\n## Using gremlins across multiple repos\n\nWhen you run `gremlins launch`, the launcher captures the current working\ndirectory's repo root via `git rev-parse --show-toplevel` and stores it as\n`project_root` in the gremlin's `state.json`. That value pins the worktree\nbase, child process cwd, and pipeline discovery for that gremlin's lifetime.\n\n**To work on a different repo: `cd` there, then `gremlins launch`.** There is\nno `--project-root` flag; the cwd at launch time is the contract.\n\n**Fleet view** (`gremlins`) shows gremlins from all repos by default.\nPass `--here` to filter to the current repo's `project_root`.\n\n**Pipeline discovery** walks from the launching cwd, so `.gremlins/pipelines/`\noverrides in each repo apply to gremlins launched from that repo.\n\n**Queue caveat**: there is one global queue and the runner's cwd is frozen at\n`gremlins queue run --detach` time. To queue work against a different repo,\nprefix the command with `cd`:\n\n```sh\ngremlins queue add \"cd /path/to/other-repo \u0026\u0026 gremlins launch gh --plan '#42' --wait\"\ngremlins queue add \"cd /path/to/other-repo \u0026\u0026 gremlins land \u003cid\u003e\"\n```\n\n**State isolation**: each gremlin's state lives under its own directory\n(resolved via `platformdirs.user_state_dir(\"gremlins\")/\u003cid\u003e/`), so two repos\ncan have running gremlins simultaneously without interference.\n\n---\n\n## Runtime CLI prerequisites\n\n- `gh` — [GitHub CLI](https://github.com/cli/cli#installation)\n- `git` — [Git](https://git-scm.com/downloads) (pre-installed on most systems)\n- `claude` — [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code)\n\n## Dev install\n\n```sh\nuv venv\nsource .venv/bin/activate  # or `.venv\\Scripts\\activate` on Windows\nuv pip install -e \".[dev]\"\n```\n\n## Make targets\n\n| Target | What it runs |\n|---|---|\n| `make test` | `pytest` |\n| `make lint` | `ruff check .` |\n| `make format` | `ruff format --check .` (check only — does not rewrite files) |\n| `make typecheck` | `pyright` |\n| `make check` | lint + format + typecheck |\n\n## CLI subcommands\n\nInvoked as `python -m gremlins.cli \u003csubcommand\u003e` or `gremlins \u003csubcommand\u003e`\nafter install. The authoritative list and per-subcommand description lives in\nthe dispatch table in [`gremlins/cli/__init__.py`](gremlins/cli/__init__.py).\n\n| Subcommand | Purpose |\n|---|---|\n| `launch \u003cname\u003e` | Launch a background gremlin by pipeline name (`gremlins launch --list` to see available) |\n| `resume` | Re-spawn an existing gremlin from its recorded stage |\n| `stop` | Send SIGTERM to a running gremlin and wait for it to exit |\n| `land` | Land a finished gremlin onto the current branch |\n| `rm` | Delete a dead gremlin's state dir, worktree, and branch |\n| `close` | Mark a dead gremlin as closed |\n| `log` | Tail the gremlin's log file |\n| `ack` | Acknowledge a gremlin waiting for human input |\n| `skip` | Skip a gremlin waiting for human input |\n| `queue` | Manage the gremlin launch queue |\n| `prompt-for-assistant` | Print the assistant setup prompt to stdout |\n\n`_run-pipeline` is an internal spawn boundary; not for direct use.\n\n### `queue` sub-subcommands\n\n| Sub-subcommand | Description |\n|---|---|\n| `add [--run] \u003ccommand\u003e` | Add a command to the queue; `--run` also starts the runner if idle |\n| `list [--watch] [--json]` | List queued items |\n| `run [--once] [--poll-interval SEC] [--detach]` | Start the queue runner |\n| `requeue [--done]` | Move failed (and optionally done) items back to pending |\n| `clear [--failed\\|--done\\|--pending\\|--purge\\|--item STEM]` | Remove items from the queue |\n| `set-state \u003cstate\u003e --item STEM` | Manually transition a queue item to a different state |\n| `stop` | Stop the detached runner |\n\n### Launch flags\n\n#### Per-pipeline flags\n\nFlags vary by pipeline. The first stage's `__init__` signature defines the accepted flags; `gremlins launch \u003cname\u003e --help` prints the full list.\n\nCommon infrastructure flags (accepted by all pipelines):\n\n| Flag | Default | Description |\n|---|---|---|\n| `--plan \u003cpath-or-ref\u003e` | — | Path to a plan/spec file, or a GitHub issue ref (`42`, `#42`, `owner/repo#42`, or issue URL) |\n| `--description \u003ctext\u003e` | — | Human-readable description stored in state |\n| `--parent \u003cid\u003e` | — | Parent gremlin ID (used by boss to track child ownership) |\n| `--print-id` | false | Print the gremlin ID to stdout after launch |\n| `-c`/`--instructions \u003ctext\u003e` | — | Instructions string (mutually exclusive with `--plan`) |\n| `--base-ref \u003cref\u003e` | `HEAD` | Git ref to branch the worktree from; ignored for gh pipelines (always anchors to origin default branch). In parallel pipelines, automatically propagated to all child processes. |\n| `--spec \u003cpath\u003e` | — | Path to a coding-style spec file passed into stages |\n| `--bypass` | false | Skip permission checks; run in bypass mode |\n\n## Pipeline configuration\n\nGremlins runs a sequence of stages defined in a YAML file. The bundled\npipelines work out of the box; a project-local YAML can override any of them.\n\n### Discovery order\n\n`--pipeline \u003cname|path\u003e` resolves as follows:\n\n1. A value with a `.yaml` suffix or more than one path component is loaded\n   directly as a filesystem path.\n2. Otherwise `./.gremlins/pipelines/\u003cname\u003e.yaml` is checked first\n   (project-local override).\n3. Then `gremlins/pipelines/\u003cname\u003e.yaml` (bundled) is checked.\n\nThe pipeline name is the first non-flag argument to `gremlins launch`. Run `gremlins launch --list` to see all available pipeline names.\n\n### Selecting a pipeline\n\n```sh\ngremlins launch local   # bundled local.yaml\ngremlins launch gh      # bundled gh.yaml\n```\n\n### Schema reference\n\n**Top-level keys:**\n\n```yaml\nname: my-pipeline         # optional; defaults to the file stem\n\ndefault_client: claude:sonnet   # optional; provider:model string\n\nprompt_dir: ../prompts          # optional; relative to YAML, defaults to the YAML's directory\n\nstages:\n  - name: plan\n    type: plan\n    client: copilot:gpt-5.4     # optional; overrides default_client for this stage\n    prompt: gremlins:plan.md    # `gremlins:NAME` -\u003e bundled prompts; bare NAME -\u003e prompt_dir\n    options: {}\n```\n\n| Key | Description |\n|---|---|\n| `name` | Pipeline display name; defaults to the file stem |\n| `default_client` | `provider:model` string used for stages without an explicit `client:` |\n| `prompt_dir` | Directory that bare-name `prompt:` paths resolve against, relative to the YAML file. Defaults to the YAML's directory. |\n| `stages` | Ordered list of stage entries or parallel groups |\n\n**Per-stage keys:**\n\n| Key | Description |\n|---|---|\n| `name` | Unique stage identifier; used for `resume` targeting |\n| `type` | Registered stage type (see [Available stage types](#available-stage-types)) |\n| `client` | `provider:model` string; overrides `default_client` for this stage |\n| `prompt` | Path or list of paths. `gremlins:NAME` resolves from the bundled package prompts; a bare `NAME` resolves from the pipeline's `prompt_dir`. |\n| `options` | Free-form dict passed to the stage |\n\n**`provider:model` format:**\n\nProviders: `claude` (default), `copilot`, `openai`, `xai`, `anthropic`. The model part is optional — `claude:` and `claude:sonnet` are both valid. Examples: `claude:sonnet`, `copilot:gpt-5.4`, `openai:gpt-4o`. Per-stage `client:` in YAML takes precedence over the CLI `--client` flag; `default_client:` at the pipeline level does not.\n\n**Parallel-group form:**\n\n```yaml\n- name: reviews\n  parallel:\n    - name: review-detail\n      type: review-code\n      client: claude:sonnet\n    - name: review-security\n      type: review-code\n      client: claude:sonnet\n  max_concurrent: 2         # optional; defaults to all children at once\n```\n\n| Key | Description |\n|---|---|\n| `name` | Group identifier |\n| `parallel` | List of child stage entries (no nesting allowed) |\n| `max_concurrent` | Max simultaneously running children (optional) |\n\n### Client specifiers\n\nClients are specified as `provider:model` inline strings, either at the pipeline level (`default_client:`) or per stage (`client:`). The model part is optional.\n\n```yaml\ndefault_client: claude:sonnet     # all stages default to this\nstages:\n  - name: plan\n    type: plan\n  - name: implement\n    type: implement\n    client: copilot:gpt-5.4       # this stage uses copilot instead\n```\n\nProviders: `claude`, `copilot`, `openai`, `xai`, `anthropic`. The CLI `--client provider:model` flag overrides the pipeline-level `default_client:` but yields to per-stage `client:` settings.\n\n### `prompt:` field\n\n```yaml\nprompt: gremlins:plan.md                                  # single bundled file\nprompt: [gremlins:code_style.md, plan.md]                 # mix bundled and local; concatenated with \\n\\n\n```\n\nEach entry is one of:\n\n- `gremlins:NAME` — resolved from the bundled prompts shipped with the\n  package. Use this for prompts owned by gremlins (`code_style.md`,\n  `plan_gh.md`, etc.).\n- bare `NAME` — resolved from the pipeline's top-level `prompt_dir:`\n  (relative to the YAML file; defaults to the YAML's own directory). Use\n  this for prompts you author and check in alongside your pipeline.\n\nLists are joined with `\\n\\n` before being passed to the stage. There is\nno search fallback between the two — the prefix is the contract, so a\ncustom YAML reads as self-describing about which prompts come from the\npackage vs which must be provided locally.\n\nBy convention, project-local prompts live in `./.gremlins/prompts/` (a peer\nof `./.gremlins/pipelines/`, not nested under it) and pipelines set\n`prompt_dir: ../prompts`.\n\n### `options:` field\n\nA free-form dict passed verbatim to the stage. Selected options by stage\n(see [`gremlins/stages/AGENTS.md`](gremlins/stages/AGENTS.md) for the full list):\n\n**`verify`** — runs a list of shell commands with an agent fix-loop:\n\n```yaml\noptions:\n  cmds: [\"make check\", \"make test\"]  # commands to run (joined with \u0026\u0026)\n  max_attempts: 3                    # fix-loop retries (default: 3)\n```\n\nFor `local` stages, model options (`plan_model`, `impl_model`, `address_model`,\n`test_fix_model`, `detail`) can also be set here to override the CLI defaults.\n\n### Available stage types\n\n| Type | Description |\n|---|---|\n| `plan` | Produces an implementation plan |\n| `implement` | Applies the plan to the working tree |\n| `review-code` | Runs a code review and writes findings to disk |\n| `verify` | Runs check and test commands with an agent fix-loop |\n| `exec` | Runs shell commands with in:/out: artifact bindings |\n| `agent` | Resolves in: artifacts, renders prompt, invokes agent, verifies out: artifacts |\n| `handoff` | Runs the handoff agent once per boss loop iteration |\n| `loop` | Iterates body stages until a termination predicate or max iterations |\n| `sequence` | Runs body stages sequentially using child state |\n| `github-open-pull-request` | Opens a pull request on GitHub |\n| `github-request-copilot-review` | Requests a Copilot review on the open PR |\n| `github-wait-copilot` | Polls until Copilot posts its review |\n| `github-wait-ci` | Polls PR CI checks until they pass or exhaust attempts |\n\n### Parallel groups\n\nWrap sibling stages in a `parallel:` list to run them concurrently:\n\n```yaml\ndefault_client: claude:sonnet\n\nstages:\n  - name: plan\n    type: plan\n\n  - name: reviews\n    parallel:\n      - name: review-detail\n        type: review-code\n      - name: review-security\n        type: review-code\n    max_concurrent: 2\n\n  - name: address-code\n    type: agent\n```\n\n**Execution and failure:** The parallel group executes in three phases:\n1. **Fan-out** — each child stage starts independently as a subprocess\n2. **Concurrent execution** — all children run simultaneously (up to `max_concurrent`)\n3. **Fan-in** — all children finish or one bails; siblings continue running until group completion\n\nIf any child fails (raises `Bail`), the pipeline halts after the group finishes —\nsiblings are not cancelled mid-run by default. This can be changed with `cancel_on_bail: true`\nto cancel outstanding tasks immediately. The bail is evaluated via `bail_policy` (default: `any`,\nmeaning one failed child halts the group; set `bail_policy: all` to halt only when all children bail).\nSubsequent stages are skipped; the operator can resume or ack the group via CLI.\n\n**State isolation:** Each child gets its own state directory and subprocess. \nClient overrides, worktree paths, and artifact bindings are isolated per-child.\nChildren run in parallel without blocking each other. Parent `state.json` is updated\nduring the concurrent phase (e.g., `active_children` snapshot); copying child artifact\nbindings into the parent registry is deferred until fan-in completes.\n\n**Resume targeting:** Use the full child gremlin ID (form: `\u003cparent-id\u003e--\u003cgroup-name\u003e--\u003cchild-key\u003e`,\nvisible in fleet view) to resume a specific child. Resuming the parent group ID re-spawns all\nchildren that haven't landed.\n\n**Base ref propagation:** The `--base-ref` flag is automatically propagated from\nthe parent to all child processes, ensuring consistent branching across the group.\nChild worktrees are derived from the parent's base_ref as recorded in state.\n\n### Worked example: project-local override\n\nCreate `.gremlins/pipelines/local.yaml` to override the bundled `local`\npipeline. This example uses Opus for plan/implement/address stages and adds\na `verify` stage before `review-code`:\n\n```yaml\nname: local\n\nstages:\n  - { type: plan,         options: { plan_model: opus } }\n  - { type: implement,    options: { impl_model: opus } }\n  - { type: verify,       options: { cmds: [\"pytest\"] } }\n  - { type: review-code }\n  - { name: address-code, type: agent, options: { address_model: opus } }\n```\n\nAdd a `prompt:` key to any stage to supply a custom prompt; paths are\nrelative to the YAML file.\n\n### Worked example: parallel reviewers\n\nRun two `review-code` passes in parallel, then address both:\n\n```yaml\nname: local\n\ndefault_client: claude:sonnet\n\nstages:\n  - { type: plan }\n  - { type: implement }\n\n  - name: reviews\n    parallel:\n      - name: review-detail\n        type: review-code\n      - name: review-security\n        type: review-code\n    max_concurrent: 2\n\n  - { name: address-code, type: agent }\n```\n\nNote: `review-code` does not currently support per-stage prompt overrides\nvia YAML — both passes use the built-in detail lens.\n\n### Stage definitions\n\nYAML `stage-definitions:` lets you name and reuse stage patterns within a pipeline:\n\n```yaml\nstage-definitions:\n  review-base: \u0026review-base\n    type: review-code\n    client: claude:sonnet\n    prompt: gremlins:code_style.md\n\nstages:\n  - { type: plan }\n  - { type: implement }\n  - name: review-detail\n    \u003c\u003c: *review-base\n    prompt: [gremlins:code_style.md, detail_review.md]\n  - name: review-security\n    \u003c\u003c: *review-base\n    prompt: security_review.md\n```\n\nDefinitions provide base `type`, `options`, and `prompt`. Call-sites can override\n`prompt` and `options` via YAML anchors (as shown above) or via template placeholders\nin multi-stage recipes. Call-sites own the `name:`, `in:`, and `out:` keys;\n`out:` is forbidden inside a definition, but `in:` can be declared and will be\nmerged with call-site `in:` values. For single-stage definitions, only `name`, `in`,\nand `out` keys can be safely overridden; to vary `prompt` or `options`, use anchors.\n\n### Artifact binding\n\nStages can bind artifacts via `in:` and `out:` maps. These define what data\nflows between stages in the pipeline:\n\n```yaml\nstages:\n  - name: scan\n    type: exec\n    options:\n      cmds: [\"python scan.py \u003e $ARTIFACTS/report.json\"]\n    out:\n      report: file://session/report\n\n  - name: analyze\n    type: agent\n    in:\n      report: report\n    prompt: |\n      The scanning report is in {report}.\n      Propose fixes.\n```\n\n**Artifact URI schemes:**\n- `file://session/\u003cname\u003e` — Session artifact: a file created under the gremlin's `$ARTIFACTS` directory\n- `git://ref/\u003cref\u003e` — Git ref name (e.g., `git://ref/main` returns the string `main`)\n- `git://commit/\u003csha\u003e` — Commit SHA (e.g., `git://commit/abc123def` returns the full SHA)\n- `git://range/\u003cbase\u003e..\u003chead\u003e` — Commit range/log between two refs\n- `gh://pulls/\u003cnumber\u003e/head` — GitHub PR head ref (and other `gh://` schemes for GitHub data)\n- `file://`, `git://`, `gh://` — File artifact resolvers support these base schemes\n\n**Artifact binding semantics:**\n- `in:` values are registry key paths (e.g., `report` or `report.critical?default`) with optional dotted attribute access and `?default` fallback\n- `out:` values are URI strings that name what the stage produces; downstream stages reference the key name (not the URI) in their `in:` maps\n- Prompt/option substitution uses `{var}` tokens (not `{{var}}`); artifacts bound via `in:` become available for substitution\n- `in:` can be declared in a stage definition and will be merged with call-site `in:` values; `out:` cannot appear inside a definition\n\n### Stage definitions and bundled recipes\n\nSome stage types are not built-in — they are provided as bundled YAML recipes and must be wired in via `stage-definitions:` before use:\n\n```yaml\nstage-definitions:\n  github-push-to-pr-branch: gremlins:github_push_to_pr_branch\n\nstages:\n  - { name: push, type: github-push-to-pr-branch }\n```\n\n`gremlins:NAME` resolves the recipe from the bundled package (`gremlins/recipes/stages/NAME.yaml`). A bare path resolves relative to the pipeline file.\n\n### Bundled pipelines\n\nThe canonical reference pipelines:\n\n- [`gremlins/pipelines/local.yaml`](gremlins/pipelines/local.yaml) — `gremlins launch local`\n- [`gremlins/pipelines/gh.yaml`](gremlins/pipelines/gh.yaml) — `gremlins launch gh`\n- [`gremlins/pipelines/gh-terse.yaml`](gremlins/pipelines/gh-terse.yaml) — `gremlins launch gh-terse`\n- [`gremlins/pipelines/pr-extend.yaml`](gremlins/pipelines/pr-extend.yaml) — `gremlins launch pr-extend`\n- [`gremlins/pipelines/boss.yaml`](gremlins/pipelines/boss.yaml) — `gremlins launch boss`\n\n## Error handling and recovery\n\nGremlins can fail or get stuck during execution. Understanding how to recover is essential for running long-running pipelines.\n\n### Bail semantics\n\nWhen a stage detects an unrecoverable condition (e.g., a code review requests changes, secrets are detected, or a merge conflict blocks progress), it raises a `Bail` exception with a detail string.\n\nBy convention, agent-based stages emit a `BAIL: \u003cclass\u003e: \u003cdetail\u003e` marker at the end of their output. The `\u003cclass\u003e` token is conventionally one of:\n- `reviewer_requested_changes` — code review found issues that must be addressed\n- `security` — security review detected problems\n- `secrets` — credentials or sensitive data detected in the code\n- `other` — stage-specific or unknown failure condition\n\nThe bail detail is written to a per-attempt `bail_\u003cattempt\u003e.json` file in the gremlin's state directory and is visible in the fleet view. When a stage bails, the entire pipeline halts — subsequent stages do not run, but the gremlin's state is preserved for recovery.\n\n### Recovering from gremlin failures\n\nWhen a gremlin bails and halts, you have three recovery options:\n\n**`gremlins resume \u003cid\u003e`** — Re-spawn the bailed gremlin from the stage where it\nbailed. Use this when the cause has been fixed externally (e.g., a code review\nfix has been merged, or a merge conflict has been resolved). The gremlin will\nrestart from the bailed stage with the current worktree state.\n\n**`gremlins ack \u003cid\u003e`** — Acknowledge the gremlin without re-running. Use this\nwhen the bailed condition is acceptable (e.g., the review found minor style\nissues that don't block landing, or external work was already completed). The\ngremlin marks the bailed stage as complete and proceeds to subsequent stages.\n\n**`gremlins skip \u003cid\u003e`** — Create a new sibling attempt with the same parameters\nand a fresh ID, leaving the failed gremlin in place. Use this for transient\nfailures (timeouts, CI hangs) that won't self-resolve. Both attempts are visible\nin the fleet; the new attempt begins from the start.\n\n### Handling parallel group failures\n\nWhen a child in a parallel group bails:\n- The group halts after all currently-running children finish (not mid-run), unless `cancel_on_bail: true`\n- The bail reason is attributed to the child stage name\n- `gremlins resume \u003cparent-id\u003e` re-spawns all children that haven't landed\n- `gremlins resume \u003cparent-id\u003e--\u003cgroup-name\u003e--\u003cchild-key\u003e` resumes only that child (use the full child ID from fleet view)\n\nIf the cause was a transient failure affecting multiple children, `skip` the entire\ngroup and re-launch the pipeline to restart all children.\n\n### Boss-chain recovery\n\nWhen a boss gremlin spawns child gremlins (`gremlins launch ... --parent \u003cboss-id\u003e`),\nthe boss halts if a child bails. At this point:\n- The child's gremlin ID is visible in the fleet view as a child of the boss\n- Recover the child (`resume`, `ack`, or `skip`) independently\n- Once the child lands or is abandoned, resume the boss (`gremlins resume \u003cboss-id\u003e`)\n\nThe boss resumes from its child-spawn stage and proceeds with the next iteration\n(re-planning, re-implementing, or wrapping up, depending on the pipeline).\n\n## What can a gremlin do to my machine?\n\nGremlins operate in one of two permission modes:\n\n**Default mode** (no flags): The agent is restricted to an allowlist of tools\n(Read, Edit, Write, Bash, Grep, Glob) and its Bash commands are path-scoped to\nthe gremlin's git worktree.  It can read and modify files inside that worktree\nand blocks direct path references outside it.  This is a best-effort token\ncheck, not a full sandbox — indirect references (heredocs, computed paths) may\nnot be caught.\n\n**Bypass mode** (`--bypass`, `GREMLINS_BYPASS_PERMISSIONS=1`, project\n`.gremlins/permissions.yaml bypass_permissions: true`, or user config\n`~/.config/gremlins/config.toml bypass_permissions = true`): All permission\nchecks are disabled.  The agent can use any tool and reference any path.  Use\nthis when the task genuinely requires broader access (e.g. a pipeline that\nmodifies system config).\n\nThe three opt-in paths for bypass are:\n1. `gremlins launch \u003cpipeline\u003e --bypass` — single-launch override\n2. `GREMLINS_BYPASS_PERMISSIONS=1` in the environment\n3. `bypass_permissions: true` in `.gremlins/permissions.yaml` (project) or\n   `bypass_permissions = true` in `~/.config/gremlins/config.toml` (user)\n\n**Honest disclaimer**: The allowlist limits *reach* — what paths and tools the\nagent can invoke.  It does not limit *impact within reach*.  A gremlin with\nwrite access to your worktree can make any change inside it.  Review landed\ncommits before merging.\n\n**Backend differences**: On `openai:` and `xai:` backends, gremlins owns the\ntool layer and enforces the allowlist directly.  On the `anthropic:` backend,\nenforcement is coarser — the SDK loop uses vendor-defined tools and the path\nscoping is advisory.  On `claude:` and `copilot:` subprocess backends, the\ngremlins-layer permission block is **not** translated into CLI flags or\nsettings — the underlying CLI reads the operator's ambient config and\nenforces whatever the operator has configured there.  See \"Backend config\ninheritance\" below.\n\n### Backend config inheritance\n\nThe `claude:` backend is a thin wrapper around `claude -p`. It does *not*\nmaterialize a per-gremlin config dir, and it does *not* set\n`CLAUDE_CONFIG_DIR` for the subprocess. Whatever the operator has configured\nfor their interactive Claude session is exactly what the subprocess sees:\n\n- **Settings** — `~/.claude/settings.json` (plus any project-level\n  `.claude/settings.json` the CLI discovers) is read by the CLI directly.\n  The gremlins-layer `allowed_tools` / `disallowed_tools` block has no\n  effect on `claude:` runs; configure tool permissions via your own\n  Claude settings or use the `anthropic:` backend.\n\n  Gremlin worktrees — where the `claude:` subprocess does its file edits —\n  live under a stable, gremlins-scoped prefix in the system temp directory.\n  Discover it at runtime:\n\n  ```\n  python -c \"from gremlins import paths; print(paths.work_root())\"\n  ```\n\n  On Linux/macOS this is `/tmp/gremlins`; the OS reclaims orphaned\n  worktrees on reboot. A single `permissions.allow` rule in\n  `~/.claude/settings.json` covers every worktree path:\n\n  ```json\n  {\n    \"permissions\": {\n      \"allow\": [\n        \"Edit(\u003cwork_root\u003e/**)\",\n        \"Write(\u003cwork_root\u003e/**)\",\n        \"Read(\u003cwork_root\u003e/**)\"\n      ]\n    }\n  }\n  ```\n\n  Replace `\u003cwork_root\u003e` with the actual output of the command above.\n- **MCP servers and hooks** — inherited from the user's Claude config.\n- **Auth** — subscription auth follows `~/.claude/.credentials.json` (or the\n  macOS keychain) exactly as it would for an interactive session.\n- **Permission mode** — the only thing the wrapper still controls per call:\n  `--permission-mode bypassPermissions` when bypass is enabled, otherwise\n  `default`.\n\n#### True process isolation: use an SDK backend\n\nIf you need per-gremlin tool allow-lists, hermetic config, or a clean\nseparation between gremlins and your interactive Claude session, use one of\nthe SDK-backed providers instead:\n\n- `anthropic:\u003cmodel-id\u003e` — `claude-agent-sdk` with `setting_sources=[]` (no\n  ambient settings, no MCP, no hooks). Requires `ANTHROPIC_API_KEY`.\n  `allowed_tools` from the native block is enforced by the SDK.\n- `openai:\u003cmodel-id\u003e` / `xai:\u003cmodel-id\u003e` — `openai-agents` SDK with the\n  in-tree `GREMLINS_TOOLS` list. Per-gremlin `allowed_tools` filters that\n  list. Requires `OPENAI_API_KEY` / `XAI_API_KEY`.\n\nSet via pipeline YAML:\n\n```yaml\ndefault_client: anthropic:claude-sonnet-4-6\n# or per-stage:\nstages:\n  - name: implement\n    client: anthropic:claude-sonnet-4-6\n```\n\nSubscription auth is not available on the SDK backends — that is Anthropic\npolicy, not a gremlins limitation.\n\n### Local environment overrides\n\nIf `.gremlins/env` exists in the project root, gremlins sources it through\n`bash` at startup and merges any new or changed variables into the process\nenvironment before any stage runs. All subprocesses (plan, implement, verify,\nreview) inherit the result automatically.\n\n\u003e **Security warning:** because `.gremlins/env` is executed as a bash script,\n\u003e it can run arbitrary code. Do not run gremlins in a repository unless you\n\u003e have reviewed the contents of `.gremlins/env` and trust them.\n\nThe file is sourced via `bash`, so it can use command substitution,\nconditionals, and anything bash supports:\n\n```sh\nexport VIRTUAL_ENV=$(poetry env info --path)\nexport PATH=\"$VIRTUAL_ENV/bin:$PATH\"\nexport TEST_DATABASE_URL=postgresql://localhost/mydb_test\n```\n\nAdd `.gremlins/env` to your `~/.gitignore_global` or project `.gitignore`.\n\n### Loader API\n\n`gremlins/pipeline/loader.py` exposes:\n\n- `load_pipeline(path)` → `Pipeline` — parses a YAML file, resolves `clients`\n  via `CLIENT_FACTORIES`, and validates every stage `type` against\n  `STAGE_REGISTRY` (populated by importing `gremlins.stages.all`).\n- `resolve_pipeline_path(name_or_path, base_dir)` — resolves a name or path\n  using the discovery order above.\n\nDataclasses: `Pipeline`, `StageEntry` (parallel groups have `type=\"parallel\"`\ninternally and carry a `children` list and optional `max_concurrent`).\n\n## Internals docs\n\n- [`gremlins/AGENTS.md`](gremlins/AGENTS.md) — module layout, entry points,\n  testability seam, byte-stable strings\n- [`gremlins/fleet/AGENTS.md`](gremlins/fleet/AGENTS.md) — fleet manager internals\n- [`gremlins/orchestrators/AGENTS.md`](gremlins/orchestrators/AGENTS.md) — orchestrator internals\n- [`gremlins/stages/AGENTS.md`](gremlins/stages/AGENTS.md) — stage internals\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxbrianh%2Fgremlins","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxbrianh%2Fgremlins","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxbrianh%2Fgremlins/lists"}