https://github.com/existential-birds/daydream

Automated code review and fix agent loop using Claude Agent SDK or Codex with Beagle skills
https://github.com/existential-birds/daydream
agentic-ai ai-agents automation claude cli code-review codex elixir golang phoenix python react-router-v7
Last synced: about 2 months ago
JSON representation
Automated code review and fix agent loop using Claude Agent SDK or Codex with Beagle skills
Host: GitHub
URL: https://github.com/existential-birds/daydream
Owner: existential-birds
License: apache-2.0
Created: 2026-02-01T06:30:58.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-03-03T14:18:40.000Z (4 months ago)
Last Synced: 2026-03-03T18:32:40.494Z (4 months ago)
Topics: agentic-ai, ai-agents, automation, claude, cli, code-review, codex, elixir, golang, phoenix, python, react-router-v7
Language: Python
Homepage: https://existentialbirds.com
Size: 432 KB
Stars: 2
Watchers: 0
Forks: 1
Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project

README

          # daydream

[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/existential-birds/daydream)

Automated code review and fix loop powered by Claude and Codex.

Daydream launches review agents equipped with [Beagle](https://github.com/existential-birds/beagle) skills — specialized knowledge modules for your technology stack (FastAPI, React, Phoenix, and more). It parses actionable feedback from the review, applies fixes automatically, and validates changes by running your test suite.

![demo](https://github.com/user-attachments/assets/60a80645-36de-410e-afa7-7a96efef3f57)

## Features

- **Deep review by default**: Multi-stack parallel pipeline that combines stack-agnostic intent analysis with per-stack Beagle reviews and cross-stack merge

- **Single-stack mode**: Use `--shallow` for a single Beagle skill review-fix-test loop

- **Comment mode**: `--comment` posts inline review findings on the open PR; `--review` writes a report to terminal/markdown

- **Stack-aware reviews**: Beagle skills load framework-specific knowledge (FastAPI patterns, React hooks, Phoenix lifecycle, etc.) as the reviewer encounters relevant code

- **Codebase exploration**: Tree-sitter-powered pre-scan resolves imports and detects conventions to ground reviews in actual codebase context

- **Intelligent parsing**: Extracts actionable issues from review output, skipping positive observations

- **Automated fixes**: Applies fixes one-by-one with minimal changes

- **PR feedback subcommand**: `daydream feedback ` fetches bot review comments, fixes them sequentially, and responds automatically

- **Multi-backend support**: Claude (default) or OpenAI Codex, with per-phase backend overrides

- **Parallel execution**: Up to 4 concurrent fix agents with live progress tracking

- **Test validation**: Runs your test suite and offers interactive retry/fix options on failure

- **Commit integration**: Optionally commits and pushes changes when complete

- **ATIF v1.6 trajectory recording**: Every run produces a machine-parseable trajectory with automatic secret redaction

- **Run archive**: Automatic archival to `~/.daydream/archive/` with SQLite index for cross-project querying

- **Post-run evaluation**: Deterministic analysis of cost, grounding rate, file coverage, and finding quality

- **Outcome labeling**: Tag archived runs as accepted/rejected/mixed for SFT/RL training datasets

## Prerequisites

- Python 3.12+

- [uv](https://docs.astral.sh/uv/) package manager

- [Claude Code](https://claude.ai/code) CLI

- [Codex CLI](https://openai.com/codex) — required when using `--backend codex`

- [GitHub CLI](https://cli.github.com/) (`gh`) — required for the `feedback` subcommand and `--comment` mode

### Install Beagle

Daydream requires the [Beagle](https://github.com/existential-birds/beagle) plugin for Claude Code.

```bash

claude plugin marketplace add https://github.com/existential-birds/beagle

claude plugin install beagle

```

Verify by running `/beagle:` in Claude Code — you should see the command list.

## Installation

```bash

git clone https://github.com/existential-birds/daydream.git

cd daydream

uv sync

```

## Updating

```bash

cd daydream

git pull

uv sync

```

## Usage

```bash

# Default: deep multi-stack review-fix-test loop

daydream /path/to/project

# Single-stack loop (skip multi-stack auto-detection)

daydream --shallow /path/to/project

# Force a specific Beagle skill instead of auto-detecting

daydream -s python /path/to/project

# Review and post inline PR comments, then exit (no fix, no test)

daydream --comment --branch feat/my-feature /path/to/project

# Review and write a report to terminal/markdown, then exit

daydream --review /path/to/project

# Use Codex backend instead of Claude

daydream --backend codex /path/to/project

# Fix bot review comments on a PR

daydream feedback 42 --bot "coderabbitai[bot]" /path/to/project

# Label an archived run for training datasets

daydream label abc12345 --accepted

```

### More Examples

```bash

# Resume from a specific phase

daydream --shallow --start-at fix /path/to/project

# Resume deep review from the per-stack stage

daydream --start-at per-stack /path/to/project

# Mix backends per phase

daydream --backend codex --fix-backend claude /path/to/project

# Override models per phase (defaults come from the per-backend table)

daydream --review-model claude-opus-4-6 --parse-model claude-haiku-4-5 /path/to/project

# Force ephemeral worktree even without --branch

daydream --worktree /path/to/project

# Generate an implementation plan and embed it in PR comments

daydream --comment --plan --branch feat/my-feature /path/to/project

# Exclude paths from the diff

daydream --ignore-path .planning --ignore-path vendor /path/to/project

# Run with evaluation analysis, skip archival

daydream --eval --no-archive /path/to/project

# Write trajectory to a custom path

daydream --trajectory /tmp/run.json /path/to/project

```

### Command Line Options

| Option | Description | Default |

|--------|-------------|---------|

| `TARGET` | Target directory | Prompt interactively |

| `-s, --skill` | Force review skill: `python`, `react`, `elixir`, `go`, `rust`, `ios` | Auto-detect from changed files |

| `-b, --backend` | Agent backend: `claude` or `codex` | `claude` |

| `--review-backend` | Override backend for the review phase | `--backend` value |

| `--fix-backend` | Override backend for the fix phase | `--backend` value |

| `--test-backend` | Override backend for the test phase | `--backend` value |

| `--review-model` | Override model for the REVIEW phase (default: per-backend table; see README). | Per-backend table (see below) |

| `--parse-model` | Override model for the PARSE phase (default: per-backend table; see README). | Per-backend table (see below) |

| `--fix-model` | Override model for the FIX phase (default: per-backend table; see README). | Per-backend table (see below) |

| `--test-model` | Override model for the TEST phase (default: per-backend table; see README). | Per-backend table (see below) |

| `--exploration-model` | Model for exploration subagents. Use a smaller model to save cost. | Per-backend table (see below) |

| `--comment` | Review and post inline PR comments, then exit | |

| `--review` | Review and write a report to terminal/markdown, then exit | |

| `--shallow` | Single-stack review (skip multi-stack auto-detection) | |

| `--branch BRANCH` | Branch to review | cwd's local HEAD |

| `--base BASE` | Base ref to compare against | PR base if any, else `origin/HEAD` |

| `--worktree` | Force ephemeral worktree even when `--branch` is omitted | |

| `--copy PATH` | Extra path to copy into ephemeral worktree (repeatable) | |

| `--plan` | Generate implementation plan and embed in PR comments (with `--comment`) | |

| `--start-at` | Start at phase: `review`, `parse`\*, `fix`, `test`\*, `ttt`†, `per-stack`†, `merge`† | `review` |

| `--loop` | Repeat review-fix-test cycle until zero issues or max iterations | |

| `--max-iterations N` | Maximum loop iterations (only meaningful with `--loop`) | `5` |

| `--trajectory ` | Write trajectory to custom path | `/.daydream/runs//trajectory.json` |

| `--no-archive` | Disable automatic archival to `~/.daydream/archive/` | |

| `--eval` | Run deterministic evaluation analysis and store in archive | |

| `--ignore-path PATH` | Exclude path from diff (repeatable) | |

| `--cleanup` | Remove `.review-output.md` after completion | |

| `--no-cleanup` | Keep `.review-output.md` after completion | |

\* `parse` and `test` are valid only with `--shallow`.

† `ttt`, `per-stack`, `merge` are valid only in deep (default) mode.

### Per-phase model defaults

Each phase resolves its model in this order: explicit per-phase flag → per-backend phase default (below) → backend default. Phases without an override flag (WONDER, ENVISION, MERGE, INTENT, PR_FEEDBACK) still resolve through the per-backend table — they get phase-appropriate models without a knob.

**Claude backend:**

| Phase | Default model |

|-------|---------------|

| `parse` | `claude-haiku-4-5` |

| `fix` | `claude-sonnet-4-6` |

| `test` | `claude-sonnet-4-6` |

| `exploration` | `claude-sonnet-4-6` |

| `review` | `claude-opus-4-6` |

| `wonder` | `claude-opus-4-6` |

| `envision` | `claude-opus-4-6` |

| `merge` | `claude-opus-4-6` |

| `intent` | `claude-opus-4-6` |

| `pr_feedback` | `claude-opus-4-6` |

**Codex backend:**

| Phase | Default model |

|-------|---------------|

| `parse` | `gpt-5.5` |

| `fix` | `gpt-5.5` |

| `test` | `gpt-5.5` |

| `exploration` | `gpt-5.5` |

| `review` | `gpt-5.5` |

| `wonder` | `gpt-5.5` |

| `envision` | `gpt-5.5` |

| `merge` | `gpt-5.5` |

| `intent` | `gpt-5.5` |

| `pr_feedback` | `gpt-5.5` |

The codex table uses `gpt-5.5` for every phase in v1; per-phase tiering for codex is deferred to a future release once concrete model picks across the codex lineup are settled.

### Subcommands

#### `daydream feedback`

Fetch bot review comments from a GitHub PR, fix them, and respond.

```bash

daydream feedback  --bot  [TARGET]

```

| Argument | Description |

|----------|-------------|

| `pr#` | Pull request number to process |

| `--bot BOT_NAME` | Bot username to filter PR comments (e.g. `coderabbitai[bot]`) |

| `TARGET` | Target directory (default: current directory) |

#### `daydream label`

Update outcome labels on an archived run for SFT/RL training datasets.

```bash

daydream label  --accepted|--rejected|--mixed

```

| Argument | Description |

|----------|-------------|

| `session_id` | Session ID (full UUID or prefix) |

| `--accepted` | Label run as accepted |

| `--rejected` | Label run as rejected |

| `--mixed` | Label run as mixed |

## How It Works

Daydream has four output modes selected by flag combinations:

- **Default loop** (no flag): full review → fix → test cycle

- **`--comment`**: review and post inline PR comments, then exit

- **`--review`**: review and write a report, then exit

- **`daydream feedback `**: ingest bot PR comments and fix them

Within the loop mode, `--shallow` opts into a single-stack review-fix-test pipeline; the default is the deep multi-stack pipeline. All modes support multiple backends (Claude and Codex), including per-phase overrides.

### Deep Review (default)

A five-stage pipeline that combines stack-agnostic intent analysis with parallel per-stack Beagle reviews and a cross-stack merge. Use `--start-at` to resume from a specific stage.

1. **Exploration pre-scan**: Tree-sitter import resolution and convention detection (auto-skipped for trivial diffs)

2. **Intent (`ttt`)**: Understands the git diff and commit history to build context

3. **Alternative review**: Identifies potential improvements as numbered issues

4. **Per-stack reviews**: Parallel Beagle skill invocations, one per detected stack (Python, TypeScript, Go, etc.)

5. **Cross-stack merge**: Synthesizes per-stack findings into a unified review with deduplication

After the merge, an optional fix gate offers test/commit/push phases.

### Shallow Review (`--shallow`)

A single-skill four-phase workflow. Use `--start-at` to resume from a specific phase.

#### Phase 1: Review

Invokes the resolved Beagle review skill (e.g., `beagle-python:review-python`). The review output is written to `.review-output.md` in the project root.

#### Phase 2: Parse Feedback

Extracts actionable issues from the review output as structured JSON. Positive observations and summary sections are filtered out.

**Note:** `--start-at parse` requires an existing `.review-output.md` file.

#### Phase 3: Apply Fixes

For each actionable issue:

1. Displays the issue description, file, and line number

2. Prompts the agent to apply the minimal fix needed

3. Shows progress as fixes are applied

**Note:** `--start-at fix` requires an existing `.review-output.md` file.

#### Phase 4: Test and Heal

Runs your project's test suite. On failure, offers interactive options:

- **Retry tests**: Run again without fixes

- **Fix and retry**: Launch the agent to analyze and fix failures

- **Ignore**: Mark as passed and continue

- **Abort**: Exit with failure status

After tests pass, optionally commit and push changes.

**Note:** `--start-at test` skips all other phases and runs tests directly.

### Comment / Review Modes

`--comment` and `--review` run a three-phase conversational review:

1. **Understand intent**: Explores the git diff and commit history to build context

2. **Evaluate alternatives**: Reviews the changes and identifies potential improvements as numbered issues

3. **Generate plan** (optional): With `--plan`, writes an implementation plan to `.daydream/plan-{timestamp}.md`

`--comment` then posts the issues as inline review comments on the open PR for the current branch (use `--branch` to point at a specific branch). `--review` writes the report to terminal/markdown and exits without posting.

### PR Feedback Subcommand

`daydream feedback  --bot ` resolves bot review comments on a PR:

1. **Fetch**: Pulls bot comments from the PR via the `fetch-pr-feedback` Beagle skill

2. **Parse**: Extracts actionable issues (reuses the shallow-mode parser)

3. **Fix**: Applies fixes sequentially, one issue at a time

4. **Commit & Push**: Automatically commits and pushes all changes

5. **Respond**: Posts fix results back on the PR as comment replies

## Output Files

| File | Description |

|------|-------------|

| `.review-output.md` | Review results (removed with `--cleanup`, required for `--start-at parse/fix`) |

| `.daydream/runs//trajectory.json` | ATIF v1.6 trajectory (customize path with `--trajectory`) |

| `.daydream/trajectories/` | Forked sub-trajectories from parallel fan-outs (fix-parallel, deep, exploration) |

| `.daydream/diff.patch` | Unified git diff captured at run start |

| `.daydream/plan-{timestamp}.md` | Implementation plan (created by `--comment --plan` and `--review`) |

| `.daydream/deep/` | Deep pipeline artifacts: intent, per-stack reviews, merged report (default mode) |

### Archive

Unless `--no-archive` is passed, each run is automatically archived to `~/.daydream/archive/runs/{session_id}/` with:

| File | Description |

|------|-------------|

| `manifest.json` | Run metadata, git context, backend config, token/cost metrics |

| `trajectory.json` | Copy of the run trajectory |

| `review-output.md` | Copy of review findings |

| `evaluation.json` | Deterministic evaluation results (only with `--eval`) |

| `deep/` | Deep artifacts copy (default mode) |

| `diff.patch` | Diff copy |

A SQLite index at `~/.daydream/archive/index.db` enables cross-project querying by repo, backend, cost, grounding rate, and outcome labels.

## Trajectory Output

Every daydream run produces an [ATIF v1.6](https://www.harborframework.com/docs/agents/trajectory-format) trajectory file at `/.daydream/runs//trajectory.json`. The trajectory captures the full agent interaction history — prompts, responses, tool calls, observations, and per-step token/cost metrics. Use `--trajectory ` to write to a custom location.

Sensitive content is automatically redacted before writing: API keys (`sk-*`, `ghp_*`, `xoxb-*`, `AKIA*`), JWT tokens, URL credentials, username segments in file paths, and `.env`-style secret values are replaced with type-specific `[REDACTED_*]` tokens. Interrupted runs (SIGINT/SIGTERM) flush a `.partial` file with `extra.partial=true` so consumers can detect incomplete trajectories.

**Consumer integration:**

- Validate trajectories with [Harbor](https://github.com/laude-institute/harbor)'s trajectory validator

- Replay in any ATIF-compatible viewer

- Use as training data for SFT/RL pipelines (trajectories are machine-parseable by design)

- Label outcomes with `daydream label` for supervised fine-tuning datasets

## License

Apache License 2.0. See [LICENSE](LICENSE) for details.
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/existential-birds/daydream

Awesome Lists containing this project

README