https://github.com/tibtof/lgtm-buzzer

Browser extension that quizzes you on the diff before letting you approve a PR. Powered by your local LLM CLI.
Host: GitHub
URL: https://github.com/tibtof/lgtm-buzzer
Owner: tibtof
License: mit
Created: 2026-05-21T17:05:50.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-06-04T21:35:18.000Z (about 2 months ago)
Last Synced: 2026-06-04T22:17:54.804Z (about 2 months ago)
Topics: azure-devops, browser-extension, chrome-extension, claude, code-review, developer-tools, github, llm, typescript
Language: TypeScript
Homepage:
Size: 1.05 MB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE

# LGTM-Buzzer

**Gate your PR approvals behind a quiz on the actual diff.**

[![CI](https://github.com/tibtof/lgtm-buzzer/actions/workflows/ci.yml/badge.svg)](https://github.com/tibtof/lgtm-buzzer/actions/workflows/ci.yml)

[![Release](https://img.shields.io/github/v/release/tibtof/lgtm-buzzer)](https://github.com/tibtof/lgtm-buzzer/releases)

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)

---

## What this is

LGTM-Buzzer is a Chrome extension that intercepts the Approve button on GitHub
pull requests and gates it behind a short quiz generated from the actual diff.
If you can answer the quiz, the approval goes through. If you can't, you didn't
read the PR.

**The quiz is always generated from the raw diff bytes, never from the PR
title, description, commit messages, labels, or comments.** A teammate writing
a great PR description cannot short-circuit the gate. This is the core
invariant of the project, enforced at six layers from the VCS adapter through
the wire protocol to the LLM prompt.

**All LLM calls stay local.** The extension never contacts an LLM directly.
The native messaging host shells out to whichever CLI or API you already have
configured on your machine. No credentials live in the extension, no diff bytes
leave your machine through a third-party proxy, no telemetry of any kind.

Four LLM adapters are available: Claude Code CLI, Codex CLI, `gh copilot`, and
the Anthropic API (host-held key). Two VCS adapters are available: GitHub
(fully functional) and Azure DevOps (UI interception works; the multi-call diff
adapter is deferred to the next milestone, see the status table below).

---

## Status: v0.1.0 — M3 release

| Area | Status | Notes |
|---|---|---|
| Chrome MV3 extension | Working | Approve-button interception on `github.com` PR pages |
| Quiz modal | Working | Questions, answers, pass/fail, error states, retry, WCAG AA |
| `claude-cli` adapter | Working | Shells out to the `claude` binary |
| `codex-cli` adapter | Working | Shells out to the `codex` binary |
| `copilot-cli` adapter | Working | Shells out to `gh copilot` |
| `claude-api` adapter | Working | Anthropic REST API with prompt caching |
| GitHub VCS adapter | Working | PAT-authenticated diff fetch from the GitHub API |
| ADO VCS adapter | Stubbed | Approve button intercepted on `dev.azure.com`; the diff-fetching adapter (multi-call ADO API) is deferred to v0.2 |
| Options page | Working | Runtime LLM + VCS adapter selection, credential storage |
| Native messaging host | Working | macOS and Linux; Windows deferred |
| GitHub Actions CI | Working | `npm run check` on every push and PR |
| Release packaging | Working | Extension zip + host tarball with checksums |
| Playwright e2e | Working | Happy-path quiz gate in CI (xvfb-run on Linux) |
| promptfoo evals | Working | Quiz-quality eval suite across all four adapters |
| Safari port | Deferred | Post-v1.0 via Xcode MV3 converter |
| Firefox MV3 | Deferred | Future milestone |
| OS keychain integration | Deferred | Credentials currently stored as plaintext |

---

## Screenshots

Screenshots TBD: no screenshots have been captured yet for v0.1.0. A short
screen recording of the quiz gate in action will be added before the v0.1.0
tag is published.

---

## Quick start

### Prerequisites

| Requirement | Minimum | Notes |
|---|---|---|
| Node.js | 22 LTS | `node --version` to check |
| Chrome | any recent stable | Developer mode required |
| At least one LLM | n/a | See table below |
| GitHub PAT | n/a | `Contents: read` scope (or `repo` scope for classic tokens) |

**LLM prerequisites (pick at least one):**

| Adapter | What you need |
|---|---|
| `claude-cli` | `claude` CLI installed and authenticated |
| `codex-cli` | `codex` CLI installed and authenticated |
| `copilot-cli` | `gh` CLI with `gh copilot` extension, authenticated via `gh auth login` |
| `claude-api` | `ANTHROPIC_API_KEY` environment variable set |

### Install steps

```bash
# 1. Clone and install
git clone https://github.com/tibtof/lgtm-buzzer.git
cd lgtm-buzzer
npm install

# 2. Build everything
npm run build

# 3. Install the native-messaging manifest (macOS / Linux)
node packages/host/dist/install-manifest.js
# Re-run with your extension ID after Step 5:
# LGTM_BUZZER_EXTENSION_ID=<your-extension-id> node packages/host/dist/install-manifest.js

# 4. Load the extension in Chrome
#    chrome://extensions → Developer mode → Load unpacked
#    → packages/extension/.output/chrome-mv3/

# 5. Open the extension options page, pick your LLM + VCS adapter,
#    and enter credentials (GitHub PAT, or Anthropic API key if using claude-api).
```

See **[docs/getting-started.md](docs/getting-started.md)** for the full
step-by-step walkthrough, including the extension ID lookup, troubleshooting,
and a detailed description of each step.

### Downloading a pre-built release

Pre-built artifacts are on the
[GitHub Releases page](https://github.com/tibtof/lgtm-buzzer/releases):

- `lgtm-buzzer-extension-v<version>.zip`: Chrome MV3 extension (load unpacked or submit to the Web Store).
- `lgtm-buzzer-host-v<version>.tar.gz`: Native messaging host with installer; no `npm install` needed.

See **[docs/release.md](docs/release.md)** for the maintainer release guide.

---

## How to use

Once installed and configured:

1. Navigate to a GitHub pull request.
2. Click the **Approve** button (or go through **Review changes → Approve → Submit review**).
3. A quiz modal appears in place of the usual confirmation.
4. Read the questions. They are generated from the PR diff, not the description.
5. Type your answers and click **Submit**.
6. **Pass**: your approval is submitted. **Fail**: close the modal and re-read the diff.

The quiz is generated fresh for each approval attempt. There is no "skip" path.

---

## Configuration

Open the extension options page by clicking the LGTM-Buzzer icon in Chrome's
toolbar and selecting **Options** (or navigating to
`chrome-extension://<extension-id>/options.html`).

On the options page you can:

- Select your preferred LLM adapter (Claude CLI, Codex CLI, Copilot CLI, or Claude API).
- Select your VCS adapter (GitHub or ADO; ADO diff fetch is stubbed in v0.1.0).
- Enter adapter credentials (GitHub PAT, Anthropic API key).

Settings are saved immediately to `chrome.storage.local`. The storage schema
is validated with Zod on every read; corrupt storage falls back to defaults
with a visible warning.
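
The validate-on-read behaviour can be sketched as follows. This is a minimal illustration, not the project's code: the `Settings` shape, the `DEFAULTS` values, and the `loadSettings` name are assumptions, and a hand-written type guard stands in for the Zod schema so the snippet has no dependencies.

```typescript
// Hypothetical settings shape; field names are illustrative only.
type LlmAdapter = "claude-cli" | "codex-cli" | "copilot-cli" | "claude-api";
type VcsAdapter = "github" | "ado";

interface Settings {
  llm: LlmAdapter;
  vcs: VcsAdapter;
  githubPat: string;
}

// Assumed defaults; the real defaults are not stated in this README.
const DEFAULTS: Settings = { llm: "claude-cli", vcs: "github", githubPat: "" };

const LLMS: readonly string[] = ["claude-cli", "codex-cli", "copilot-cli", "claude-api"];
const VCSS: readonly string[] = ["github", "ado"];

// Hand-written guard standing in for a Zod schema's safeParse.
function isSettings(raw: unknown): raw is Settings {
  if (typeof raw !== "object" || raw === null) return false;
  const { llm, vcs, githubPat } = raw as Record<string, unknown>;
  return (
    typeof llm === "string" && LLMS.includes(llm) &&
    typeof vcs === "string" && VCSS.includes(vcs) &&
    typeof githubPat === "string"
  );
}

interface LoadResult {
  settings: Settings;
  warning: string | null; // non-null when stored data was rejected
}

// Validate whatever came back from storage; never trust it blindly.
function loadSettings(raw: unknown): LoadResult {
  if (raw === undefined) return { settings: DEFAULTS, warning: null }; // nothing stored yet
  if (isSettings(raw)) return { settings: raw, warning: null };
  return { settings: DEFAULTS, warning: "Stored settings were invalid; defaults restored." };
}
```

The caller would surface `warning` in the options page when it is non-null, so a corrupt entry is visible instead of silently replaced.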

---

## Security

### Credential storage

Credentials (GitHub PAT, Anthropic API key) are stored in **`chrome.storage.local` as plaintext**. This is a v1 limitation. We do not yet integrate with OS keychains (macOS Keychain, Linux SecretService). A future ADR will track this upgrade — the `StorageArea` port in the extension is the designed injection point.

### Diff-only invariant

PRs are quizzed on the diff bytes only, never on the PR title, description,
commit messages, labels, or comments. This is enforced at six layers:

1. The VCS port (`VCSProvider`) accepts only a PR identifier and returns a raw diff string. No other PR metadata is in the type.

2. The GitHub adapter fetches only the `application/vnd.github.diff` media type from the GitHub API.

3. The wire-format `quiz-request` message schema (ADR-7, ADR-11) carries only `prId` and `diff` — no title or description fields exist in the schema.

4. The `QuizSession` aggregate (ADR-14) receives only the diff from the wire message; it never sees PR metadata.

5. Each LLM adapter prompt template is diff-in / structured-JSON-out with no slot for PR metadata.

6. The promptfoo eval suite includes a negative-control fixture (`docs-readme-update`) that asserts adapters return an error rather than a quiz when fed a docs-only change with no code symbols.

Any change that adds non-diff PR text to any of these layers is treated as a security boundary violation and requires a new ADR.
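
Layers 1 and 2 can be illustrated with a short sketch. It assumes a `PrId` shape, a Promise-based `getDiff` signature, and an injected HTTP function, none of which are confirmed by this README (the project's own port may use a different effect type). The endpoint and the `application/vnd.github.diff` media type are GitHub's documented way to request a pull request as a diff.

```typescript
// Hypothetical identifier: only what is needed to locate the PR.
interface PrId {
  owner: string;
  repo: string;
  number: number;
}

// The port returns a raw diff string and nothing else:
// there is no field where a title, body, or label could travel.
interface VCSProvider {
  getDiff(pr: PrId): Promise<string>;
}

interface HttpRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the request for the diff representation of a pull request.
function buildDiffRequest(pr: PrId, token: string): HttpRequest {
  const owner = encodeURIComponent(pr.owner);
  const repo = encodeURIComponent(pr.repo);
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/pulls/${pr.number}`,
    headers: {
      Accept: "application/vnd.github.diff", // ask for diff bytes, not JSON metadata
      Authorization: `Bearer ${token}`,
    },
  };
}

// The HTTP client is injected so the adapter can be tested without a network.
function githubProvider(
  token: string,
  http: (req: HttpRequest) => Promise<string>,
): VCSProvider {
  return { getDiff: (pr) => http(buildDiffRequest(pr, token)) };
}
```

Because the return type is a plain string, adding PR metadata would require changing the port itself, which is the kind of change the ADR rule above is meant to catch.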

### LLM calls

- **CLI adapters** (`claude-cli`, `codex-cli`, `copilot-cli`): the host spawns a subprocess. The diff bytes go in on stdin. No network egress from the extension or the host beyond the subprocess's own network activity.

- **API adapter** (`claude-api`): the host calls the Anthropic REST API directly with the API key you configured. No third-party proxy, no telemetry.

- The extension itself makes no LLM calls and holds no LLM credentials.
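
For the API adapter, a request to the Anthropic Messages API with prompt caching might be assembled as in the sketch below. The instruction text, the `max_tokens` value, and the choice to cache the system prompt are assumptions for illustration; the project's real prompt template is not shown in this README. The `cache_control` field and the `x-api-key` / `anthropic-version` headers are part of the public API.

```typescript
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface MessagesRequest {
  model: string;
  max_tokens: number;
  system: TextBlock[];
  messages: { role: "user"; content: string }[];
}

// Placeholder instructions, not the project's prompt.
const QUIZ_INSTRUCTIONS =
  "Write quiz questions about the following diff. Reply with JSON only.";

// Diff in, structured request out: there is no parameter for PR metadata.
function buildQuizRequest(diff: string, model: string): MessagesRequest {
  return {
    model, // caller supplies the model name
    max_tokens: 1024, // illustrative limit
    // The static instructions are marked cacheable; the diff changes per request.
    system: [
      { type: "text", text: QUIZ_INSTRUCTIONS, cache_control: { type: "ephemeral" } },
    ],
    messages: [{ role: "user", content: diff }],
  };
}

// Headers for POST https://api.anthropic.com/v1/messages
function buildHeaders(apiKey: string): Record<string, string> {
  return {
    "x-api-key": apiKey,
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
  };
}
```

Caching generally only takes effect once the cached prefix exceeds a model-specific minimum length, so a prompt as short as this placeholder would likely not be cached in practice.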

---

## Architecture

LGTM-Buzzer uses hexagonal architecture enforced by npm workspace boundaries.

### System diagram

The extension lives in the browser; everything that touches an LLM or a VCS
API lives in a Node host process on the user's machine. The two sides only
ever talk over Chrome's native-messaging stdio bridge, and every frame is
validated by a Zod schema from `packages/protocol`.

```mermaid
flowchart TB
    subgraph Browser["Browser (Chrome MV3): packages/extension"]
        direction TB
        CS["Content Script<br/>intercepts Approve click<br/>mounts Quiz Modal"]
        SW["Service Worker<br/>owns native-messaging port<br/>zod-validates host replies"]
        CS <--> SW
    end

    subgraph Machine["User's machine: Node host process"]
        direction TB
        Host["Native Host (packages/host)<br/>stdin read-loop · zod-validate · dispatch"]

        subgraph Core["Core domain (packages/core): pure, zero I/O"]
            direction TB
            Domain["QuizSession · ReviewGate"]
            Ports["Ports<br/>LLMProvider · VCSProvider · QuizPolicy"]
            Domain --- Ports
        end

        subgraph Adapters["Adapters (packages/adapters)"]
            direction LR
            subgraph VCSGroup["VCS"]
                direction TB
                GH["github"]
                ADO["ado"]
            end
            subgraph LLMGroup["LLM"]
                direction TB
                CLAUDE["claude-cli"]
                CODEX["codex-cli"]
                COPILOT["copilot-cli"]
                API["claude-api"]
            end
        end

        Host -->|"drives"| Core
        Ports -.->|"implemented by"| Adapters
    end

    subgraph External["External services and processes"]
        direction TB
        GHAPI["GitHub REST API"]
        ADOAPI["Azure DevOps API"]
        LocalLLMs["Local LLM CLIs<br/>claude · codex · gh copilot"]
        Anthropic["Anthropic API"]
    end

    SW <==>|"Native Messaging stdio<br/>uint32 length-prefixed JSON · schemas in packages/protocol"| Host
    GH --> GHAPI
    ADO --> ADOAPI
    CLAUDE --> LocalLLMs
    CODEX --> LocalLLMs
    COPILOT --> LocalLLMs
    API --> Anthropic
```

**In one breath:** the content script and service worker are the only pieces
that run in the browser. The service worker speaks a length-prefixed JSON
protocol over stdio to the native host. Inside the host, the core domain is
pure and exposes ports (`LLMProvider`, `VCSProvider`, `QuizPolicy`); adapters
implement those ports and are the only place that touches the network, a
subprocess, or any other external service. The extension cannot reach an
LLM directly: there is no arrow from the browser box to any external
service. Adapter availability is tracked in the
[LLM + VCS adapter matrix](#llm--vcs-adapter-matrix) below; this diagram
shows architectural structure, not per-adapter shipping status.
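
The length-prefixed framing on the stdio bridge can be sketched as below. Chrome's native messaging protocol prefixes each UTF-8 JSON message with a 32-bit length in native byte order; this sketch assumes little-endian, which holds on common x86 and ARM machines. The `encodeFrame` and `decodeFrame` names are illustrative, not the host's actual functions.

```typescript
// Encode one message: 4-byte length prefix followed by UTF-8 JSON.
function encodeFrame(message: unknown): Uint8Array {
  const payload = new TextEncoder().encode(JSON.stringify(message));
  const frame = new Uint8Array(4 + payload.length);
  // Assumption: little-endian stands in for "native byte order".
  new DataView(frame.buffer).setUint32(0, payload.length, true);
  frame.set(payload, 4);
  return frame;
}

// Decode the first complete frame in a buffer.
// Returns null when more bytes are needed, so a read loop can keep accumulating.
function decodeFrame(
  buffer: Uint8Array,
): { message: unknown; rest: Uint8Array } | null {
  if (buffer.length < 4) return null; // length prefix incomplete
  const view = new DataView(buffer.buffer, buffer.byteOffset, buffer.byteLength);
  const length = view.getUint32(0, true);
  if (buffer.length < 4 + length) return null; // payload incomplete
  const json = new TextDecoder().decode(buffer.subarray(4, 4 + length));
  return { message: JSON.parse(json), rest: buffer.subarray(4 + length) };
}
```

In a real host, the decoded `message` would still be `unknown` at this point and would pass through schema validation before being dispatched.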

### Request flow — one approval attempt

This sequence walks one Approve click from the moment the content script
intercepts it through quiz generation, scoring, and the eventual
pass-or-keep-the-modal decision. Wire-frame names (`quiz-request`,
`quiz-response`, `quiz-submit`, `quiz-result`) match the schemas in
`packages/protocol` per ADR-13.

```mermaid
sequenceDiagram
    autonumber
    actor User
    participant CS as Content Script
    participant SW as Service Worker
    participant Host as Native Host
    participant Core as Core (QuizSession / ReviewGate)
    participant VCS as VCS Adapter
    participant LLM as LLM Adapter
    participant Ext as External (Git provider / LLM)

    User->>CS: clicks Approve
    CS->>SW: quiz-request { pr, questionCount }
    SW->>Host: stdio frame
    Note over Host: zod-validates every<br/>incoming frame
    Host->>Core: QuizSession.start
    Core->>VCS: getDiff(pr)
    VCS->>Ext: HTTPS diff fetch (provider-specific)
    Ext-->>VCS: raw diff bytes
    VCS-->>Core: diff
    Note over VCS,LLM: Only the raw diff crosses this boundary.<br/>No PR title, description, or comments.
    Core->>LLM: generateQuiz(diff)
    Note right of LLM: CLI adapters: spawnIO,<br/>stdin = diff.<br/>API adapter: HTTPS.
    LLM->>Ext: spawn CLI or HTTPS
    Ext-->>LLM: questions JSON
    LLM-->>Core: Quiz
    Core-->>Host: Quiz
    Host-->>SW: quiz-response { quiz }
    SW-->>CS: Quiz
    CS->>User: render Quiz Modal
    User->>CS: submit answers
    CS->>SW: quiz-submit { quizId, answers }
    SW->>Host: stdio frame
    Host->>Core: ReviewGate.grade(answers)
    Core-->>Host: pass / fail
    Host-->>SW: quiz-result { passed, perQuestion? }
    SW-->>CS: result
    alt pass
        CS->>CS: re-fire original Approve click
    else fail
        CS->>User: modal stays, Approve remains gated
    end
```

**In one breath:** Approve clicks never go straight through. The extension
asks the host for a quiz over a `quiz-request` frame; the host's core asks
the VCS adapter for the diff, hands that diff (and only the diff) to the
LLM adapter, and returns the questions as `quiz-response`. The user
answers in the modal; the extension forwards a `quiz-submit` frame; the
host's `ReviewGate` scores it and replies with `quiz-result`. On pass, the
content script re-fires the original Approve click; on fail, the modal
stays put and the gate holds.
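
Seen from the content script, the flow above behaves like a small state machine. The sketch below is one way to model it, under stated assumptions: the frame names come from this README, but the `GateState` and `GateEvent` types, the `step` function, and the payload fields (`quizId`, `questions`) are hypothetical and simplified.

```typescript
// Frames the host sends back; payloads are simplified for illustration.
type HostFrame =
  | { type: "quiz-response"; quizId: string; questions: string[] }
  | { type: "quiz-result"; passed: boolean };

type GateState =
  | { kind: "idle" }
  | { kind: "awaiting-quiz" } // quiz-request sent, waiting for quiz-response
  | { kind: "answering"; quizId: string; questions: string[] }
  | { kind: "grading"; quizId: string } // quiz-submit sent, waiting for quiz-result
  | { kind: "passed" } // content script re-fires the original Approve click
  | { kind: "failed" }; // modal stays, Approve remains gated

type GateEvent =
  | { kind: "approve-clicked" }
  | { kind: "answers-submitted" }
  | { kind: "host-frame"; frame: HostFrame };

// Pure transition function: events that do not fit the current state are ignored.
function step(state: GateState, event: GateEvent): GateState {
  switch (event.kind) {
    case "approve-clicked":
      // A fresh quiz is requested for every attempt, including after a failure.
      return state.kind === "idle" || state.kind === "failed"
        ? { kind: "awaiting-quiz" }
        : state;
    case "answers-submitted":
      return state.kind === "answering"
        ? { kind: "grading", quizId: state.quizId }
        : state;
    case "host-frame": {
      const frame = event.frame;
      if (state.kind === "awaiting-quiz" && frame.type === "quiz-response") {
        return { kind: "answering", quizId: frame.quizId, questions: frame.questions };
      }
      if (state.kind === "grading" && frame.type === "quiz-result") {
        return frame.passed ? { kind: "passed" } : { kind: "failed" };
      }
      return state;
    }
  }
}
```

The only route to `passed` runs through `answering` and `grading`, which mirrors the statement that there is no skip path: a `quiz-result` frame arriving in any other state changes nothing.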

### Workspaces

```
packages/
  protocol/    Wire-format schemas (zod) and domain DTOs. Zero runtime deps except zod.
  core/        Pure domain logic: ports, QuizSession, ReviewGate. No Node, no DOM, no I/O.
  adapters/    Concrete port implementations (claude-cli, codex-cli, copilot-cli,
               claude-api, github, ado). One subfolder per adapter.
  host/        Native messaging host. Node-only wiring of adapters into core.
  extension/   Chrome MV3 service worker, content scripts, options page, quiz modal.
  evals/       promptfoo eval suite for quiz quality.
```

Dependency direction is strict and enforced by ESLint:

```
protocol  <- core  <- adapters  <- host
protocol  <- core  <- extension
```
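
The rule behind those arrows can be expressed as a small check. This is a sketch of the rule itself, not the project's ESLint configuration; it assumes that a workspace may import anything reachable by following the arrows (for example, `host` may import `protocol`), which the README does not state explicitly. `DIRECT` and `mayImport` are hypothetical names.

```typescript
type Workspace = "protocol" | "core" | "adapters" | "host" | "extension";

// Direct dependencies read off the arrows; transitive reach is derived below.
const DIRECT: Record<Workspace, Workspace[]> = {
  protocol: [],
  core: ["protocol"],
  adapters: ["core"],
  host: ["adapters"],
  extension: ["core"],
};

// True when `target` is reachable from `from` by following dependency arrows.
function mayImport(from: Workspace, target: Workspace): boolean {
  const seen = new Set<Workspace>();
  const stack: Workspace[] = [...DIRECT[from]];
  while (stack.length > 0) {
    const next = stack.pop() as Workspace;
    if (next === target) return true;
    if (!seen.has(next)) {
      seen.add(next);
      stack.push(...DIRECT[next]);
    }
  }
  return false;
}
```

Under this model `mayImport("extension", "adapters")` is false, which matches the architecture: the browser side never links adapter code that touches an LLM or a subprocess.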

See [`CLAUDE.md`](./CLAUDE.md) for the full project constitution (architecture
principles, dependency rules, FP idioms, code style) and
[`decisions.md`](./decisions.md) for the full architecture decision log (28 ADRs
covering every significant design choice from the FP foundation through the
release pipeline).

---

## LLM + VCS adapter matrix

| Adapter | Type | Status | Credentials | Notes |
|---|---|---|---|---|
| `claude-cli` | LLM | Working | none (CLI login) | requires `claude` binary on PATH |
| `codex-cli` | LLM | Working | none (CLI login) | requires `codex` binary on PATH |
| `copilot-cli` | LLM | Working | none (`gh auth login`) | requires `gh` + `gh-copilot` extension |
| `claude-api` | LLM | Working | `ANTHROPIC_API_KEY` | prompt caching enabled |
| `github` | VCS | Working | GitHub PAT (`Contents: read`) | fetches raw diff via GitHub API |
| `ado` | VCS | Stubbed | ADO PAT (when impl lands) | UI interception works; multi-call diff adapter deferred to v0.2 |

---

## Development

### Setup

```bash
git clone https://github.com/tibtof/lgtm-buzzer.git
cd lgtm-buzzer
npm install
npm run build
```

### Common commands

```bash
npm run build            # tsc -b for all lib packages + wxt build for the extension
npm run build:libs       # tsc -b only (skip the extension)
npm test                 # vitest run across all packages
npm run lint             # eslint with flat config (enforces dependency direction)
npm run format           # prettier --write
npm run typecheck:tests  # type-check all *.test.ts files (excluded from tsc -b)
npm run check            # full CI gate: build + test + lint + typecheck:tests
```

### Running e2e tests

The Playwright suite requires a display. On Linux without a desktop:

```bash
xvfb-run --auto-servernum npm test
```

On macOS, run `npm test` directly.

### Running evals

Evals make real LLM calls and are excluded from `npm run check`. They require
the adapter tools and credentials described in the prerequisites section.

```bash
npm run evals          # full suite: all adapters x all fixtures
npm run evals:quick    # fast fixtures only (ts-add-validator, dep-bump-only)
```

See [`packages/evals/README.md`](packages/evals/README.md) for the full eval
guide including the negative-control fixture and how to update the baseline.

### Workspace READMEs

- [`packages/extension/README.md`](packages/extension/README.md) — options page, quiz modal, monadyssey usage, WCAG commitments.

- [`packages/evals/README.md`](packages/evals/README.md) — promptfoo eval suite, fixtures, how to run and interpret results.

---

## Contributing

The canonical contribution flow uses the four-agent pipeline (PM → Architect →
Dev → Reviewer) described in [`CLAUDE.md`](./CLAUDE.md). See
[`CONTRIBUTING.md`](./CONTRIBUTING.md) for a short orientation.

For bug reports and feature requests, open a GitHub issue. The PM agent will
triage and file the structured spec; the architect will write an ADR; the dev
agent will implement; the reviewer agent will gate the PR before human review.

---

## Roadmap

Items planned after v0.1.0:

- **ADO multi-call diff adapter** — complete Azure DevOps support (the Approve button interception is already in; the diff adapter needs the multi-call ADO API).

- **Quiz cancel wire frame** (#96) — `quiz-cancel-request` message so the host can abort the in-flight LLM fiber and stop billing tokens when the user closes the modal.

- **OS keychain integration** — macOS Keychain and Linux SecretService for encrypted credential storage.

- **Firefox MV3 port** — Firefox MV3 compatibility (the codebase is designed for it; no architectural changes needed).

- **Dark mode** — extension UI currently follows the host page's color scheme; a first-class dark-mode pass is planned.

- **i18n** — all user-facing strings are currently English-only.

- **Chrome Web Store listing** — public listing once the extension reaches a stable UX.

- **Safari port** — wrap the MV3 extension via the Xcode converter (locked decision: post-v1.0).

---

## License

MIT — see [`LICENSE`](./LICENSE).

---

## Acknowledgments

- [monadyssey](https://github.com/lean-mind/monadyssey): the FP foundation (`IO`, `Either`, `Option`, `Schedule`) used across every non-extension workspace.

- [WXT](https://wxt.dev): the extension framework that handles the MV3 build, HMR, and cross-browser plumbing.

- [promptfoo](https://promptfoo.dev): the eval framework used to measure quiz quality across LLM adapters.

- [httptape](https://github.com/tibtof/httptape): HTTP fixture recording and replay used in adapter contract tests.

- [Anthropic Claude](https://anthropic.com): the LLM behind the Claude CLI and API adapters, and the agent pipeline that built this project.