https://github.com/predicatesystems/predicate-runtime-typescript

A verification-first runtime for AI web agents — with Jest-style assertions and token-efficient snapshots
https://github.com/predicatesystems/predicate-runtime-typescript

agent-runtime ai-agents ai-automation assertions browser-automation deterministic

Last synced: 4 days ago
JSON representation

A verification-first runtime for AI web agents — with Jest-style assertions and token-efficient snapshots

Host: GitHub
URL: https://github.com/predicatesystems/predicate-runtime-typescript
Owner: PredicateSystems
License: other
Created: 2025-12-21T22:16:58.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-04-28T01:09:43.000Z (27 days ago)
Last Synced: 2026-04-28T03:14:08.306Z (27 days ago)
Topics: agent-runtime, ai-agents, ai-automation, assertions, browser-automation, deterministic
Language: TypeScript
Homepage: https://www.PredicateSystems.ai
Size: 15.6 MB
Stars: 2
Watchers: 0
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

Awesome Lists containing this project

README

          # Predicate TypeScript SDK

> **A verification & control layer for AI agents that operate browsers**

Predicate is built for **AI agent developers** who already use Playwright / CDP / LangGraph and care about **flakiness, cost, determinism, evals, and debugging**.

Often described as _Jest for Browser AI Agents_ - but applied to end-to-end agent runs (not unit tests).

The core loop is:

> **Agent → Snapshot → Action → Verification → Artifact**

## What Predicate is

- A **verification-first runtime** (`AgentRuntime`) for browser agents

- Treats the browser as an adapter (Playwright / CDP); **`AgentRuntime` is the product**

- A **controlled perception** layer (semantic snapshots; pruning/limits; lowers token usage by filtering noise from what models see)

- A **debugging layer** (structured traces + failure artifacts)

- Enables **local LLM small models (3B-7B)** for browser automation (privacy, compliance, and cost control)

- Keeps vision models **optional** (use as a fallback when DOM/snapshot structure falls short, e.g. ``)

## What Predicate is not

- Not a browser driver

- Not a Playwright replacement

- Not a vision-first agent framework

## Install

```bash

npm install @predicatesystems/runtime

npx playwright install chromium

```

Legacy install compatibility remains available through the shim package:

```bash

npm install @predicatesystems/sdk

```

## Naming migration (Predicate rebrand)

Use the new `Predicate*` class names for all new code:

- `PredicateBrowser`

- `PredicateAgent`

- `PredicateVisualAgent`

- `PredicateDebugger`

- `backends.PredicateContext`

## Conceptual example (why this exists)

- Steps are **gated by verifiable UI assertions**

- If progress can’t be proven, the run **fails with evidence**

- This is how you make runs **reproducible** and **debuggable**, and how you run evals reliably

## Quickstart: a verification-first loop

```ts

import { PredicateBrowser, AgentRuntime } from '@predicatesystems/runtime';

import { JsonlTraceSink, Tracer } from '@predicatesystems/runtime';

import { exists, urlContains } from '@predicatesystems/runtime';

import type { Page } from 'playwright';

async function main(): Promise {

  const tracer = new Tracer('demo', new JsonlTraceSink('trace.jsonl'));

  const browser = new PredicateBrowser();

  await browser.start();

  const page = browser.getPage();

  if (!page) throw new Error('no page');

  await page.goto('https://example.com');

  // AgentRuntime needs a snapshot provider; PredicateBrowser.snapshot() does not depend on Page,

  // so we wrap it to fit the runtime interface.

  const runtime = new AgentRuntime(

    { snapshot: async (_page: Page, options?: Record) => browser.snapshot(options) },

    page,

    tracer

  );

  runtime.beginStep('Verify homepage');

  await runtime.snapshot({ limit: 60 });

  runtime.assert(urlContains('example.com'), 'on_domain', true);

  runtime.assert(exists('role=heading'), 'has_heading');

  runtime.assertDone(exists("text~'Example'"), 'task_complete');

  await browser.close();

}

void main();

```

## PredicateDebugger: attach to your existing agent framework (sidecar mode)

If you already have an agent loop (LangGraph, custom planner/executor), keep it and attach Predicate as a **verifier + trace layer**.

Key idea: your agent still executes actions — Predicate **snapshots and verifies outcomes**.

```ts

import type { Page } from 'playwright';

import {

  PredicateDebugger,

  Tracer,

  JsonlTraceSink,

  exists,

  urlContains,

} from '@predicatesystems/runtime';

async function runExistingAgent(page: Page): Promise {

  const tracer = new Tracer('run-123', new JsonlTraceSink('trace.jsonl'));

  const dbg = PredicateDebugger.attach(page, tracer);

  await dbg.step('agent_step: navigate + verify', async () => {

    // 1) Let your framework do whatever it does

    await yourAgent.step();

    // 2) Snapshot what the agent produced

    await dbg.snapshot({ limit: 60 });

    // 3) Verify outcomes (with bounded retries)

    await dbg

      .check(urlContains('example.com'), 'on_domain', true)

      .eventually({ timeoutMs: 10_000 });

    await dbg.check(exists('role=heading'), 'has_heading').eventually({ timeoutMs: 10_000 });

  });

}

```

## SDK-driven full loop (snapshots + actions)

If you want Predicate to drive the loop end-to-end, you can use the SDK primitives directly: take a snapshot, select elements, act, then verify.

```ts

import {

  PredicateBrowser,

  snapshot,

  find,

  typeText,

  click,

  waitFor,

} from '@predicatesystems/runtime';

async function loginExample(): Promise {

  const browser = new PredicateBrowser();

  await browser.start();

  const page = browser.getPage();

  if (!page) throw new Error('no page');

  await page.goto('https://example.com/login');

  const snap = await snapshot(browser);

  const email = find(snap, "role=textbox text~'email'");

  const password = find(snap, "role=textbox text~'password'");

  const submit = find(snap, "role=button text~'sign in'");

  if (!email || !password || !submit) throw new Error('login form not found');

  await typeText(browser, email.id, 'user@example.com');

  await typeText(browser, password.id, 'password123');

  await click(browser, submit.id);

  const ok = await waitFor(browser, "role=heading text~'Dashboard'", 10_000);

  if (!ok.found) throw new Error('login failed');

  await browser.close();

}

```

## Capabilities (lifecycle guarantees)

### Controlled perception

- **Semantic snapshots** instead of raw DOM dumps

- **Pruning knobs** via `SnapshotOptions` (limit/filter)

- Snapshot diagnostics that help decide when “structure is insufficient”

### Constrained action space

- Action primitives operate on **stable IDs / rects** derived from snapshots

- Optional helpers for ordinality (“click the 3rd result”)

### Verified progress

- Predicates like `exists(...)`, `urlMatches(...)`, `isEnabled(...)`, `valueEquals(...)`

- Fluent assertion DSL via `expect(...)`

- Retrying verification via `runtime.check(...).eventually(...)`

### Scroll verification (prevent no-op scroll drift)

A common agent failure mode is “scrolling” without the UI actually advancing (overlays, nested scrollers, focus issues). Use `AgentRuntime.scrollBy(...)` to deterministically verify scroll _had effect_ via before/after `scrollTop`.

```ts

runtime.beginStep('Scroll the page and verify it moved');

const ok = await runtime.scrollBy(600, {

  verify: true,

  minDeltaPx: 50,

  label: 'scroll_effective',

  required: true,

  timeoutMs: 5_000,

});

if (!ok) {

  throw new Error('Scroll had no effect (likely blocked by overlay or nested scroller).');

}

```

### Explained failure

- JSONL trace events (`Tracer` + `JsonlTraceSink`)

- Optional failure artifact bundles (snapshots, diagnostics, step timelines, frames/clip)

- Deterministic failure semantics: when required assertions can’t be proven, the run fails with artifacts you can replay

### Framework interoperability

- Bring your own LLM and orchestration (LangGraph, custom loops)

- Register explicit LLM-callable tools with `ToolRegistry`

## ToolRegistry (LLM-callable tools)

```ts

import { ToolRegistry, registerDefaultTools } from '@predicatesystems/runtime';

const registry = new ToolRegistry();

registerDefaultTools(registry);

const toolsForLLM = registry.llmTools();

```

## Permissions (avoid Chrome permission bubbles)

Chrome permission prompts are outside the DOM and can be invisible to snapshots. Prefer setting a policy **before navigation**.

```ts

import { PredicateBrowser } from '@predicatesystems/runtime';

import type { PermissionPolicy } from '@predicatesystems/runtime';

const policy: PermissionPolicy = {

  default: 'clear',

  autoGrant: ['geolocation'],

  geolocation: { latitude: 37.77, longitude: -122.41, accuracy: 50 },

  origin: 'https://example.com',

};

// `permissionPolicy` is the last constructor argument; pass `keepAlive` right before it.

const browser = new PredicateBrowser(

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  undefined,

  false,

  policy

);

await browser.start();

```

If your backend supports it, you can also use ToolRegistry permission tools (`grant_permissions`, `clear_permissions`, `set_geolocation`) mid-run.

## Downloads (verification predicate)

```ts

import { downloadCompleted } from '@predicatesystems/runtime';

runtime.assert(downloadCompleted('report.csv'), 'download_ok', true);

```

## Debugging (fast)

- **Manual driver CLI**:

```bash

npx predicate driver --url https://example.com

```

- **Verification + artifacts + debugging with time-travel traces (Predicate Studio demo)**:

If the video tag doesn’t render in your GitHub README view, use this link: [`sentience-studio-demo.mp4`](https://github.com/user-attachments/assets/7ffde43b-1074-4d70-bb83-2eb8d0469307)

- **Predicate SDK Documentation**: https://predicatelabs.dev/docs

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/predicatesystems/predicate-runtime-typescript

Awesome Lists containing this project

README