https://github.com/predicatesystems/predicate-runtime-typescript
A verification-first runtime for AI web agents — with Jest-style assertions and token-efficient snapshots
- Host: GitHub
- URL: https://github.com/predicatesystems/predicate-runtime-typescript
- Owner: PredicateSystems
- License: other
- Created: 2025-12-21T22:16:58.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-04-28T01:09:43.000Z (27 days ago)
- Last Synced: 2026-04-28T03:14:08.306Z (27 days ago)
- Topics: agent-runtime, ai-agents, ai-automation, assertions, browser-automation, deterministic
- Language: TypeScript
- Homepage: https://www.PredicateSystems.ai
- Size: 15.6 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# Predicate TypeScript SDK
> **A verification & control layer for AI agents that operate browsers**
Predicate is built for **AI agent developers** who already use Playwright / CDP / LangGraph and care about **flakiness, cost, determinism, evals, and debugging**.
Often described as _Jest for Browser AI Agents_, but applied to end-to-end agent runs rather than unit tests.
The core loop is:
> **Agent → Snapshot → Action → Verification → Artifact**
## What Predicate is
- A **verification-first runtime** (`AgentRuntime`) for browser agents
- Treats the browser as an adapter (Playwright / CDP); **`AgentRuntime` is the product**
- A **controlled perception** layer (semantic snapshots; pruning/limits; lowers token usage by filtering noise from what models see)
- A **debugging layer** (structured traces + failure artifacts)
- Enables **small local LLMs (3B-7B)** for browser automation (privacy, compliance, and cost control)
- Keeps vision models **optional** (use as a fallback when DOM/snapshot structure falls short)
## What Predicate is not
- Not a browser driver
- Not a Playwright replacement
- Not a vision-first agent framework
## Install
```bash
npm install @predicatesystems/runtime
npx playwright install chromium
```
Legacy install compatibility remains available through the shim package:
```bash
npm install @predicatesystems/sdk
```
## Naming migration (Predicate rebrand)
Use the new `Predicate*` class names for all new code:
- `PredicateBrowser`
- `PredicateAgent`
- `PredicateVisualAgent`
- `PredicateDebugger`
- `backends.PredicateContext`
## Conceptual example (why this exists)
- Steps are **gated by verifiable UI assertions**
- If progress can’t be proven, the run **fails with evidence**
- This is how you make runs **reproducible** and **debuggable**, and how you run evals reliably
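The idea can be sketched without the SDK. The block below is a minimal, hypothetical model: `UiState`, `Step`, `Evidence`, and `runGated` are illustrative names and not part of Predicate's API. Each step performs an action, a predicate must then hold against the observed state, and the run stops at the first step that cannot be proven, returning everything it observed.

```ts
// Hypothetical model for illustration; none of these names come from the Predicate SDK.
type UiState = { url: string; headings: string[] };

interface Step {
  label: string;
  act: (state: UiState) => UiState; // the agent's action, modelled as a state change
  verify: (state: UiState) => boolean; // the assertion that gates progress
}

interface Evidence {
  label: string;
  passed: boolean;
  observed: UiState;
}

interface RunResult {
  ok: boolean;
  failedAt?: string; // label of the first step that could not be verified
  evidence: Evidence[];
}

function runGated(initial: UiState, steps: Step[]): RunResult {
  const evidence: Evidence[] = [];
  let state = initial;
  for (const step of steps) {
    state = step.act(state);
    const passed = step.verify(state);
    evidence.push({ label: step.label, passed, observed: state });
    // Stop at the first unproven step so later actions never run on a bad state.
    if (!passed) return { ok: false, failedAt: step.label, evidence };
  }
  return { ok: true, evidence };
}
```

Because the result carries the observed state for every executed step, a failed run can be inspected after the fact instead of being re-run to find out what happened.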
## Quickstart: a verification-first loop
```ts
import { PredicateBrowser, AgentRuntime } from '@predicatesystems/runtime';
import { JsonlTraceSink, Tracer } from '@predicatesystems/runtime';
import { exists, urlContains } from '@predicatesystems/runtime';
import type { Page } from 'playwright';

async function main(): Promise<void> {
  const tracer = new Tracer('demo', new JsonlTraceSink('trace.jsonl'));

  const browser = new PredicateBrowser();
  await browser.start();
  const page = browser.getPage();
  if (!page) throw new Error('no page');

  await page.goto('https://example.com');

  // AgentRuntime needs a snapshot provider; PredicateBrowser.snapshot() does not depend on Page,
  // so we wrap it to fit the runtime interface.
  const runtime = new AgentRuntime(
    { snapshot: async (_page: Page, options?: Record<string, any>) => browser.snapshot(options) },
    page,
    tracer
  );

  runtime.beginStep('Verify homepage');
  await runtime.snapshot({ limit: 60 });

  runtime.assert(urlContains('example.com'), 'on_domain', true);
  runtime.assert(exists('role=heading'), 'has_heading');
  runtime.assertDone(exists("text~'Example'"), 'task_complete');

  await browser.close();
}

void main();
```
## PredicateDebugger: attach to your existing agent framework (sidecar mode)
If you already have an agent loop (LangGraph, custom planner/executor), keep it and attach Predicate as a **verifier + trace layer**.
Key idea: your agent still executes actions — Predicate **snapshots and verifies outcomes**.
```ts
import type { Page } from 'playwright';
import {
  PredicateDebugger,
  Tracer,
  JsonlTraceSink,
  exists,
  urlContains,
} from '@predicatesystems/runtime';

// `yourAgent` stands in for your existing agent loop; only a `step()` method is assumed here.
async function runExistingAgent(
  page: Page,
  yourAgent: { step(): Promise<void> }
): Promise<void> {
  const tracer = new Tracer('run-123', new JsonlTraceSink('trace.jsonl'));
  const dbg = PredicateDebugger.attach(page, tracer);

  await dbg.step('agent_step: navigate + verify', async () => {
    // 1) Let your framework do whatever it does
    await yourAgent.step();

    // 2) Snapshot what the agent produced
    await dbg.snapshot({ limit: 60 });

    // 3) Verify outcomes (with bounded retries)
    await dbg
      .check(urlContains('example.com'), 'on_domain', true)
      .eventually({ timeoutMs: 10_000 });
    await dbg.check(exists('role=heading'), 'has_heading').eventually({ timeoutMs: 10_000 });
  });
}
```
## SDK-driven full loop (snapshots + actions)
If you want Predicate to drive the loop end-to-end, you can use the SDK primitives directly: take a snapshot, select elements, act, then verify.
```ts
import {
  PredicateBrowser,
  snapshot,
  find,
  typeText,
  click,
  waitFor,
} from '@predicatesystems/runtime';

async function loginExample(): Promise<void> {
  const browser = new PredicateBrowser();
  await browser.start();
  const page = browser.getPage();
  if (!page) throw new Error('no page');

  await page.goto('https://example.com/login');

  // Take one snapshot and resolve every form element against it.
  const snap = await snapshot(browser);
  const email = find(snap, "role=textbox text~'email'");
  const password = find(snap, "role=textbox text~'password'");
  const submit = find(snap, "role=button text~'sign in'");
  if (!email || !password || !submit) throw new Error('login form not found');

  await typeText(browser, email.id, 'user@example.com');
  await typeText(browser, password.id, 'password123');
  await click(browser, submit.id);

  // Verify the outcome instead of assuming the click worked.
  const ok = await waitFor(browser, "role=heading text~'Dashboard'", 10_000);
  if (!ok.found) throw new Error('login failed');

  await browser.close();
}
```
## Capabilities (lifecycle guarantees)
### Controlled perception
- **Semantic snapshots** instead of raw DOM dumps
- **Pruning knobs** via `SnapshotOptions` (limit/filter)
- Snapshot diagnostics that help decide when “structure is insufficient”
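To make the effect of limit/filter pruning concrete, here is a minimal sketch over an in-memory element list. The `SnapElement` shape, the `importance` score, and `pruneSnapshot` are assumptions made for illustration; the real snapshot schema and `SnapshotOptions` fields may differ.

```ts
// Hypothetical element shape; the SDK's real snapshot schema may differ.
interface SnapElement {
  id: number;
  role: string;
  text: string;
  importance: number; // assumed score: higher means more relevant
}

interface PruneOptions {
  limit: number; // maximum number of elements to keep
  roles?: string[]; // when provided, keep only elements with these roles
}

function pruneSnapshot(elements: SnapElement[], opts: PruneOptions): SnapElement[] {
  const roles = opts.roles;
  return elements
    // filter() returns a new array, so the sort below never mutates the input
    .filter((el) => roles === undefined || roles.indexOf(el.role) !== -1)
    // most important first; ties fall back to id so the output is deterministic
    .sort((a, b) => b.importance - a.importance || a.id - b.id)
    .slice(0, Math.max(0, opts.limit));
}
```

Only the pruned list would be serialized into the model prompt, which is where the token savings come from.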
### Constrained action space
- Action primitives operate on **stable IDs / rects** derived from snapshots
- Optional helpers for ordinality (“click the 3rd result”)
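An ordinal helper can be derived purely from snapshot geometry. The sketch below assumes each element carries a bounding `rect` and that "3rd" means third in reading order (top to bottom, then left to right); `Located`, `readingOrder`, and `nthByRole` are hypothetical names, not SDK functions.

```ts
// Hypothetical shapes for illustration; not the SDK's snapshot types.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Located {
  id: number;
  role: string;
  rect: Rect;
}

// Order elements the way a reader scans the page: top to bottom, then left to right.
function readingOrder(a: Located, b: Located): number {
  return a.rect.y - b.rect.y || a.rect.x - b.rect.x;
}

// Returns the n-th (1-based) element with the given role, or undefined when out of range.
function nthByRole(elements: Located[], role: string, n: number): Located | undefined {
  if (!Number.isInteger(n) || n < 1) return undefined;
  const matches = elements.filter((el) => el.role === role).sort(readingOrder);
  return matches[n - 1];
}
```

The returned element's `id` is what an action primitive would then target, so the model never has to guess coordinates.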
### Verified progress
- Predicates like `exists(...)`, `urlMatches(...)`, `isEnabled(...)`, `valueEquals(...)`
- Fluent assertion DSL via `expect(...)`
- Retrying verification via `runtime.check(...).eventually(...)`
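Retrying verification is essentially a bounded polling loop. The sketch below shows one way such a loop could work; it is not the SDK's implementation of `.eventually(...)`, and `pollMs`, `EventuallyResult`, and the standalone `eventually` function are assumptions for illustration.

```ts
// Hypothetical helper; not the SDK's implementation of `.eventually(...)`.
interface EventuallyOptions {
  timeoutMs: number; // total time budget for the check
  pollMs?: number; // assumed delay between attempts
}

interface EventuallyResult {
  passed: boolean;
  attempts: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function eventually(
  predicate: () => boolean | Promise<boolean>,
  { timeoutMs, pollMs = 250 }: EventuallyOptions
): Promise<EventuallyResult> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    if (await predicate()) return { passed: true, attempts };
    const remaining = deadline - Date.now();
    // The predicate always runs at least once; give up once the budget is spent.
    if (remaining <= 0) return { passed: false, attempts };
    await sleep(Math.min(pollMs, remaining));
  }
}
```

Bounding the loop by a deadline rather than an attempt count keeps the worst-case step duration predictable, which matters when many checks run in one agent step.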
### Scroll verification (prevent no-op scroll drift)
A common agent failure mode is “scrolling” without the UI actually advancing (overlays, nested scrollers, focus issues). Use `AgentRuntime.scrollBy(...)` to deterministically verify that the scroll _had an effect_ by comparing `scrollTop` before and after.
```ts
runtime.beginStep('Scroll the page and verify it moved');

const ok = await runtime.scrollBy(600, {
  verify: true,
  minDeltaPx: 50,
  label: 'scroll_effective',
  required: true,
  timeoutMs: 5_000,
});

if (!ok) {
  throw new Error('Scroll had no effect (likely blocked by overlay or nested scroller).');
}
```
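The underlying check is simple to state on its own. As a rough sketch, assuming the verification compares `scrollTop` before and after against a minimum delta (the direction check is an extra assumption, and `scrollHadEffect` is a hypothetical helper, not an SDK function):

```ts
// Hypothetical helper illustrating a before/after scrollTop comparison.
interface ScrollCheck {
  moved: boolean;
  deltaPx: number; // signed: positive means the page moved down
}

function scrollHadEffect(
  requestedDy: number,
  beforeTop: number,
  afterTop: number,
  minDeltaPx = 50
): ScrollCheck {
  const deltaPx = afterTop - beforeTop;
  // Assumption: movement only counts when it goes in the requested direction.
  const sameDirection = Math.sign(deltaPx) === Math.sign(requestedDy);
  return { moved: sameDirection && Math.abs(deltaPx) >= minDeltaPx, deltaPx };
}
```

A scroll swallowed by an overlay leaves `scrollTop` unchanged, so `deltaPx` is `0` and the check fails regardless of how far the agent asked to scroll.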
### Explained failure
- JSONL trace events (`Tracer` + `JsonlTraceSink`)
- Optional failure artifact bundles (snapshots, diagnostics, step timelines, frames/clip)
- Deterministic failure semantics: when required assertions can’t be proven, the run fails with artifacts you can replay
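JSONL is one JSON object per line, which makes traces easy to append to and to replay. The sketch below uses an in-memory sink and a made-up `TraceEvent` shape to show the format; the SDK's actual trace schema and `JsonlTraceSink` behaviour are not shown in this README, so treat every name here as an assumption.

```ts
// Hypothetical event shape; the SDK's actual trace schema is not documented here.
interface TraceEvent {
  runId: string;
  seq: number;
  type: string;
  data: Record<string, unknown>;
}

class InMemoryJsonlSink {
  private lines: string[] = [];

  emit(event: TraceEvent): void {
    // JSON.stringify escapes embedded newlines, so each event stays on one line.
    this.lines.push(JSON.stringify(event));
  }

  toString(): string {
    return this.lines.map((line) => line + '\n').join('');
  }
}

function parseJsonl(text: string): TraceEvent[] {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line) as TraceEvent);
}
```

Because every line is independently parseable, a trace that was cut short by a crash can still be read up to the last complete event.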
### Framework interoperability
- Bring your own LLM and orchestration (LangGraph, custom loops)
- Register explicit LLM-callable tools with `ToolRegistry`
## ToolRegistry (LLM-callable tools)
```ts
import { ToolRegistry, registerDefaultTools } from '@predicatesystems/runtime';
const registry = new ToolRegistry();
registerDefaultTools(registry);
const toolsForLLM = registry.llmTools();
```
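For readers unfamiliar with the pattern, a tool registry generally maps a tool name to a description the model sees and a handler the runtime executes. The sketch below is a generic, hypothetical illustration of that split; `MiniToolRegistry`, `ToolSpec`, `describe`, and `call` are invented names and do not reflect how `ToolRegistry` is implemented.

```ts
// Hypothetical shapes for illustration; this is not the SDK's ToolRegistry.
interface ToolSpec {
  name: string;
  description: string;
  parameters: Record<string, 'string' | 'number' | 'boolean'>;
  handler: (args: Record<string, unknown>) => string;
}

class MiniToolRegistry {
  private tools: ToolSpec[] = [];

  register(tool: ToolSpec): void {
    if (this.tools.some((t) => t.name === tool.name)) {
      throw new Error(`duplicate tool: ${tool.name}`);
    }
    this.tools.push(tool);
  }

  // What would be sent to the model: names, descriptions, parameter types. Never handlers.
  describe(): Array<Omit<ToolSpec, 'handler'>> {
    return this.tools.map(({ name, description, parameters }) => ({
      name,
      description,
      parameters,
    }));
  }

  // Rejects unknown tools and wrongly typed arguments before any handler runs.
  call(name: string, args: Record<string, unknown>): string {
    const tool = this.tools.filter((t) => t.name === name)[0];
    if (!tool) throw new Error(`unknown tool: ${name}`);
    for (const key of Object.keys(tool.parameters)) {
      if (typeof args[key] !== tool.parameters[key]) {
        throw new Error(`bad argument "${key}" for tool ${name}`);
      }
    }
    return tool.handler(args);
  }
}
```

Validating arguments at the registry boundary means a malformed tool call from the model fails loudly instead of reaching the browser.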
## Permissions (avoid Chrome permission bubbles)
Chrome permission prompts are outside the DOM and can be invisible to snapshots. Prefer setting a policy **before navigation**.
```ts
import { PredicateBrowser } from '@predicatesystems/runtime';
import type { PermissionPolicy } from '@predicatesystems/runtime';

const policy: PermissionPolicy = {
  default: 'clear',
  autoGrant: ['geolocation'],
  geolocation: { latitude: 37.77, longitude: -122.41, accuracy: 50 },
  origin: 'https://example.com',
};

// `permissionPolicy` is the last constructor argument; pass `keepAlive` right before it.
const browser = new PredicateBrowser(
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  undefined,
  false,
  policy
);
await browser.start();
```
If your backend supports it, you can also use ToolRegistry permission tools (`grant_permissions`, `clear_permissions`, `set_geolocation`) mid-run.
## Downloads (verification predicate)
```ts
import { downloadCompleted } from '@predicatesystems/runtime';
runtime.assert(downloadCompleted('report.csv'), 'download_ok', true);
```
## Debugging (fast)
- **Manual driver CLI**:
```bash
npx predicate driver --url https://example.com
```
- **Verification + artifacts + debugging with time-travel traces (Predicate Studio demo)**:
If the video tag doesn’t render in your GitHub README view, use this link: [`sentience-studio-demo.mp4`](https://github.com/user-attachments/assets/7ffde43b-1074-4d70-bb83-2eb8d0469307)
- **Predicate SDK Documentation**: https://predicatelabs.dev/docs