https://github.com/pmatos/forseti

Forseti — a formal verifier (ESBMC) inside the agent coding loop: write → verify → counterexample → fix. Q.E.D. is the new LGTM.

agentic-coding bounded-model-checking claude-code esbmc formal-verification llm program-verification



Host: GitHub
URL: https://github.com/pmatos/forseti
Owner: pmatos
Created: 2026-06-16T09:40:02.000Z (14 days ago)
Default Branch: main
Last Pushed: 2026-06-25T13:03:55.000Z (5 days ago)
Last Synced: 2026-06-25T15:16:36.760Z (4 days ago)
Topics: agentic-coding, bounded-model-checking, claude-code, esbmc, formal-verification, llm, program-verification
Language: Python
Size: 103 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 17
Metadata Files:
- Readme: README.md
- Roadmap: docs/roadmap.md


README

# Forseti

> *Forseti: the Norse god who presides over every dispute and hands down the fairest
> judgment, in his hall Glitnir. This system judges code and returns a verdict:
> **VERIFIED**, or a concrete **counterexample**.*

**Forseti puts a formal verifier inside the agent's coding loop.** As coding agents write
faster than anyone can review, trust has to move from *"a human looked at it"* (LGTM) to
*"the code carries a proof"* (Q.E.D.). Forseti closes that loop:

```
write  →  verify (ESBMC)  →  counterexample  →  fix  →  ↺
```

The agent never hands you the broken draft — you see the version that already passed.
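The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Forseti's orchestrator (which the layout lists as built in P1+): `Verdict`, `refine_until_verified`, and `esbmc_verify` are hypothetical names, the `fix` callback stands in for the coding agent, and the ESBMC flags and the `VERIFICATION SUCCESSFUL` marker should be checked against the installed ESBMC version.

```python
import os
import subprocess
import tempfile
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    """Hypothetical result type: verified, or a counterexample trace."""
    verified: bool
    counterexample: Optional[str] = None


def refine_until_verified(
    draft: str,
    verify: Callable[[str], Verdict],
    fix: Callable[[str, str], str],
    max_rounds: int = 5,
) -> Optional[str]:
    """Run verify -> fix rounds; return the first draft that verifies.

    Returns None if no draft verifies within max_rounds verification runs,
    so a failing draft is never handed back as if it had passed.
    """
    code = draft
    for _ in range(max_rounds):
        verdict = verify(code)
        if verdict.verified:
            return code
        # Feed the counterexample back to the agent and try again.
        code = fix(code, verdict.counterexample or "")
    return None


def esbmc_verify(code: str, unwind: int = 10) -> Verdict:
    """Assumed ESBMC invocation on a C source string (flags are an assumption)."""
    with tempfile.NamedTemporaryFile("w", suffix=".c", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            ["esbmc", path, "--unwind", str(unwind), "--overflow-check"],
            capture_output=True,
            text=True,
        )
    finally:
        os.unlink(path)
    output = proc.stdout + proc.stderr
    if "VERIFICATION SUCCESSFUL" in output:
        return Verdict(True)
    # Anything else is treated as a failure; the raw output serves as the trace.
    return Verdict(False, output)
```

Passing the verifier in as a callable keeps the loop testable without ESBMC installed; in a real run one would call `refine_until_verified(draft, esbmc_verify, agent_fix)`, where `agent_fix` is whatever function asks the agent for a repaired draft.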

## The bar (Jan 2027)

A **demoable** ESBMC-in-the-loop system that self-corrects on a **non-toy** example,
plus a public writeup. Engine fixes and a Lean backend are *supporting* work, not the metric.

## Status

H2 2026 Igalia investment (Jul 2026 → Jan 2027). Currently: **P0 · Foundation**.

## Layout

| Path | What |
|---|---|
| [`docs/roadmap.md`](docs/roadmap.md) | The 6-phase plan, exit criteria, and risk register |
| [`docs/adr/`](docs/adr/) | Architecture Decision Records: the *why* behind each choice |
| [`docs/design/`](docs/design/) | Design RFCs: strawmen (with diagrams) under discussion |
| [`docs/walkthroughs/`](docs/walkthroughs/) | Worked end-to-end loop turns (the P0 manual `abs`/`INT_MIN` turn) |
| `src/` | The orchestrator loop, property store, and grading harness (built P1+) |
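The `abs`/`INT_MIN` turn named in the walkthroughs row refers to a well-known pitfall: negating the most negative 32-bit integer has no representable result. The sketch below is only an illustration of that pitfall, not a copy of the walkthrough. It emulates two's-complement wraparound in Python (in C the overflow is undefined behavior), and `wrap32`, `naive_abs`, and `find_counterexample` are hypothetical helper names. A bounded model checker reasons about inputs symbolically rather than trying a hand-picked list as done here.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def wrap32(n: int) -> int:
    """Reduce n to a signed 32-bit value, emulating two's-complement wraparound."""
    return (n + 2**31) % 2**32 - 2**31


def naive_abs(x: int) -> int:
    """An abs that negates without guarding against INT_MIN."""
    return wrap32(-x) if x < 0 else x


def find_counterexample(candidates):
    """Return the first input violating the property naive_abs(x) >= 0, if any."""
    for x in candidates:
        if naive_abs(x) < 0:
            return x
    return None


print(find_counterexample([0, 1, -1, INT_MAX, INT_MIN]))  # prints -2147483648
```

The property `abs(x) >= 0` holds for every listed input except `INT_MIN`, which is the kind of concrete counterexample the loop would hand back to the agent.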

## What it builds on

- **[ESBMC](https://github.com/esbmc/esbmc)**: the multi-language bounded model checker
  (C, C++, Python, Solidity, CUDA, Java/Kotlin) that does the proving.
- The **[ESBMC Claude Code plugin](https://github.com/esbmc)** (`/esbmc:verify`, `/esbmc:audit`),
  already shipped; the loop dogfoods and extends it.
- **GEPA** ([arXiv:2507.19457](https://arxiv.org/abs/2507.19457)): reflective prompt
  evolution, used to teach an LLM to propose *good* properties.
- **LLM-generated invariants for BMC**: Pirzada et al., ASE '24.

## Scope guardrail

[Vow](https://vow-lang.com), a *proof-native* language, is the bigger, separate bet.
Forseti retrofits proofs onto the code that agents already write; it does **not** include Vow.
