https://github.com/pmatos/forseti
Forseti — a formal verifier (ESBMC) inside the agent coding loop: write → verify → counterexample → fix. Q.E.D. is the new LGTM.
agentic-coding bounded-model-checking claude-code esbmc formal-verification llm program-verification
- Host: GitHub
- URL: https://github.com/pmatos/forseti
- Owner: pmatos
- Created: 2026-06-16T09:40:02.000Z (14 days ago)
- Default Branch: main
- Last Pushed: 2026-06-25T13:03:55.000Z (5 days ago)
- Last Synced: 2026-06-25T15:16:36.760Z (4 days ago)
- Topics: agentic-coding, bounded-model-checking, claude-code, esbmc, formal-verification, llm, program-verification
- Language: Python
- Size: 103 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 17
Metadata Files:
- Readme: README.md
- Roadmap: docs/roadmap.md
README
# Forseti
> *Forseti — the Norse god who presides over every dispute and hands down the fairest
> judgment, in his hall Glitnir. This system judges code and returns a verdict:
> **VERIFIED**, or a concrete **counterexample**.*
**Forseti puts a formal verifier inside the agent's coding loop.** As coding agents write
faster than anyone can review, trust has to move from *"a human looked at it"* (LGTM) to
*"the code carries a proof"* (Q.E.D.). Forseti closes that loop:
```
write → verify (ESBMC) → counterexample → fix → ↺
```
The agent never hands you the broken draft — you see the version that already passed.
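The shape of that loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Forseti's orchestrator: `run_loop`, `Verdict`, `toy_verify`, and `toy_fix` are hypothetical names, the verifier is a stub that probes a handful of boundary inputs instead of invoking ESBMC, and the "agent" is a canned repair. The toy task borrows the `abs`/`INT_MIN` case named in the Layout table, without reproducing that walkthrough.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # 32-bit signed range

Candidate = Callable[[int], int]


@dataclass
class Verdict:
    verified: bool
    counterexample: Optional[int] = None  # a concrete failing input, if any


def toy_verify(candidate: Candidate) -> Verdict:
    """Stub verifier (assumption): probes boundary inputs only.

    A real bounded model checker reasons about every input up to a bound;
    this stand-in exists so the sketch runs without any external tool.
    """
    for x in (INT_MIN, -1, 0, 1, INT_MAX):
        result = candidate(x)
        # Property: the result is non-negative and fits in a 32-bit int.
        if not (0 <= result <= INT_MAX):
            return Verdict(False, counterexample=x)
    return Verdict(True)


def naive_abs(x: int) -> int:
    # First draft: negating INT_MIN leaves the 32-bit range.
    return -x if x < 0 else x


def saturating_abs(x: int) -> int:
    # Revised draft: clamp the one input whose negation does not fit.
    if x == INT_MIN:
        return INT_MAX
    return -x if x < 0 else x


def toy_fix(draft: Candidate, counterexample: Optional[int]) -> Candidate:
    # Stand-in for the agent's repair step: returns a canned revision.
    return saturating_abs


def run_loop(
    write: Callable[[], Candidate],
    verify: Callable[[Candidate], Verdict],
    fix: Callable[[Candidate, Optional[int]], Candidate],
    max_turns: int = 5,
) -> Tuple[Candidate, int]:
    """write -> verify -> counterexample -> fix, until a draft verifies."""
    draft = write()
    for turn in range(1, max_turns + 1):
        verdict = verify(draft)
        if verdict.verified:
            return draft, turn  # only a verified draft leaves the loop
        draft = fix(draft, verdict.counterexample)
    raise RuntimeError(f"no verified draft within {max_turns} turns")


final, turns = run_loop(lambda: naive_abs, toy_verify, toy_fix)
print(final.__name__, "verified on turn", turns)
```

The caller only ever receives `final`; the draft that failed on `INT_MIN` stays inside the loop. The `max_turns` cap is a design choice of this sketch so that a repair step that never converges fails loudly instead of spinning.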
## The bar (Jan 2027)
The target is a **demoable** ESBMC-in-the-loop system that self-corrects on a **non-toy**
example, plus a public writeup. Engine fixes and a Lean backend are *supporting* work, not
the success metric.
## Status
H2 2026 Igalia investment (Jul 2026 → Jan 2027). Currently: **P0 · Foundation**.
## Layout
| Path | What |
|---|---|
| [`docs/roadmap.md`](docs/roadmap.md) | The 6-phase plan, exit criteria, and risk register |
| [`docs/adr/`](docs/adr/) | Architecture Decision Records — the *why* behind each choice |
| [`docs/design/`](docs/design/) | Design RFCs — strawmen (with diagrams) under discussion |
| [`docs/walkthroughs/`](docs/walkthroughs/) | Worked end-to-end loop turns (the P0 manual `abs`/`INT_MIN` turn) |
| `src/` | The orchestrator loop, property store, and grading harness (to be built from P1 onward) |
## What it builds on
- **[ESBMC](https://github.com/esbmc/esbmc)** — the multi-language bounded model checker
(C, C++, Python, Solidity, CUDA, Java/Kotlin) that does the proving.
- The **[ESBMC Claude Code plugin](https://github.com/esbmc)** (`/esbmc:verify`, `/esbmc:audit`)
— already shipped; the loop dogfoods and extends it.
- **GEPA** ([arXiv:2507.19457](https://arxiv.org/abs/2507.19457)) — reflective prompt
evolution, used to teach an LLM to propose *good* properties.
- **LLM-generated invariants for BMC** — Pirzada et al., ASE '24.
## Scope guardrail
[Vow](https://vow-lang.com) — a *proof-native* language — is the bigger, separate bet.
Forseti retrofits proofs onto code agents already write; it does **not** include Vow.