https://github.com/vim89/proof-gate
https://github.com/vim89/proof-gate
Last synced: about 1 month ago
JSON representation
- Host: GitHub
- URL: https://github.com/vim89/proof-gate
- Owner: vim89
- License: mit
- Created: 2026-05-12T20:34:12.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-13T14:29:20.000Z (about 1 month ago)
- Last Synced: 2026-05-24T03:33:56.922Z (about 1 month ago)
- Language: Scala
- Size: 106 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# ProofGate
Nice try, LLM. Now prove it.
ProofGate is a Scala-native review conveyor for LLM-written data-pipeline code.
The core idea is simple:
1. proposal
2. proof
3. policy
4. pin
5. people
The first reviewer should be machine-enforced structure, not a human and not another LLM.
After those gates pass, review is AI and human, not AI or human.
## Review flow
```mermaid
flowchart LR
A[LLM draft] --> B[proposal]
B --> C[proof
macros inline typestate]
C --> D[policy
Scalafix rules]
D --> E[pin
runtime sink checks]
E --> F[people
AI and human review]
C -. fail .-> X[reject early]
D -. fail .-> X
E -. fail .-> X
```
## Current stack
This repository currently targets the stable stack:
- JDK `21`
- Scala `3.8.3`
- sbt `1.12.11`
- Scalafmt `3.11.1`
- Scalafix / sbt-scalafix `0.14.6`
- MUnit `1.3.0`
## Design notes
- [ADR 0001: review conveyor](docs/adr/0001-review-conveyor.md)
## Module layout
- `modules/model`: core ADTs, typed errors, report model
- `modules/proof`: proposal DSL, typestate builder, `SchemaConforms` proof policies
- `modules/runtime-spark`: runtime shape derivation, sink-boundary pin checks, Spark schema adapter
- `modules/cli`: CLI entry point
- `modules/examples`: tiny usage examples
- `modules/fixtures-compile-fail`: reference snippets for compile-fail demos
- `modules/fixtures-policy-fail`: deliberate violations that Scalafix must reject;
not aggregated by the root project, run on demand via `scripts/check-policy-fixtures.sh`
## Commands
```bash
sbt test
sbt reviewGates
sbt reviewPolicy
sbt reviewConveyor
scripts/check-policy-fixtures.sh
sbt "cli / runMain proofgate.cli.Main review --revision abc123 --out target/proof-gate-review.md"
sbt "cli / runMain proofgate.cli.Main review --revision abc123 --out target/proof-gate-review.json"
```
The GitHub Actions workflow runs the same conveyor and publishes the generated Markdown report to the job summary.
Example rejecting review packet:
```bash
sbt "cli / runMain proofgate.cli.Main review --revision abc123 --finding proof|blocker|proof.schema-exact|Missing_customer_id|pipelines/orders.scala|Add_the_field"
```
## Policy fixtures
`reviewPolicy` uses Scalafix `DisableSyntax` rules to reject selected forbidden patterns in core code.
The expected-failure fixtures prove that these patterns are blocked:
- direct `sys.env`
- direct `System.getenv`
- direct `ConfigFactory`
- raw `try` / `catch`
- raw `throw`
- raw `require`
## Schema policies
`SchemaConforms[Out, Contract, Policy]` is the compile-time evidence the proof layer requires.
- `SchemaPolicy.Exact`: unordered by name, case-insensitive, same field set, types, and nesting.
Field-level nullability is ignored; nested optionality inside arrays and maps is still checked.
- `SchemaPolicy.ExactUnorderedCI`: explicit alias for unordered case-insensitive exact matching.
- `SchemaPolicy.ExactOrdered`: ordered by field name, case-sensitive.
- `SchemaPolicy.ExactOrderedCI`: ordered by field name, case-insensitive.
- `SchemaPolicy.ExactByPosition`: ordered by position only; field names are ignored.
- `SchemaPolicy.Backward`: `Out` may add fields beyond `Contract`. Required `Contract`
fields must still exist in `Out` with the declared type. Missing optional `Contract`
fields and missing fields with defaults are accepted. Old consumers reading `Contract`
keep working when the producer adds fields.
- `SchemaPolicy.Forward`: `Out` may drop fields that `Contract` declares. Extra fields
in `Out` and type drift still fail. Old producers stay compatible with new consumers
that have widened the contract.
- `SchemaPolicy.Full`: escape hatch for demo and migration paths; compile-time and runtime
structural checks accept the shape.
These names intentionally mirror `compile-time-data-contracts`. ProofGate keeps the same policy
vocabulary so the talk can show a clean migration from the earlier proof asset into the review
conveyor.
## Spark bridge
`SparkSchemaAdapter` converts real Spark `StructType` values into a `RuntimeShape`.
The adapter preserves nested struct nullability, array `containsNull`, map `valueContainsNull`,
and `ctdc.hasDefault` metadata:
```scala
val shape = SparkSchemaAdapter.fromStructType(dataset.schema)
```
For thin integration boundaries, callers can still pass a small DTO:
```scala
val info = structType.fields.map(f =>
SparkFieldInfo(f.name, f.dataType.simpleString, f.nullable)
).toVector
val shape = SparkSchemaAdapter.fromSparkFields(info)
```
The DTO path understands Spark primitive types, `array`, `map`, and nested `struct<...>`
expressions, but Spark `simpleString` does not carry nested struct nullability. Use
`fromStructType` when runtime parity matters.
See [docs/spark-bridge.md](docs/spark-bridge.md) for an end-to-end recipe, including the
`Dataset[A] => RuntimeShape` helper and a sink-time validate call.
## Status
This is an early POC scaffold.
The current code proves the build, module wiring, typed-error model, typestate assembly,
the full compile-time policy vocabulary from `compile-time-data-contracts`, an in-memory
runtime shape pin with matching policy entry points, a Spark schema bridge, and a CI-friendly
Markdown and JSON review report path.