# Geval


Decision orchestration and reconciliation for AI changes.


You bring all kinds of signals and your rules. Geval orchestrates and reconciles them into one outcome. No brain — just your rules applied, every time.


---

## Demo video

**[Watch the Geval demo on YouTube →](https://youtu.be/v6LuxIshgDU)** — walkthrough of how Geval turns signals and policy rules into **PASS**, **REQUIRE_APPROVAL**, or **BLOCK**.

---

## Try it in under a minute

**1. Download** (pick your OS):

```bash
# Linux
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-linux-x86_64 -o geval && chmod +x geval

# macOS (Apple Silicon)
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-macos-aarch64 -o geval && chmod +x geval

# Windows (PowerShell) — see note below
Invoke-WebRequest -Uri https://github.com/geval-labs/geval/releases/latest/download/geval-windows-x86_64.exe -OutFile geval.exe
```

> **Windows:** Open **PowerShell as Administrator** (right‑click → *Run as administrator*). Then run the download command and `.\geval.exe demo`.

**2. Run the demo** (no files needed):

```bash
./geval demo # Linux / macOS — use ./ so you run this binary, not another "geval" in PATH
.\geval.exe demo # Windows (same folder as geval.exe)
```

You get a report and one outcome: **PASS**, **REQUIRE_APPROVAL**, or **BLOCK** — produced by the demo contract and signals. [Use in CI →](geval/docs/github-actions.md)

**No binary for your OS?** [Build from source](geval/docs/installation.md#build-from-source).

> **If you see "unknown command 'init'" or "required option '--eval'"** — you're running a **different** program named `geval` (e.g. from npm or another install). Use the **binary from [Releases](https://github.com/geval-labs/geval/releases)** or build from source and run it with `./geval` (or put it first in your PATH).

### Start from a template (like create-react-app)

Inside your project (your codebase is not changed except for one new folder), run the **same binary** you downloaded (e.g. `./geval`):

```bash
./geval init # or: /path/to/geval init
```

This creates a **.geval** folder with:

- **contract.yaml** — Names your release gate, versions it, and lists policy files to evaluate.
- **policies/** — Two starter files with descriptive names (`safety-and-blocking.yaml`, `quality-and-approval.yaml`); edit rules to match your metrics.
- **signals.json** — Example pipeline metrics; replace with your real signal names and values.
- **README.md** — What each file is for and how to run checks.

Then run:

```bash
./geval check --contract .geval/contract.yaml --signals .geval/signals.json
```

Use a different folder: `./geval init my-rules`. Overwrite existing files: `./geval init --force`.

### Updating

To update, rerun the download command for your OS and replace the old binary with the new one. Check the installed version with `./geval --version`.

---

## Use Geval with your own signals and contract

You need a **contract** (one YAML file that references one or more **policy** files) and a **signals** file. Geval evaluates each policy against the same signals, then combines the per-policy outcomes according to the contract's combination rule (e.g. all must pass, or any single block blocks the release). Use `./geval init` for a template with a contract and two policies, or create the files yourself by following the steps below.

**All kinds of signals:** Not every signal needs a score. You can mix: entries with a numeric `value`, and entries with no value (presence-only). Use a rule with `operator: presence` to match “this metric exists.” [Details →](geval/docs/signals-and-rules.md)

### Step 1: Your signals (data file)

A list of evidence: what you measured, observed, or flagged. Each item has a **metric** (name). **Value** is optional — use it for scores; omit it for “this happened” (presence-only).

Example — save as `mydata.json`:

```json
{
  "signals": [
    { "metric": "accuracy", "value": 0.94 },
    { "metric": "engagement_drop", "value": 0.02 }
  ]
}
```

You can add labels like `component` or `system` if you need them. [Full example →](geval/examples/signals.json)
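If your pipeline already has its metrics in memory, the signals file can be generated rather than written by hand. The following is a minimal Python sketch, assuming only the `signals` / `metric` / `value` shape shown above; the helper names and the `human_review_done` metric are hypothetical and not part of Geval.

```python
import json
from pathlib import Path


def build_signals(scores, flags=()):
    """Return a signals document: scored entries carry `value`, flags omit it."""
    signals = [{"metric": name, "value": value} for name, value in scores.items()]
    # Presence-only entries: the metric name alone records that something happened.
    signals += [{"metric": name} for name in flags]
    return {"signals": signals}


def write_signals(path, scores, flags=()):
    """Write the signals document as indented JSON and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(build_signals(scores, flags), indent=2) + "\n")
    return path


# Hypothetical usage: two scored metrics plus one presence-only flag.
example = build_signals(
    {"accuracy": 0.94, "engagement_drop": 0.02},
    flags=["human_review_done"],
)
print(json.dumps(example, indent=2))
```

Calling `write_signals("mydata.json", ...)` with the same arguments would produce a file you can pass to `--signals`.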

### Step 2: Your contract and policies

A **contract** is a YAML file that lists one or more **policy** files and a **combination rule** (how to merge their outcomes). Each **policy** file contains rules with **unique** priorities: **When** [condition on signals], **then** [pass / block / require_approval].

**Prefer a form instead of writing YAML by hand?** Use **[config.geval.io](https://config.geval.io)** to generate Geval-compatible `contract.yaml` and policy files (download or copy), then validate with `geval validate-contract` and run `geval check` as below.

Example contract — save as `contract.yaml`:

```yaml
name: my-gate
version: "1.0.0"
combine: worst_case
policies:
  - path: policy.yaml
```

Example policy — save as `policy.yaml` (path relative to the contract file):

```yaml
name: quality
version: "1.0.0"
policy:
  rules:
    - priority: 1
      name: block_bad_engagement
      when:
        metric: engagement_drop
        operator: ">"
        threshold: 0
      then:
        action: block
    - priority: 2
      name: allow_good_accuracy
      when:
        metric: accuracy
        operator: ">="
        threshold: 0.9
      then:
        action: pass
```

- **Combine (`worst_case`):** any `block` wins; otherwise any `require_approval`; otherwise `pass`.
- **Rule priorities:** must be **unique** within a policy; **1** is the highest. Geval records every matching rule, and the match with the best priority (the lowest number) decides the policy's outcome.
- **Operators:** `>`, `<`, `>=`, `<=`, `==`, `presence`.
- **Actions:** `pass`, `block`, `require_approval`.
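To make the priority and operator semantics above concrete, here is a small Python model of how a single policy could be evaluated. It is a sketch of the behavior described in this README, not Geval's actual implementation; the function names are hypothetical, and it assumes that a policy with no matching rule resolves to `pass`.

```python
import operator

# Comparison operators listed in the README; `presence` is handled separately.
COMPARATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}


def rule_matches(rule, signals):
    """True if any signal for the rule's metric satisfies the rule's condition."""
    cond = rule["when"]
    for signal in signals:
        if signal["metric"] != cond["metric"]:
            continue
        if cond["operator"] == "presence":
            return True  # the metric exists; its value (if any) is ignored
        value = signal.get("value")
        if value is not None and COMPARATORS[cond["operator"]](value, cond["threshold"]):
            return True
    return False


def evaluate_policy(rules, signals, default="pass"):
    """Return (action, names_of_all_matching_rules) for one policy."""
    priorities = [r["priority"] for r in rules]
    if len(priorities) != len(set(priorities)):
        raise ValueError("rule priorities must be unique within a policy")
    matches = [r for r in rules if rule_matches(r, signals)]
    if not matches:
        return default, []  # assumption: no match means pass
    winner = min(matches, key=lambda r: r["priority"])  # 1 = highest priority
    return winner["then"]["action"], [r["name"] for r in matches]


# The example policy and signals from this README, as Python data.
signals = [
    {"metric": "accuracy", "value": 0.94},
    {"metric": "engagement_drop", "value": 0.02},
]
rules = [
    {"priority": 1, "name": "block_bad_engagement",
     "when": {"metric": "engagement_drop", "operator": ">", "threshold": 0},
     "then": {"action": "block"}},
    {"priority": 2, "name": "allow_good_accuracy",
     "when": {"metric": "accuracy", "operator": ">=", "threshold": 0.9},
     "then": {"action": "pass"}},
]
print(evaluate_policy(rules, signals))
# ('block', ['block_bad_engagement', 'allow_good_accuracy'])
```

Both rules match in this model, and the priority 1 rule decides the outcome, so the sketch returns `block` for the example files.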

[Full example →](geval/examples/contract.yaml) and [policy →](geval/examples/policy.yaml)

### Step 3: Run Geval

```bash
./geval check --contract contract.yaml --signals mydata.json
```

(Windows: `.\geval.exe check --contract contract.yaml --signals mydata.json`)
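If you call Geval from a script instead of a shell, a thin wrapper can pick the right binary name per platform. This Python sketch uses only the `check` / `explain` subcommands and the `--contract` / `--signals` flags shown in this README; the function names and the `binary_dir` parameter are hypothetical. It deliberately does not interpret the exit code or output, since their format is not described here.

```python
import platform
import subprocess
from pathlib import Path


def geval_command(subcommand, contract, signals, binary_dir="."):
    """Build the argv for `check` or `explain` using the binary in `binary_dir`."""
    exe = "geval.exe" if platform.system() == "Windows" else "geval"
    # Resolve to an absolute path so a different `geval` on PATH is never picked up.
    binary = Path(binary_dir).resolve() / exe
    return [str(binary), subcommand, "--contract", str(contract), "--signals", str(signals)]


def run_geval(subcommand, contract, signals, binary_dir="."):
    """Run the command and return the CompletedProcess without interpreting it."""
    return subprocess.run(
        geval_command(subcommand, contract, signals, binary_dir),
        capture_output=True,
        text=True,
    )


print(geval_command("check", "contract.yaml", "mydata.json")[1:])
# ['check', '--contract', 'contract.yaml', '--signals', 'mydata.json']
```

Resolving the binary to an absolute path serves the same purpose as typing `./geval` in the shell: it avoids accidentally running another program with the same name.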

### Step 4: Read the outcome

- **PASS** — Every policy passed (or combined rule says go).
- **REQUIRE_APPROVAL** — At least one policy requires approval.
- **BLOCK** — At least one policy blocks.
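For the `worst_case` combination rule, the mapping from per-policy results to one of these three outcomes can be modeled in a few lines of Python. This is an illustrative sketch based on the description in this README, not Geval's code; the `SEVERITY` table and function name are hypothetical, and raising on an empty list is a choice made for the sketch.

```python
# Severity order assumed from the README's description of worst_case.
SEVERITY = {"pass": 0, "require_approval": 1, "block": 2}


def combine_worst_case(policy_actions):
    """Return the most severe per-policy action, upper-cased as the final outcome."""
    if not policy_actions:
        raise ValueError("a contract needs at least one policy outcome")
    worst = max(policy_actions, key=SEVERITY.__getitem__)
    return worst.upper()


print(combine_worst_case(["pass", "pass"]))                       # PASS
print(combine_worst_case(["pass", "require_approval"]))           # REQUIRE_APPROVAL
print(combine_worst_case(["require_approval", "block", "pass"]))  # BLOCK
```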

To see **per-policy results** and the combined decision:

```bash
./geval explain --contract contract.yaml --signals mydata.json
```

To validate the contract and all referenced policies:

```bash
./geval validate-contract contract.yaml
```

---



## The problem

You have many signals: scores, A/B results, human reviews, flags. You change a model or a prompt. Then what?

- One signal says “better.”
- Another says “worse.”
- Someone asks: “Do we ship?”

Today that call happens in chat or a meeting. Hard to repeat. Hard to audit. You don't need a system that "decides" for you — you need **orchestration and reconciliation**: one place to define rules, one place to feed all your signals (not just numbers), and one deterministic outcome every time.

---

## What Geval is

**Geval is a decision orchestration and reconciliation engine.** It does not make decisions. It has no brain. You provide:

1. **Your signals** (one file): any kind, including scores, presence-only entries, flags, and labels. Non-uniform is fine.
2. **Your rules** (a contract and its policy files): e.g. “If engagement drops, block. If accuracy is below X, need approval.”

Geval **orchestrates** the run and **reconciles** your signals against your rules in order. Same inputs + same rules = same outcome. It returns:

| Outcome | Meaning |
|--------|--------|
| **PASS** | No matching rule called for a block or an approval. Good to go. |
| **REQUIRE_APPROVAL** | A rule matched; it says a person must approve first. |
| **BLOCK** | A rule matched; it says don’t ship. Fix first. |

Each run is recorded: which rules, which signals, when. So you can always answer: “Why did we ship?” and “Who approved?” — without any black box.

---

## Commands

Run with `./geval` (or ensure this repo’s binary is the one in your PATH):

| Command | What it does |
|--------|----------------|
| `./geval demo` | Run the built-in example. Try this first. |
| `./geval init` | Create `.geval/` with a contract and policies. Edit and run. |
| `./geval check --contract <contract.yaml> --signals <signals.json>` | Evaluate contract → one outcome (PASS / REQUIRE_APPROVAL / BLOCK) |
| `./geval explain --contract <contract.yaml> --signals <signals.json>` | Per-policy results and combined decision report |
| `./geval validate-contract <contract.yaml>` | Validate contract and all referenced policies |
| `./geval approve` / `./geval reject` | Record a person’s approval or rejection |

---

## Documentation

| Guide | Description |
|-------|-------------|
| [**Demo video (YouTube)**](https://youtu.be/v6LuxIshgDU) | Walkthrough of Geval |
| [**Config generator (web)**](https://config.geval.io) | Fill in forms → download `contract.yaml` and policies |
| [**Architecture**](geval/docs/architecture.md) | Contract = multiple policies + combine rule; module layout |
| [**Signals and rules**](geval/docs/signals-and-rules.md) | Non-uniform signals (scores, presence-only, mix); how rules use them |
| [**Signal assumptions**](geval/docs/signal-assumptions.md) | What we assume; what input forms we accept (number, string, trace, object) |
| [**Versioning**](geval/docs/versioning.md) | Contract, policy, and signals versioning; nothing unversioned |
| [**Extending**](geval/docs/extending.md) | How to add a combination rule or change behavior; process and conventions |
| [**GitHub Actions**](geval/docs/github-actions.md) | Use Geval in CI |
| [**Examples**](geval/examples/README.md) | Sample data and rules files |
| [**Customer demo (feature story)**](geval/docs/customer-demo-feature.md) | Signals, policies, rules, and PASS/BLOCK/approval narrative for demos |
| [**Installation**](geval/docs/installation.md) | Install, PATH, build from source |
| [**Developer workflow**](geval/docs/developer-workflow.md) | PRs, check, approve/reject |
| [**Auditing**](geval/docs/auditing.md) | How decisions are recorded |

---

## Contributing

Contributions welcome. [CONTRIBUTING.md](CONTRIBUTING.md). Build from source: [Installation](geval/docs/installation.md#build-from-source).

---

## License

MIT © [Geval Contributors](https://github.com/geval-labs/geval/graphs/contributors)
