https://github.com/mizcausevic-dev/ai-evidence-format-spec

AI Evidence Format v0.1 draft. JSON document format for structured citations that travel with LLM-generated claims: source identity, span selector, retrieval confidence, freshness, content hash, declared synthesis role. Part of the Kinetic Gain Protocol Suite.



Host: GitHub
URL: https://github.com/mizcausevic-dev/ai-evidence-format-spec
Owner: mizcausevic-dev
License: agpl-3.0
Created: 2026-05-12T04:49:50.000Z (2 months ago)
Default Branch: main
Last Pushed: 2026-05-12T04:53:00.000Z (2 months ago)
Last Synced: 2026-05-12T06:34:24.175Z (2 months ago)
Topics: ai-citations, ai-governance, answer-engine, evidence, json-schema, kinetic-gain-protocol-suite, llm, platform-engineering, protocol, provenance, rag, specification
Homepage: https://github.com/mizcausevic-dev/ai-evidence-format-spec
Size: 20.5 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# AI Evidence Format

[![Validate examples](https://github.com/mizcausevic-dev/ai-evidence-format-spec/actions/workflows/validate.yml/badge.svg)](https://github.com/mizcausevic-dev/ai-evidence-format-spec/actions/workflows/validate.yml)

[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)

A draft specification for **machine-readable evidence objects** that travel with every claim an answer engine produces.

When an LLM says *"Cambridge is in Massachusetts, source: en.wikipedia.org/wiki/Cambridge,_Massachusetts"*, the answer is two things: the **claim** and the **evidence**. Today the evidence is unstructured — a URL, maybe a quoted span, maybe nothing. The AI Evidence Format makes it structured: source identity, span selector, retrieval confidence, freshness, content hash, and a declared synthesis role.

## The three pillars

| Pillar | What it does |
|---|---|
| **Attach** | Every cited claim carries one or more evidence objects in a defined format |
| **Verify** | Each evidence object carries a content hash (and optional signature) so consumers can detect tampering or staleness |
| **Synthesize** | Each object declares its role in the answer (`supporting`, `contradicting`, `partial`, `background`), making "the model cited two sources that disagree" a first-class fact |
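
As a rough illustration of the Synthesize pillar, the sketch below groups the evidence attached to one claim by declared role and flags the case where sources disagree. The four role values come from the table above. The dictionary keys `role` and `source`, the function names, and the example URLs are assumptions made for illustration, not names taken from the specification.

```python
from collections import defaultdict

# Role values listed in the pillar table above.
ROLES = {"supporting", "contradicting", "partial", "background"}


def group_by_role(evidence):
    """Group evidence objects by their declared role (hypothetical `role` key)."""
    groups = defaultdict(list)
    for item in evidence:
        role = item["role"]
        if role not in ROLES:
            raise ValueError(f"unknown synthesis role: {role!r}")
        groups[role].append(item)
    return dict(groups)


def has_conflict(evidence):
    """Return True when a claim carries both supporting and contradicting evidence."""
    groups = group_by_role(evidence)
    return bool(groups.get("supporting")) and bool(groups.get("contradicting"))


# Hypothetical evidence attached to a single claim.
claim_evidence = [
    {"source": "https://example.org/a", "role": "supporting"},
    {"source": "https://example.org/b", "role": "contradicting"},
    {"source": "https://example.org/c", "role": "background"},
]

print(has_conflict(claim_evidence))
```

Because the role is declared on each object, a consumer can detect disagreement without re-reading the sources.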

## Why not just a URL?

A URL in a footnote tells you where to look. It does not tell you:

- **Which span** of the source was used (the page is 12,000 words; which sentence?)

- **How confident** the retrieval system was that this was relevant

- **How fresh** the content was when retrieved

- **Whether the content hash matches** what the model actually consumed (the page may have changed)

- **What role** the evidence played — did the model use it to support, contradict, or only as background?

The AI Evidence Format makes each of those answerable in a single JSON object.
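
To make the list above concrete, here is a minimal sketch of what one such object could look like when built in Python and serialized to JSON. Only `content_hash` and the role value `supporting` appear in this README. Every other field name, the character-offset span selector, and all values are placeholders chosen for illustration; the authoritative field names are defined in `evidence.schema.json`.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Evidence:
    # Field names are illustrative assumptions, except `content_hash`.
    source_url: str              # source identity
    span_start: int              # span selector, assumed to be character offsets
    span_end: int
    retrieval_confidence: float  # how confident retrieval was
    retrieved_at: str            # freshness: when the content was retrieved
    content_hash: str            # hash of the cited span as consumed
    role: str                    # declared synthesis role


# Placeholder values, not taken from the repository's examples.
evidence = Evidence(
    source_url="https://en.wikipedia.org/wiki/Cambridge,_Massachusetts",
    span_start=0,
    span_end=58,
    retrieval_confidence=0.92,
    retrieved_at="2026-01-01T00:00:00+00:00",
    content_hash="sha256:<hex digest of the cited span>",
    role="supporting",
)

print(json.dumps(asdict(evidence), indent=2))
```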

## Why not Schema.org `Citation` or W3C Annotations?

Those vocabularies describe citations as documents. The AI Evidence Format describes citations as *retrieval artifacts in a generative pipeline*. The differences:

- We need retrieval method and confidence (vector / keyword / graph / hybrid)

- We need freshness at the moment of retrieval (not document publication date)

- We need synthesis role (the model's *intended use* of the evidence)

- We need a content hash that lets a consumer verify the model read what it claims to have read

The format **reuses** Schema.org wherever it fits, but it is not a subset of any existing vocabulary.
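
The retrieval-specific requirements listed above can be sketched as a small consumer-side check. The four retrieval methods and the four role values come from this README. The key names (`retrieval_method`, `retrieval_confidence`, `retrieved_at`, `role`), the `[0, 1]` confidence range, and the ISO 8601 timestamp format are assumptions. A real implementation would validate against `evidence.schema.json` instead.

```python
from datetime import datetime

METHODS = {"vector", "keyword", "graph", "hybrid"}
ROLES = {"supporting", "contradicting", "partial", "background"}


def check_retrieval_fields(obj):
    """Return a list of problems found in the (hypothetically named) retrieval fields."""
    problems = []

    if obj.get("retrieval_method") not in METHODS:
        problems.append("retrieval_method must be one of: " + ", ".join(sorted(METHODS)))

    confidence = obj.get("retrieval_confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        problems.append("retrieval_confidence must be a number in [0, 1] (assumed range)")

    try:
        # Freshness is recorded at retrieval time, not at publication time.
        datetime.fromisoformat(obj.get("retrieved_at", ""))
    except (TypeError, ValueError):
        problems.append("retrieved_at must be an ISO 8601 timestamp (assumed format)")

    if obj.get("role") not in ROLES:
        problems.append("role must be one of: " + ", ".join(sorted(ROLES)))

    return problems


# Placeholder object for illustration.
example = {
    "retrieval_method": "hybrid",
    "retrieval_confidence": 0.92,
    "retrieved_at": "2026-01-01T00:00:00+00:00",
    "role": "supporting",
}

print(check_retrieval_fields(example))
```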

## Quickstart

1. For each claim your answer engine produces, build one or more evidence objects conforming to [`evidence.schema.json`](evidence.schema.json).

2. Compute a `content_hash` over the canonicalized bytes of the cited span. (See §5 of [`SPEC.md`](SPEC.md) for the canonicalization rules.)

3. Either embed the evidence inline in your answer payload, or publish it at a URI and reference it.

4. Pair with an [AEO Protocol](https://github.com/mizcausevic-dev/aeo-protocol-spec) declaration on the source origin to give consumers an authoritative anchor.
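
Step 2 and the Verify pillar can be sketched as follows. The canonicalization here (Unicode NFC normalization, whitespace collapsing, UTF-8 encoding) is a stand-in chosen for illustration, not the rules from §5 of `SPEC.md`. The choice of SHA-256 and the `sha256:` prefix are likewise assumptions, and the function names are hypothetical.

```python
import hashlib
import unicodedata


def canonicalize(span_text):
    """Placeholder canonicalization: NFC-normalize, collapse whitespace, encode as UTF-8.

    The real rules live in SPEC.md section 5; this is only a stand-in.
    """
    normalized = unicodedata.normalize("NFC", span_text)
    collapsed = " ".join(normalized.split())
    return collapsed.encode("utf-8")


def content_hash(span_text):
    """Hash the canonicalized span (SHA-256 and the prefix are assumptions)."""
    return "sha256:" + hashlib.sha256(canonicalize(span_text)).hexdigest()


def span_unchanged(recorded_hash, current_span_text):
    """Return True when the span, as it reads now, still matches the recorded hash."""
    return recorded_hash == content_hash(current_span_text)


# Hash recorded at retrieval time, then checked against later page content.
recorded = content_hash("Cambridge is a city in Massachusetts.")

print(span_unchanged(recorded, "Cambridge  is a city\nin Massachusetts."))  # whitespace-only change
print(span_unchanged(recorded, "Cambridge is a city in England."))          # content changed
```

Hashing the cited span rather than the whole page means unrelated edits elsewhere in the source do not invalidate the evidence.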

## Files in this repo

- [`SPEC.md`](SPEC.md) — full v0.1 specification

- [`evidence.schema.json`](evidence.schema.json) — JSON Schema (draft 2020-12)

- [`examples/`](examples/) — reference documents for a supporting citation, a contradicting citation, and a background-only citation

## Status

**v0.1 draft.** Issues and pull requests welcome.

## License

MIT-licensed. The specification text, JSON Schema, and example documents in this repository may be freely implemented, extended, redistributed, or incorporated into commercial or non-commercial products with attribution. Reference implementations of this spec (such as [mcp-kinetic-gain](https://github.com/mizcausevic-dev/mcp-kinetic-gain)) are licensed separately under AGPL-3.0.

## Kinetic Gain Protocol Suite

A family of open specifications for the answer-engine era. Each spec is a self-contained JSON document format with its own JSON Schema and reference examples; together they compose into an end-to-end account of entity, agent, prompt, tool, and citation.

| Spec | What it does |
|---|---|
| [AEO Protocol](https://github.com/mizcausevic-dev/aeo-protocol-spec) | Entity declaration at `/.well-known/aeo.json`: authoritative claims, citation preferences, audit hooks |
| [Prompt Provenance](https://github.com/mizcausevic-dev/prompt-provenance-spec) | Versioned, lineaged, reviewable LLM prompt records |
| [Agent Cards](https://github.com/mizcausevic-dev/agent-cards-spec) | Declarative agent capability and refusal disclosure |
| **[AI Evidence Format](https://github.com/mizcausevic-dev/ai-evidence-format-spec)** | Structured citations that travel with LLM-generated claims |
| [MCP Tool Cards](https://github.com/mizcausevic-dev/mcp-tool-card-spec) | Per-tool disclosure layered on Model Context Protocol servers |

---

**Connect:** [LinkedIn](https://www.linkedin.com/in/mirzacausevic/) · [Kinetic Gain](https://kineticgain.com) · [Medium](https://medium.com/@mizcausevic/) · [Skills](https://mizcausevic.com/skills/)
