https://github.com/kadubon/bottleneck-audit-toolkit

Offline, fail-closed verifier for JSONL telemetry event logs. Emits deterministic audit certificates + human summaries with explicit claims/non-claims for bottleneck and integrity review.

# Bottleneck Audit Toolkit (BATool)

## DISCLAIMER / NO SUPPORT / NO WARRANTY / NO LIABILITY

BATool is a research and audit tool. It is **not** a security boundary, safety guarantee, or compliance system.
Use is entirely at your own risk. The authors and contributors provide **no warranty**, accept **no liability**, and have **no support obligations**.
Issues, PRs, and inquiries may be ignored or left unanswered.

BATool is an **offline, fail-closed verifier** for **JSONL event logs**. It emits:

- a deterministic, machine-readable **certificate** (JSON), and
- a deterministic, human-readable **summary** (text),

while separating **claims** (supported by the observed log) from **non-claims** (what cannot be supported, and why).

## License

- Code and tooling: **Apache-2.0** (see `LICENSE` and `NOTICE`)
- TeX sources under `paper/`: **CC-BY-4.0** (see `paper/LICENSE` and file headers)

## Quickstart (uv)

```bash
uv venv
. .venv/bin/activate  # put the `batool` entry point on PATH
uv pip install -e ./verifier
batool verify --input ./examples/ok_run/events.jsonl --out /tmp/certificate.json --human /tmp/summary.txt
```

## Quickstart (pip)

```bash
python -m venv .venv
. .venv/bin/activate
pip install -e ./verifier
batool verify --input ./examples/ok_run/events.jsonl --out /tmp/certificate.json --human /tmp/summary.txt
```

## Verdicts

- `VERIFIED`: no `REJECT` or `UNDECIDABLE` triggers, and at least one claim is made.
- `UNDECIDABLE`: missing data or ambiguity (for example, missing END events or clock skew beyond tolerance).
- `REJECT`: schema violations, inconsistent `run_id`, duplicate `event_id`, tamper mismatch, or strict-mode failures.

Exit codes: `0` VERIFIED, `1` UNDECIDABLE, `2` REJECT.
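
For pipelines that gate on the verdict, the exit codes above can be consumed from a small wrapper. The following is a minimal sketch, not part of BATool: the helper names `verdict_from_exit_code` and `run_batool` are hypothetical, and treating any undocumented exit code as `REJECT` is an assumption chosen to stay fail-closed.

```python
import subprocess

# Exit-code mapping as documented above.
VERDICT_BY_EXIT_CODE = {0: "VERIFIED", 1: "UNDECIDABLE", 2: "REJECT"}


def verdict_from_exit_code(code: int) -> str:
    """Map a batool exit code to a verdict string."""
    # Assumption: undocumented codes (crash, usage error) are treated as REJECT.
    return VERDICT_BY_EXIT_CODE.get(code, "REJECT")


def run_batool(events_path: str, cert_path: str, summary_path: str) -> str:
    """Run `batool verify` with the Quickstart flags and return the verdict."""
    proc = subprocess.run(
        ["batool", "verify",
         "--input", events_path,
         "--out", cert_path,
         "--human", summary_path],
        check=False,  # non-zero exit codes are verdicts here, not errors
    )
    return verdict_from_exit_code(proc.returncode)
```

A caller would typically proceed only when `run_batool(...)` returns `"VERIFIED"` and route the other two verdicts to manual review.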

## Claims vs non-claims

- Claims are explicitly supported by the observed data under this tool's validation rules.
- Non-claims explain what could not be supported, and why.

This tool does **not** infer causality. It performs telemetry-visible checks and heuristic aggregation only.
It prefers `UNDECIDABLE` / `REJECT` over guessing.

## Strict mode

Use `--strict` to treat any clock decrease or missing END as `REJECT`. This is more conservative and intended for high-integrity pipelines.
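
To illustrate how strict mode changes the outcome, here is a toy sketch of fail-closed verdict logic. It is not BATool's implementation. Only `run_id`, `event_id`, and the notion of END events come from this README; the fields `ts`, `phase`, and `span`, the `tolerance` parameter, and the function name are hypothetical. The sketch also ignores the requirement that `VERIFIED` needs at least one claim.

```python
from typing import Iterable


def sketch_verdict(events: Iterable[dict], strict: bool = False,
                   tolerance: float = 0.0) -> str:
    """Toy fail-closed verdict over parsed events (illustrative only)."""
    run_ids, seen_ids, open_spans = set(), set(), set()
    last_ts = None
    undecidable = False
    for ev in events:
        # REJECT triggers: duplicate event_id, inconsistent run_id.
        if ev["event_id"] in seen_ids:
            return "REJECT"
        seen_ids.add(ev["event_id"])
        run_ids.add(ev["run_id"])
        if len(run_ids) > 1:
            return "REJECT"
        # Hypothetical "ts" field: detect clock decreases.
        if last_ts is not None and ev["ts"] < last_ts:
            if strict:
                return "REJECT"  # strict: any decrease rejects
            if last_ts - ev["ts"] > tolerance:
                undecidable = True  # non-strict: only beyond tolerance
        last_ts = ev["ts"]
        # Hypothetical "phase"/"span" fields: track START without END.
        if ev["phase"] == "START":
            open_spans.add(ev["span"])
        elif ev["phase"] == "END":
            open_spans.discard(ev["span"])
    if not seen_ids:
        return "UNDECIDABLE"  # nothing observed, so nothing can be supported
    if open_spans:  # at least one START has no matching END
        if strict:
            return "REJECT"
        undecidable = True
    return "UNDECIDABLE" if undecidable else "VERIFIED"
```

With the same input, a log that is missing an END event yields `UNDECIDABLE` by default and `REJECT` when `strict=True`.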

## Sample output (summary excerpt)

```
BATool Verification Summary
Verdict: VERIFIED
Run ID: run-ok-001
Input Digest (sha256, KEYSORT_UTF8): 7f2b...

Claims:
- useful_compute_floor: 5 steps
- dominant_time_component: ALLREDUCE_DOMINANT (evidence=high)
- integrity: OK (contract_ok=true)

Non-claims:
- none
```
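
The `Input Digest (sha256, KEYSORT_UTF8)` line suggests a digest taken over a key-sorted, UTF-8 encoded form of the input. The sketch below shows one way such a digest could be computed. The exact canonicalization BATool uses is not stated here, so the separators, the newline handling, and the function name `keysorted_digest` are all assumptions, and the result should not be expected to match BATool's output.

```python
import hashlib
import json


def keysorted_digest(jsonl_text: str) -> str:
    """SHA-256 over JSONL records re-serialized with sorted keys (assumed scheme)."""
    h = hashlib.sha256()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        obj = json.loads(line)
        # Assumption: compact separators, sorted keys, no ASCII escaping.
        canonical = json.dumps(obj, sort_keys=True,
                               separators=(",", ":"), ensure_ascii=False)
        h.update(canonical.encode("utf-8"))
        h.update(b"\n")  # assumption: one newline between records
    return h.hexdigest()
```

Under this scheme, two logs that differ only in key order or whitespace within a record hash to the same value, which is what makes the digest reproducible across writers.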

## Reference theory (DOIs)

BATool implements a **reduced log-checking protocol** and a **minimal certificate format**. It does **not** implement the full
statistical guarantees or cryptographic integrity models described in the following references.

- TLUC:
- SDC:
- MCCBE:

Not implemented (examples):

- confidence sequences or coverage-in-time proofs
- cryptographic authenticity or secure log signatures
- power/cooling estimators or hardware trust anchors

## Documentation

See `docs/` for:

- certificate format (`docs/CERTIFICATE_FORMAT.md`)
- threat model (`docs/THREAT_MODEL.md`)
- versioning (`docs/VERSIONING.md`)
- reproducibility steps (`docs/REPRODUCIBILITY.md`)