https://github.com/edycutjong/litmus

🧪 Output-grading quality gate agent: grades any deliverable 0-100 with a rubric, on-chain

# Litmus 🧪




[![Live Demo](https://img.shields.io/badge/🚀_Live-Demo-06b6d4?style=for-the-badge)](https://mock.croo.network)
[![Built for CROO Hackathon](https://img.shields.io/badge/DoraHacks-CROO_Hackathon_2026-8b5cf6?style=for-the-badge)](https://dorahacks.io)


![TypeScript](https://img.shields.io/badge/TypeScript-3178C6?style=flat&logo=typescript&logoColor=white)
![Node.js](https://img.shields.io/badge/Node.js-339933?style=flat&logo=node.js&logoColor=white)
[![CI](https://github.com/edycutjong/litmus/actions/workflows/ci.yml/badge.svg)](https://github.com/edycutjong/litmus/actions/workflows/ci.yml)

---

## 📸 See it in Action



> **The Quality Gate Workflow.** Deliverable Received → Litmus Applies Grading Rubric → Score (0-100) Calculated → Feedback & On-Chain Grade Delivered.

---

## 💡 The Problem & Solution
In an autonomous agent economy, output quality varies wildly. How do you trust an agent's work without manual human review?
**Litmus** is an AI Quality Gate Agent. It acts as an automated, impartial grader that evaluates deliverables against strict, predefined rubrics. If an agent submits subpar code, writing, or analysis, Litmus rejects it, ensuring only high-quality work passes the gate.

**Key Features:**
- ⚖️ **Objective Grading:** Evaluates work across multiple rubric categories, assigning a deterministic score from 0-100.
- 🚧 **Quality Gatekeeper:** Automatically rejects work that falls below the acceptable threshold.
- ⛓️ **On-Chain Attestation:** Cryptographically signs the grade to ensure the evaluation is immutable and verifiable.
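
As a rough illustration of how a rubric-weighted score and a pass/fail gate could fit together, here is a minimal TypeScript sketch. The `RubricCategory` shape, the `gradeDeliverable` helper, the category names, and the threshold of 70 are all assumptions made for illustration; none of them are taken from the Litmus source.

```typescript
// Hypothetical shapes for illustration; not the actual Litmus types.
interface RubricCategory {
  name: string;
  weight: number; // fraction of the total grade; weights are expected to sum to 1
  score: number; // 0-100 score awarded for this category
}

interface GateResult {
  score: number; // weighted total, clamped to 0-100 and rounded
  passed: boolean; // true when score >= threshold
}

// Combine per-category scores into one 0-100 grade and apply the gate.
function gradeDeliverable(categories: RubricCategory[], threshold: number): GateResult {
  const total = categories.reduce((sum, c) => sum + c.weight * c.score, 0);
  const score = Math.round(Math.min(100, Math.max(0, total)));
  return { score, passed: score >= threshold };
}

// Example: an assumed three-category rubric with an assumed threshold of 70.
const result = gradeDeliverable(
  [
    { name: "Correctness", weight: 0.6, score: 80 },
    { name: "Completeness", weight: 0.3, score: 70 },
    { name: "Format/Clarity", weight: 0.1, score: 90 },
  ],
  70,
);
console.log(result); // { score: 78, passed: true }
```

Keeping the gate as a pure function of the category scores means the same inputs always produce the same verdict, which is one way a grade could stay reproducible for later verification.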

## 🌌 The Constellation: On-Chain A2A Graph

Litmus is the constellation's **quality oracle**: other agents pay it on-chain to grade a deliverable 0–100 against a rubric. A two-model "tribunal" (with a tiebreaker) keeps scoring stable (σ < 4). Verifiable, paid, impartial grading-as-a-service is a primitive a normal API marketplace can't offer.
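
One plausible way a two-model tribunal with a tiebreaker could resolve a score is sketched below. The exact rule Litmus uses is not described in this README, so the `tribunalScore` helper, the tolerance of 10 points, and the median-of-three tiebreak are all assumptions. The graders are synchronous stubs here; real LLM calls would be asynchronous.

```typescript
// Hypothetical tribunal resolution; the real tiebreak rule is not documented here.
type Grader = (deliverable: string) => number; // returns a 0-100 score

function tribunalScore(
  deliverable: string,
  first: Grader,
  second: Grader,
  tiebreaker: Grader,
  tolerance = 10, // assumed maximum gap before the tiebreaker is consulted
): number {
  const a = first(deliverable);
  const b = second(deliverable);
  if (Math.abs(a - b) <= tolerance) {
    return Math.round((a + b) / 2); // the two models agree closely enough
  }
  // Disagreement: consult a third grader and take the median of the three scores.
  const c = tiebreaker(deliverable);
  return a + b + c - Math.max(a, b, c) - Math.min(a, b, c);
}

// Stub graders stand in for the real model calls.
const agreed = tribunalScore("draft", () => 82, () => 78, () => 0);
const disputed = tribunalScore("draft", () => 90, () => 60, () => 85);
console.log(agreed, disputed); // 80 85
```

Taking the median when the first two graders disagree discards the outlier instead of averaging it in, which is one way to keep run-to-run variance low.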

```mermaid
graph LR
User([Any Agent / User]) -->|hires to grade| L[Litmus 🧪]
M[Maestro 🎼] -->|grade + re-grade in its reflection loop| L
G[Gauntlet 🧤] -.->|certifies| L
classDef hot fill:#F59E0B,stroke:#111,color:#111,font-weight:bold;
class L hot;
```

- **Depth:** Maestro hires Litmus **twice** per pipeline, once to grade and once to re-grade the self-corrected draft, making it a high-traffic A2A node.
- **Anti-gaming:** rubric weights are validated and Format/Clarity is capped at 15% so agents can't farm a passing grade on style alone.
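
The anti-gaming rule above can be sketched as a small validator. This is an illustration under stated assumptions, not the Litmus implementation: the `validateRubric` helper, the `"Format/Clarity"` key, the requirement that weights sum to 1, and the choice to reject (not clamp) an over-cap weight are all hypothetical. Only the 15% cap comes from the text.

```typescript
// Hypothetical validator; the category key and error strings are illustrative.
const FORMAT_CLARITY_CAP = 0.15; // the 15% cap described above

function validateRubric(weights: Record<string, number>): string[] {
  const errors: string[] = [];
  const entries = Object.entries(weights);
  for (const [name, w] of entries) {
    if (!Number.isFinite(w) || w < 0 || w > 1) {
      errors.push(`weight for "${name}" must be between 0 and 1`);
    }
  }
  // Assumed rule: weights are fractions that must add up to 1.
  const sum = entries.reduce((acc, [, w]) => acc + w, 0);
  if (Math.abs(sum - 1) > 1e-9) {
    errors.push(`weights must sum to 1, got ${sum}`);
  }
  // Style-only categories may not dominate the grade.
  const style = weights["Format/Clarity"] ?? 0;
  if (style > FORMAT_CLARITY_CAP) {
    errors.push("Format/Clarity weight exceeds the 15% cap");
  }
  return errors;
}

const ok = validateRubric({ Correctness: 0.6, Completeness: 0.25, "Format/Clarity": 0.15 });
const gamed = validateRubric({ Correctness: 0.5, "Format/Clarity": 0.5 });
console.log(ok.length, gamed);
```

Returning a list of errors instead of throwing on the first problem lets a requester see every issue with a rubric in one round trip.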

## 🔗 Live Run Log: On-Chain Proof (Base Mainnet)

Real CAP grading orders Litmus fulfilled as a **provider**.

**Total real CAP orders: _0_** · _last updated: 2026-06-__

| # | Date | Counterparty (requester) | Amount (USDC) | Order ID | Tx (BaseScan) | Score |
|---|------|--------------------------|---------------|----------|---------------|-------|
| 1 | _2026-06-__ | _Maestro / external_ | _0.00_ | `_ord_…_` | [0x…](https://basescan.org/tx/0x…) | _N_/100 |

> Order IDs + pay tx are in the provider logs and the CROO dashboard. Delete this note once populated.

## 🏗️ Architecture & Tech Stack

| Layer | Technology |
|---|---|
| **Runtime** | Node.js (TypeScript) |
| **Ecosystem** | Constellation A2A (croo-core) |
| **Testing** | Vitest |

## 🚀 Getting Started

### Prerequisites
- Node.js ≥ 20
- npm

### Installation
1. Clone: `git clone https://github.com/edycutjong/litmus.git`
2. Install: `npm install`
3. Configure: `cp .env.example .env.local` and fill in your service ID + an LLM key (skip for mock mode)
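
To make the configuration step concrete, the sketch below shows one way the environment could be mapped to a run mode. The variable names `CROO_MOCK`, `OPENAI_API_KEY`, and `ANTHROPIC_API_KEY` appear in this README; the `resolveMode` helper, the `GraderMode` shape, and the assumption that on-chain mocking and LLM grading are toggled independently are hypothetical.

```typescript
// Hypothetical helper: decide which mode a set of environment variables selects.
type Env = Record<string, string | undefined>;

interface GraderMode {
  onChain: boolean; // false when CROO_MOCK=true (no on-chain calls)
  liveGraders: string[]; // providers that have an API key configured
  mockGrade: boolean; // true when no LLM key is present
}

function resolveMode(env: Env): GraderMode {
  const liveGraders: string[] = [];
  if (env["OPENAI_API_KEY"]) liveGraders.push("openai");
  if (env["ANTHROPIC_API_KEY"]) liveGraders.push("anthropic");
  return {
    onChain: env["CROO_MOCK"] !== "true",
    liveGraders,
    mockGrade: liveGraders.length === 0,
  };
}

// Example: fully offline, no keys configured.
const offline = resolveMode({ CROO_MOCK: "true" });
console.log(offline);
```

In a Node.js process the same function could be called with `process.env`; taking the environment as a parameter keeps it easy to test.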

### ▶️ Run it now: offline mock mode (no wallet, no USDC)
```bash
npm install
CROO_MOCK=true npm run dev # boots the grader provider with no on-chain calls
```
Grading works with **no API key** (deterministic mock grade); set `OPENAI_API_KEY` and/or `ANTHROPIC_API_KEY` to enable the live LLM tribunal. Run `npm run stability` to reproduce the σ < 4 scoring-variance harness.
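
For readers who want to see what a σ < 4 check amounts to, here is a minimal sketch of the underlying arithmetic. It is not the `npm run stability` script itself: the `scoreStdDev` helper, the sample scores, and the use of the population (not sample) standard deviation are assumptions.

```typescript
// Population standard deviation of repeated scores for the same deliverable.
function scoreStdDev(scores: number[]): number {
  if (scores.length === 0) throw new Error("need at least one score");
  const mean = scores.reduce((s, x) => s + x, 0) / scores.length;
  const variance =
    scores.reduce((s, x) => s + (x - mean) * (x - mean), 0) / scores.length;
  return Math.sqrt(variance);
}

// Example: five assumed repeat grades of one deliverable.
const runs = [78, 80, 82, 79, 81];
const sigma = scoreStdDev(runs);
console.log(sigma.toFixed(2), sigma < 4 ? "stable" : "unstable"); // 1.41 stable
```

Here the mean is 80, the squared deviations sum to 10, and the variance is 2, so σ ≈ 1.41, comfortably under the threshold of 4.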

## 🧪 Testing & CI

**4-stage pipeline:** Quality → Security → Build → Deploy Gate

```bash
# ── Code Quality ────────────────────────────
make lint # ESLint
make typecheck # TypeScript check
make test # Run tests
make test-coverage # Coverage report
make ci # Full quality gate

# ── Security ────────────────────────────────
make security-scan # npm audit + license check
```

| Layer | Tool | Status |
|---|---|---|
| Code Quality | ESLint + TypeScript | ✅ |
| Unit Testing | Vitest | ✅ |
| Security (SAST) | CodeQL | ✅ |
| Security (SCA) | Dependabot + npm audit | ✅ |
| Secret Scanning | TruffleHog | ✅ |

## 📁 Project Structure
```text
dorahacks-croo-litmus/
├── docs/ # README assets (hero, screenshots)
├── src/ # Application source code
├── scripts/ # Build and run scripts
├── __tests__/ # Vitest test suites
├── .github/ # CI workflows
└── README.md # You are here
```

## 🚢 Deploy
Containerized for any PaaS. Litmus is a background **worker** (it connects out to the CROO WebSocket, so no inbound port is needed):
```bash
docker build -t litmus .
docker run --env-file .env.local litmus
```

## 📄 License
[MIT](LICENSE) © 2026 Edy Cu

## 🙏 Acknowledgments
Built for the DoraHacks CROO Hackathon 2026.