https://github.com/edycutjong/gauntlet
🧤 Paid certification agent: hires your agent with 7 adversarial probes and delivers a scorecard
a2a agent certification croo testing
Last synced: 14 days ago
- Host: GitHub
- URL: https://github.com/edycutjong/gauntlet
- Owner: edycutjong
- License: mit
- Created: 2026-06-13T12:29:51.000Z (22 days ago)
- Default Branch: main
- Last Pushed: 2026-06-14T02:26:14.000Z (21 days ago)
- Last Synced: 2026-06-14T03:14:14.063Z (21 days ago)
- Topics: a2a, agent, certification, croo, testing
- Language: TypeScript
- Homepage:
- Size: 8.08 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
- Security: SECURITY.md
README
Gauntlet 🧤
Paid certification agent: hires your agent with 7 adversarial probes and delivers a scorecard
[CROO mock network](https://mock.croo.network) · [DoraHacks CROO Hackathon](https://dorahacks.io/hackathon/croo-hackathon) · [CI workflow](https://github.com/edycutjong/gauntlet/actions/workflows/ci.yml)
---
## 📸 See it in Action
> **The Certification Workflow.** Agent Submitted → Gauntlet Pays Agent → Runs 7 Adversarial Probes → Collects Responses → Generates Scorecard PDF.
---
## 💡 The Problem & Solution
How do you know if an AI agent is safe, secure, and performs as advertised before giving it sensitive access?
**Gauntlet** is a Paid Certification Agent that acts as an automated red team for AI agents. You submit an agent, and Gauntlet pays it to execute a series of tasks while covertly injecting 7 adversarial probes (including prompt injection, hallucination testing, and data extraction). Based on how the agent responds, Gauntlet generates a certified security scorecard.
**Key Features:**
- 🛡️ **Adversarial Probing:** Tests agents against 7 distinct attack vectors and failure modes.
- 💸 **Real-World Execution:** Actually hires and pays the target agent to test it in a live environment.
- **Scorecard Generation:** Delivers a comprehensive PDF scorecard detailing vulnerabilities and a final certification grade.
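The probe-and-score flow described above can be sketched roughly as follows. This is a minimal illustration, not Gauntlet's actual implementation: the `Probe` and `ProbeResult` types, the `runProbes` and `scoreOutOf100` helpers, and the scoring rule (share of probes passed) are all assumptions made for the example. Only the three probe categories are taken from this README.

```typescript
// Hypothetical sketch: none of these names come from the Gauntlet codebase.
type ProbeKind = "prompt-injection" | "hallucination" | "data-extraction";

interface Probe {
  kind: ProbeKind;
  prompt: string;
  // Returns true when the reply is judged safe for this probe.
  passes: (reply: string) => boolean;
}

interface ProbeResult {
  kind: ProbeKind;
  passed: boolean;
}

// The agent under test, modelled as a plain async function.
type TargetAgent = (prompt: string) => Promise<string>;

// Send each probe to the agent in turn and record whether it passed.
async function runProbes(agent: TargetAgent, probes: Probe[]): Promise<ProbeResult[]> {
  const results: ProbeResult[] = [];
  for (const probe of probes) {
    const reply = await agent(probe.prompt);
    results.push({ kind: probe.kind, passed: probe.passes(reply) });
  }
  return results;
}

// Assumed scoring rule: share of probes passed, as an integer out of 100.
function scoreOutOf100(results: ProbeResult[]): number {
  if (results.length === 0) return 0;
  const passed = results.filter((r) => r.passed).length;
  return Math.round((passed / results.length) * 100);
}
```

Under this assumed rule, an agent that passes 5 of 7 probes would score 71. A real scorecard would likely weight probes by severity rather than counting them equally.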
## The Constellation: On-Chain A2A Graph
Gauntlet is a **reputation primitive**: it pays real CAP orders to the agent under test, then issues an escrow-backed, on-chain certified scorecard plus a README badge. It can cross-certify every other agent in the constellation, a kind of A2A relationship that is impossible on a flat marketplace (no escrow, no refund-on-failure, no verifiable on-chain provenance).
```mermaid
graph LR
User([Any Agent / User]) -->|hires to certify| G[Gauntlet 🧤]
G -->|7 paid adversarial probes| T[Target Agent]
G -.->|cross-certifies| M[Maestro 🎼]
G -.->|cross-certifies| L[Litmus 🧪]
G -.->|cross-certifies| S[Summon]
classDef hot fill:#F59E0B,stroke:#111,color:#111,font-weight:bold;
class G hot;
```
- **Diversity:** `npm run certify` cross-certifies multiple constellation agents in one run, producing many distinct A2A edges.
- **Escrow integrity:** if a probe run can't complete, the buyer's escrow is refunded rather than charged for a partial scorecard.
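The escrow rule above amounts to an all-or-nothing settlement decision. The sketch below shows one way to express it. It is an assumption-laden illustration: `Settlement` and `settleEscrow` are hypothetical names, and the real CAP/CROO escrow API is not shown in this README.

```typescript
// Hypothetical sketch; this is not the CROO or Gauntlet escrow API.
type Settlement =
  | { action: "release"; amountUsdc: number }
  | { action: "refund"; amountUsdc: number; reason: string };

// Release the escrow only when every planned probe completed;
// otherwise return the full amount to the buyer.
function settleEscrow(
  escrowUsdc: number,
  probesCompleted: number,
  probesPlanned: number,
): Settlement {
  if (probesPlanned <= 0) {
    throw new RangeError("probesPlanned must be positive");
  }
  if (probesCompleted < probesPlanned) {
    return {
      action: "refund",
      amountUsdc: escrowUsdc,
      reason: `only ${probesCompleted}/${probesPlanned} probes completed`,
    };
  }
  return { action: "release", amountUsdc: escrowUsdc };
}
```

Modelling the outcome as a discriminated union forces callers to handle the refund path explicitly instead of treating a partial run as a billable scorecard.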
## Live Run Log: On-Chain Proof (Base Mainnet)
Real CAP orders settled in USDC during the hackathon. Gauntlet **pays** the target agent (probes) *and* is **paid** to deliver the certification, so each run adds rows on both sides.
**Total real CAP orders: _0_** · _last updated: 2026-06-__
| # | Date | Role | Counterparty | Amount (USDC) | Order ID | Tx (BaseScan) | Result |
|---|------|------|--------------|---------------|----------|---------------|--------|
| 1 | _2026-06-__ | Provider (paid) | _requester_ | _0.00_ | `_ord_…_` | [0x…](https://basescan.org/tx/0x…) | scorecard _N_/100 |
| 2 | _2026-06-__ | Requester (probe) | _target agent_ | _0.00_ | `_ord_…_` | [0x…](https://basescan.org/tx/0x…) | probe pass/fail |
> `npm run certify` against live targets prints the order IDs + tx hashes; they're also in the CROO dashboard. Delete this note once populated.
## Architecture & Tech Stack
| Layer | Technology |
|---|---|
| **Runtime** | Node.js (TypeScript) |
| **Ecosystem** | Constellation A2A (croo-core) |
| **PDF Generation** | PDFKit |
| **Testing** | Vitest |
## Getting Started
### Prerequisites
- Node.js ≥ 20
- npm
### Installation
1. Clone: `git clone https://github.com/edycutjong/gauntlet.git`
2. Install: `npm install`
3. Configure: `cp .env.example .env.local` and fill in your service ID (skip for mock mode)
### ▶️ Run it now: offline mock mode (no wallet, no USDC)
```bash
npm install
npm run certify # cross-certifies the constellation, end-to-end
# …or boot the provider + badge server in mock mode:
CROO_MOCK=true npm run dev
```
With the provider running, the live certification badge is served at
`http://localhost:8080/badge?serviceId=` (append your service ID), ready to drop straight into any README.
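As a small illustration of embedding the badge, the helper below builds the Markdown image snippet for a given service ID. The function name `badgeMarkdown`, the default alt text, and the example ID `svc_123` are hypothetical. Only the `/badge` path and the `serviceId` query parameter come from this README.

```typescript
// Hypothetical helper: builds a README image snippet for the badge endpoint.
function badgeMarkdown(
  baseUrl: string,
  serviceId: string,
  altText = "Gauntlet certification",
): string {
  const url = new URL("/badge", baseUrl);
  // searchParams.set handles URL-encoding of the service ID.
  url.searchParams.set("serviceId", serviceId);
  return `![${altText}](${url.toString()})`;
}

// Example with a made-up service ID:
// badgeMarkdown("http://localhost:8080", "svc_123")
// → "![Gauntlet certification](http://localhost:8080/badge?serviceId=svc_123)"
```

For a published README, the base URL would be the deployed provider's public address rather than `localhost`.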
## 🧪 Testing & CI
**4-stage pipeline:** Quality → Security → Build → Deploy Gate
```bash
# ── Code Quality ────────────────────────────
make lint # ESLint
make typecheck # TypeScript check
make test # Run tests
make test-coverage # Coverage report
make ci # Full quality gate
# ── Security ────────────────────────────────
make security-scan # npm audit + license check
```
| Layer | Tool | Status |
|---|---|---|
| Code Quality | ESLint + TypeScript | ✅ |
| Unit Testing | Vitest | ✅ |
| Security (SAST) | CodeQL | ✅ |
| Security (SCA) | Dependabot + npm audit | ✅ |
| Secret Scanning | TruffleHog | ✅ |
## Project Structure
```text
dorahacks-croo-gauntlet/
├── docs/          # README assets (hero, screenshots)
├── src/           # Application source code
├── scripts/       # Build and run scripts
├── __tests__/     # Vitest test suites
├── .github/       # CI workflows
└── README.md      # You are here
```
## 🚢 Deploy
Containerized **web service** with a built-in health check on `/health` and the badge endpoint on `$PORT` (default 8080):
```bash
docker build -t gauntlet .
docker run -p 8080:8080 --env-file .env.local gauntlet
# Health: http://localhost:8080/health
# Badge: http://localhost:8080/badge?serviceId=
```
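After starting the container, a deploy script may want to wait until the health check responds before requesting the badge. The sketch below polls `/health` with retries. It assumes the endpoint returns a 2xx status when the service is ready, which this README does not specify. `waitForHealthy` and the injectable `Fetcher` type are hypothetical names introduced for the example.

```typescript
// Hypothetical helper; assumes /health answers with a 2xx status when ready.
type Fetcher = (url: string) => Promise<{ ok: boolean }>;

async function waitForHealthy(
  baseUrl: string,
  fetcher: Fetcher = (url) => fetch(url), // global fetch, available in Node.js 18+
  attempts = 10,
  delayMs = 500,
): Promise<boolean> {
  const healthUrl = new URL("/health", baseUrl).toString();
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetcher(healthUrl);
      if (res.ok) return true;
    } catch {
      // Connection errors are expected while the container is starting.
    }
    // Pause between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

Passing the fetcher as a parameter keeps the function testable without a running container. In normal use the default global `fetch` is enough, for example `await waitForHealthy("http://localhost:8080")`.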
## License
[MIT](LICENSE) © 2026 Edy Cu
## Acknowledgments
Built for the DoraHacks CROO Hackathon 2026.