{"id":48709093,"url":"https://github.com/fabio-rovai/tardygrada","last_synced_at":"2026-04-11T13:39:07.096Z","repository":{"id":348523401,"uuid":"1197359950","full_name":"fabio-rovai/tardygrada","owner":"fabio-rovai","description":"Trust infrastructure for AI agents. Know who produced a value, when, and that it hasn't been tampered with. Zero dependencies. Pure C.","archived":false,"fork":false,"pushed_at":"2026-04-09T12:48:24.000Z","size":34258,"stargazers_count":17,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-11T13:38:54.888Z","etag":null,"topics":["agent-framework","ai-agents","ai-safety","byzantine-fault-tolerance","c","coq","cryptography","ed25519","formal-verification","hallucination-detection","llm","mcp","mcp-server","ontology","programming-language","verification","zero-dependencies"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fabio-rovai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-31T14:18:51.000Z","updated_at":"2026-04-09T12:48:34.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/fabio-rovai/tardygrada","commit_stats":null,"previous_names":["fabio-rovai/tardygrada"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fabio-rovai/tardygrada","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabio-r
ovai%2Ftardygrada","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabio-rovai%2Ftardygrada/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabio-rovai%2Ftardygrada/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabio-rovai%2Ftardygrada/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fabio-rovai","download_url":"https://codeload.github.com/fabio-rovai/tardygrada/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fabio-rovai%2Ftardygrada/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31682953,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-11T13:07:20.380Z","status":"ssl_error","status_checked_at":"2026-04-11T13:06:47.903Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-framework","ai-agents","ai-safety","byzantine-fault-tolerance","c","coq","cryptography","ed25519","formal-verification","hallucination-detection","llm","mcp","mcp-server","ontology","programming-language","verification","zero-dependencies"],"created_at":"2026-04-11T13:39:06.434Z","updated_at":"2026-04-11T13:39:07.082Z","avatar_url":"https://github.com/fabio-rovai.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![CI](https://github.com/fabio-rovai/tardygrada/actions/workflows/ci.yml/badge.svg)](https://github.com/fabio-rovai/tardygrada/actions/workflows/ci.yml)\n[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"tardygrada-logo.png\" alt=\"Tardygrada\" width=\"200\"\u003e\n\u003c/p\u003e\n\n\u003ch3 align=\"center\"\u003eCatch lazy agents, contradicting claims, and tampered data\u003c/h3\u003e\n\n---\n\n## Your agent says it checked three sources. Did it?\n\nYour document says \"completed on time\" on page 2 and \"delayed 3 months\" on page 7. Did anyone notice?\n\nYour scoring pipeline passed through 5 agents. 
Can you prove the scores weren't changed along the way?\n\n```bash\ngit clone https://github.com/fabio-rovai/tardygrada \u0026\u0026 cd tardygrada \u0026\u0026 make\n\ntardy run \"Paris is in France\"                    # VERIFIED (80%)\ntardy verify-doc report.md                        # 2 contradictions found\ntardy daemon start \u0026\u0026 tardy run \"check this\"      # persistent, remembers everything\n```\n\n---\n\n## What it does\n\n### Catches lazy agents\n\nYour agent claims it queried the knowledge base, consulted sources, and cross-checked. Tardygrada records every operation independently — like a dashcam. If the agent faked it, you'll know.\n\n| Laziness type | What it means | Caught? |\n|---|---|:-:|\n| Did nothing, produced output anyway | NoWork | Yes |\n| Skimmed instead of analyzing | ShallowWork | Yes |\n| Fabricated evidence of work | FakeProof | Yes |\n| Copied another agent's answer | CopiedWork | Yes |\n| \"Verified\" itself in a circle | CircularVerification | Yes |\n\n### Catches contradicting claims\n\n\"The project was completed on time.\" and \"The project was delayed by 3 months.\" — both sound fine alone. Together, they're a contradiction. Existing tools check claims one by one and miss this.\n\nTardygrada checks them together. Three layers:\n- Logical contradictions (direct opposites, impossible combinations)\n- Numeric contradictions (the math doesn't add up)\n- Domain contradictions (the science doesn't work)\n\n```bash\ntardy verify-doc paper.md\n# [CONFLICT] Lines 42 vs 89:\n#   \"We used no external APIs\"\n#   \"API costs totalled $2,400\"\n#   → claims no APIs but reports API costs\n```\n\n### Catches tampered data\n\nA score of 8.5 stored in a Python dict — any agent can silently change it to 9.5. In Tardygrada, values are locked by the operating system. 
Tampering requires breaking SHA-256 or forging an ed25519 signature.\n\n---\n\n## Get started\n\n**Just the CLI:**\n```bash\nmake                                    # builds in \u003c 3 seconds\ntardy run \"your claim here\"             # verify anything\ntardy verify-doc your-file.md           # scan for contradictions\n```\n\n**Persistent mode** (remembers between runs):\n```bash\ntardy daemon start                      # start background service\ntardy run \"claim\"                       # uses persistent knowledge base\ntardy daemon status                     # see what it knows\ntardy daemon stop                       # clean shutdown\n```\n\n**Inside Claude Code** (MCP server):\n```json\n{\n  \"mcpServers\": {\n    \"tardygrada\": {\n      \"command\": \"tardygrada\",\n      \"args\": [\"mcp-bridge\"]\n    }\n  }\n}\n```\nThen just ask: *\"verify this document for contradictions\"*\n\n**Inside Claude Code** (session monitor):\n\n```bash\n/targyactivate\n```\n\nActivates Tardygrada as a contradiction monitor for the entire session. Every claim you and Claude make is recorded in the palace memory and checked against session history. If either side contradicts itself, Tardygrada flags it. Say `targy off` to deactivate.\n\n**Inside Qwen Code** (MCP server):\n\nQwen Code uses newline-delimited JSON-RPC instead of Content-Length framing. Use the included adapter:\n\n```json\n{\n  \"mcpServers\": {\n    \"tardygrada\": {\n      \"command\": \"/bin/bash\",\n      \"args\": [\"path/to/tardygrada/hooks/targy-mcp-wrapper.sh\"]\n    }\n  }\n}\n```\n\nThis gives Qwen Code access to `verify_claim`, `verify_document`, `spawn_agent`, `read_agent`, and `daemon_status` as native MCP tools. 
The wrapper starts the daemon automatically if it isn't running.\n\n**Convert your existing agents:**\n```bash\ntardy terraform /path/to/crewai         # 153K lines → 53 instructions\ntardy terraform /path/to/llamaindex     # 237K lines → 15 instructions\n```\n\n---\n\n## How well does it work?\n\n### Laziness detection\n\n| | Precision | Recall | F1 |\n|---|:-:|:-:|:-:|\n| Clear cases (60 traces) | 1.00 | 1.00 | 1.00 |\n| + Adversarial (100 total) | 1.00 | 0.85 | **0.92** |\n\n100 traces total. Zero false positives. Smart copiers who change 10-15% of the text slip through (similarity below threshold) — a known limitation. No existing tool does any of this.\n\n### Contradiction and hallucination detection\n\n| Dataset | What it is | Tardygrada | Best alternative |\n|---|---|:-:|:-:|\n| Clear contradictions (125) | Designed compositional | **95%** | SelfCheck: 59% |\n| + Borderline cases (225 total) | Soft/ambiguous contradictions | **69%** | SelfCheck: 38% |\n| **AgentHallu (693 trajectories)** | Real agent hallucinations, 7 frameworks | **F1: 0.58** | DeepSeek-V3.1: 0.52 |\n| **ContraDoc (891 docs)** | Real documents, human-annotated | **F1: 0.58** | SelfCheck: 0.16 |\n| HaluEval (500 responses) | Individual factual errors | F1: 0.03 | SelfCheck: 0.32 |\n\nDetection runs in two modes: deterministic (all benchmarks use this) or LLM-enhanced for broader coverage. Typical speeds: 5.7ms/trajectory (AgentHallu), 7.5ms/document (ContraDoc), 0.015ms/case (synthetic).\n\nOn ContraDoc (891 real documents) — **F1 0.58**, up from 0.16 after fixing a bug where the benchmark accidentally used the SelfCheck baseline instead of proper triple checking. Recall jumped from 9.1% to 64.8%.\n\nOn AgentHallu (693 real agent trajectories) — **F1 0.58**, beats DeepSeek-V3.1 (0.52). GPT-5 gets 0.70 but costs per-trajectory API calls.\n\nHaluEval (individual factual errors) — F1 0.03. Expected: our pipeline catches contradictions between claims, not individual factual mistakes. 
SelfCheck does better here (0.32) because its loose heuristics accidentally catch some errors.\n\n\u003e **What runs where:** Contradiction detection (verify-doc, all benchmarks) uses the internal decomposition + consistency + numeric layers — no external calls. Claim grounding (`tardy run \"claim\"`) optionally connects to [open-ontologies](https://github.com/fabio-rovai/open-ontologies) for OWL reasoning, or uses the built-in Datalog engine. Different features, different paths.\n\n\u003cdetails\u003e\n\u003csummary\u003eAgentHallu per-category recall\u003c/summary\u003e\n\n| Category | Recall |\n|---|:-:|\n| Reasoning | 68% |\n| Planning | 66% |\n| Retrieval | 59% |\n| Human-Interaction | 53% |\n| Tool-Use | 21% |\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDetailed breakdown (clear cases)\u003c/summary\u003e\n\n| Difficulty | Detection |\n|---|:-:|\n| Easy (direct opposites) | 100% |\n| Medium (logical) | 100% |\n| Hard (math/physics) | 96% |\n| Subtle (domain knowledge) | 92% |\n| Very subtle (statistical) | 88% |\n\n\u003c/details\u003e\n\n### Scaling\n\n| Agents | Time |\n|-------:|-----:|\n| 5 | 0.6 ms |\n| 500 | 21 ms |\n| 5,000 | 97 ms |\n\n---\n\n## Under the hood\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow verification works\u003c/b\u003e\u003c/summary\u003e\n\n```mermaid\ngraph LR\n    subgraph Pipeline[\"Verification Pipeline\"]\n        direction LR\n        C[\"Claim\"] --\u003e D[\"Decompose\"]\n        D --\u003e G[\"Ground\"]\n        G --\u003e CON[\"Consistency\"]\n        CON --\u003e P[\"Probabilistic\"]\n        P --\u003e PR[\"Protocol\"]\n        PR --\u003e F[\"Certification\"]\n        F --\u003e CR[\"Cross-Rep\"]\n        CR --\u003e W[\"Work Verify\"]\n        W --\u003e V{\"VERIFIED /\u003cbr\u003eCONFLICT /\u003cbr\u003eUNVERIFIABLE\"}\n    end\n\n    style Pipeline fill:transparent\n```\n\nClaims are decomposed into triples, grounded against a knowledge base, checked for consistency, scored 
probabilistically, and verified for work integrity. Eight layers, all deterministic.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow tamper protection works\u003c/b\u003e\u003c/summary\u003e\n\n```mermaid\ngraph LR\n    subgraph Trust[\"Protection Levels\"]\n        direction LR\n        MUT[\"Mutable\"] --\u003e DEF[\"Default\u003cbr\u003e(OS-locked)\"]\n        DEF --\u003e VER[\"Verified\u003cbr\u003e(+ SHA-256)\"]\n        VER --\u003e HARD[\"Hardened\u003cbr\u003e(+ replicas)\"]\n        HARD --\u003e SOV[\"Sovereign\u003cbr\u003e(+ ed25519 + BFT)\"]\n    end\n\n    style Trust fill:transparent\n```\n\nValues are protected at the operating system level. The OS kernel enforces read-only memory. SHA-256 hashes detect any change. Ed25519 signatures prove authorship. BFT consensus requires corrupting multiple independent replicas.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow the daemon works\u003c/b\u003e\u003c/summary\u003e\n\n```mermaid\ngraph TB\n    subgraph visible[\"What you see\"]\n        USER[\"You\"] --\u003e CLI[\"tardy run / verify-doc\"]\n    end\n\n    subgraph hidden[\"What happens\"]\n        CLI --\u003e DAEMON[\"Persistent daemon\"]\n        DAEMON --\u003e AGENTS[\"Living agents\"]\n        DAEMON --\u003e KB[\"Growing knowledge base\"]\n        DAEMON --\u003e VERIFY[\"Verification pipeline\"]\n    end\n\n    style visible fill:transparent\n    style hidden fill:transparent\n```\n\nThe daemon keeps agents alive between commands. The knowledge base grows as verified claims accumulate. 
Sovereign agents persist to disk on shutdown and reload on restart.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eArchitecture\u003c/b\u003e\u003c/summary\u003e\n\n```mermaid\ngraph TB\n    subgraph Tardygrada[\"Tardygrada\"]\n        CLI_CMD[\"CLI\"] --\u003e DAEMON_S[\"Daemon\"]\n        DAEMON_S --\u003e VM[\"VM Core\"]\n        VM --\u003e VERIFY_S[\"Verification\"]\n        VM --\u003e ONTO[\"Knowledge Base\"]\n        VM --\u003e CRYPTO_S[\"Cryptography\"]\n        VERIFY_S --\u003e DECOMP_S[\"Decompose\"]\n        VERIFY_S --\u003e NUMERIC_S[\"Numeric Check\"]\n        VERIFY_S --\u003e DOMAIN_S[\"Domain Check\"]\n        VERIFY_S --\u003e WORK_S[\"Work Verify\"]\n    end\n\n    subgraph External[\"Optional integrations\"]\n        BITF[\"brain-in-the-fish\u003cbr\u003e(multi-agent debate)\"]\n        OO[\"open-ontologies\u003cbr\u003e(OWL reasoning)\"]\n    end\n\n    VM -- \"coordinate\" --\u003e BITF\n    VM -- \"grounded_in\" --\u003e OO\n\n    style Tardygrada fill:transparent\n    style External fill:transparent\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eThe language (for power users)\u003c/b\u003e\u003c/summary\u003e\n\n```\nagent MedicalAdvisor @sovereign @semantics(truth.min_confidence: 0.99) {\n    invariant(trust_min: @verified)\n    let diagnosis: Fact = receive(\"symptom analysis\") grounded_in(medical) @verified\n    let data: str = exec(\"sqlite3 patients.db 'SELECT * FROM current'\")\n    coordinate {analyzer, validator} on(\"verify diagnosis\") consensus(ProofWeight)\n}\n```\n\nEvery value is an agent. Programs compile to servers. `receive()` accepts claims from external systems. `@sovereign` means the value is cryptographically signed and replicated. `coordinate` dispatches to multi-agent debate.\n\nYou don't need to learn this to use Tardygrada. 
The CLI and daemon handle everything.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eReproduce all evaluations\u003c/b\u003e\u003c/summary\u003e\n\n```bash\ncd evaluation \u0026\u0026 make\n./laziness_bench           # 60 traces, F1 1.00\n./hallucination_bench      # 500 cases, 95% compositional\n./scaling_bench            # 5→5000 agents, linear\n./ablation_bench           # layer-by-layer analysis\n./contradoc_bench          # 891 real documents (external)\n./halueval_bench           # 500 HaluEval examples (external)\n```\n\n\u003c/details\u003e\n\n---\n\n## Research\n\nBuilt on: [AgentSpec](https://arxiv.org/abs/2503.18666) (ICSE 2026), [Bythos](https://arxiv.org/abs/2302.01527) (Coq BFT), Minsky frames (1974), CRDTs (Shapiro 2011), Datalog (1986).\n\nEvaluated against: [SelfCheckGPT](https://arxiv.org/abs/2303.08896) (EMNLP 2023), [FActScore](https://aclanthology.org/2023.emnlp-main.741/) (EMNLP 2023), [ContraDoc](https://aclanthology.org/2024.naacl-long.362/) (NAACL 2024), [HaluEval](https://huggingface.co/datasets/pminervini/HaluEval).\n\nRelated: [Mündler et al.](https://arxiv.org/abs/2305.15852) (ICLR 2024), [Fang et al.](https://arxiv.org/abs/2409.11283) (AAAI 2025), [He et al.](https://arxiv.org/abs/2601.13600) (2026).\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabio-rovai%2Ftardygrada","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffabio-rovai%2Ftardygrada","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffabio-rovai%2Ftardygrada/lists"}