# ThumbGate



**AI agents repeat mistakes. In regulated industries, one wrong action is a liability event.**

ThumbGate is deterministic pre-action governance for AI agents. From developer workflows to legal intake to financial compliance, one rule blocks unauthorized actions before they execute, across every session, every agent, every model.

The product is a self-improving enforcement layer: thumbs-down feedback, prompt evaluation, and proof from prior runs become prevention rules that permanently stop repeated failures before the next tool call.

```
Agent tries: rm -rf tests/
ThumbGate: ⛔ BLOCKED - "Never delete test directories"
Pattern matched: rm.*-rf.*tests
Source: your thumbs-down from last Tuesday
Tokens spent on this repeat: 0
```

```bash
npx thumbgate init # auto-detects your agent, wires hooks, 30 seconds
```

Works with **Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode** and any MCP-compatible agent.

### Add ThumbGate to Claude (remote connector, 30 seconds, no install)

ThumbGate is a hosted remote MCP server. To use it in **Claude.ai / Claude Desktop**:
**Settings → Connectors → Add custom connector**, then paste:

```
https://thumbgate.ai/mcp
```

That's it: Claude can now call ThumbGate's gate-check and feedback tools directly.
For local/CLI agents (Claude Code, Cursor, Codex, …) use `npx thumbgate init`, which
auto-wires the hooks. (The same server is published to the [MCP Registry](https://registry.modelcontextprotocol.io) as `io.github.IgorGanapolsky/thumbgate`.)

**Free:** 5 feedback captures/day (25 total captures), 3 active auto-promoted prevention rules, all MCP integrations, local-first.
**[Pro ($19/mo or $149/yr)](https://thumbgate.ai/checkout/pro?utm_source=github&utm_medium=readme):** no limits on captures or rules, history-aware lessons, feedback sessions, hosted dashboard, DPO export.
**Team ($49/seat/mo):** shared hosted lesson DB, org dashboard, approval boundaries.

[![CI](https://github.com/IgorGanapolsky/ThumbGate/actions/workflows/ci.yml/badge.svg)](https://github.com/IgorGanapolsky/ThumbGate/actions/workflows/ci.yml)
[![npm](https://img.shields.io/npm/v/thumbgate)](https://www.npmjs.com/package/thumbgate)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

---

> **Visibility isn't trust.** A dashboard shows you what an agent did; it doesn't stop the agent from doing it again. ThumbGate is the enforcement layer: PreToolUse gates, thumbs-down → rule, and an audit trail on every interception, so a mistake gets blocked, not just logged.
>
> Published in the [MCP Registry](https://registry.modelcontextprotocol.io) (`io.github.IgorGanapolsky/thumbgate`) and usable as a one-line Claude connector.

---

## Agentic development cycle fit

Agentic development is becoming a loop: **Guide → Generate → Verify → Solve**. ThumbGate gives that loop a hard execution boundary.

- **Guide:** standards, prior thumbs-downs, and approval policies become concrete context.
- **Generate:** Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode, and MCP agents keep producing plans and tool calls.
- **Verify:** risky actions need evidence before execution, not just after PR review.
- **Solve:** blocked failures become reusable lessons, shared prevention rules, DPO exports, and audit events.

In that stack, ThumbGate is the pre-action gate between generated intent and executed action.

---

## 🎬 90-second demo

Watch the force-push scenario: the agent tries to `git push --force`, you give one thumbs-down, and in the next session it's blocked, with zero tokens spent on the repeat.

[**▶ Watch the 90-second demo**](https://thumbgate.ai/#demo?utm_source=github&utm_medium=readme&utm_campaign=demo_video) · [Script](docs/marketing/demo-video-script.md) · [ElevenLabs narration: `npm run demo:voiceover`](scripts/generate-demo-voiceover.js)

---

## First-dollar activation path

If someone is not already bought into ThumbGate, do not lead with architecture. Lead with one repeated mistake.

1. **Show the pain:** open the **[ThumbGate GPT](https://thumbgate.ai/go/gpt?utm_source=github&utm_medium=readme&utm_campaign=first_dollar_activation&cta_id=readme_first_dollar_open_gpt&cta_placement=readme_first_dollar)** and paste the bad answer, risky command, deploy, PR action, or agent plan before it runs again.
2. **Capture the lesson:** type `thumbs down:` or `thumbs up:` with one concrete sentence. Native ChatGPT rating buttons are not the ThumbGate capture path; typed feedback is.
3. **Enforce the repeat:** run `npx thumbgate init` where the agent executes so the lesson can become one of your Pre-Action Checks instead of another reminder.
4. **Upgrade only after proof:** Solo Pro is for the dashboard, DPO export, proof-ready evidence, and higher capture limits after one real blocked repeat. Team starts with the Workflow Hardening Sprint around one repeated failure, one owner, and one proof review.

The buying question is simple: **what repeated AI mistake would be worth blocking before the next tool call?**

---

## The Problem: the bill nobody talks about

Frontier-model calls are not cheap. Sonnet 4.5 is ~$3 / 1M input tokens and ~$15 / 1M output tokens. Opus is 5× that. Every time your agent:

- hallucinates a function name and you have to correct it,
- retries the same failing tool call until it gives up,
- regenerates a 4,000-token plan you already approved last session,
- repeats a destructive command you blocked manually yesterday,

…you are paying for that round-trip. *Twice if it retries. Three times if you re-prompt.* And the agent has no memory across sessions, so the meter resets every Monday.

```
Session 1: Agent force-pushes to main. You fix it. +4,200 tokens
Session 2: Agent force-pushes again. You fix it. +4,200 tokens
Session 3: Same mistake. Again. You lose 45m. +5,800 tokens
```

That's 14,200 tokens, or ~$0.21 at the output-token rate, just to fix the same mistake three times, multiplied by every developer, every repeated-mistake class, every week. The math gets ugly fast.
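
The arithmetic behind that figure can be checked directly. This is a minimal sketch that assumes every token in the three sessions is billed at the Sonnet output rate quoted above (~$15 per 1M tokens), which is the upper bound; the function and constant names are illustrative only.

```python
# Token counts from the three sessions shown above.
SESSION_TOKENS = [4_200, 4_200, 5_800]

# Assumption: all tokens billed at the quoted output rate (USD per 1M tokens).
OUTPUT_RATE_PER_MILLION = 15.0


def repeat_cost_usd(token_counts, rate_per_million):
    """Return the dollar cost of a list of per-session token counts."""
    return sum(token_counts) * rate_per_million / 1_000_000


total_tokens = sum(SESSION_TOKENS)
cost = repeat_cost_usd(SESSION_TOKENS, OUTPUT_RATE_PER_MILLION)
print(f"{total_tokens} tokens = ${cost:.2f}")
```

Billing the same tokens at the input rate instead would give a lower figure, so the real cost of a repeat depends on the input/output mix.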

## The Solution: fix it once, the bill never sees it again

```
Session 1: Agent force-pushes to main. You 👎 it. +4,200 tokens
Session 2: ⛔ Check blocks the force-push. Zero round-trip. +0 tokens
Session 3+: Never happens again. +0 tokens
```

One thumbs-down. The PreToolUse hook intercepts the tool call **before** it executes, so the repeat never triggers a fix-up round-trip: no input tokens, no output tokens, no retry loop. The dashboard tracks **tokens saved this week** as a live counter so you can see exactly what your prevention rules are worth. Mark a review checkpoint once, and the dashboard narrows the next pass to only the feedback, lessons, and check blocks that landed since your last review.

ThumbGate doesn't make your agent smarter. It makes your agent *cheaper to be wrong with.*

---

## Quick Start

```bash
npx thumbgate init # auto-detects your agent, wires everything
npx thumbgate capture --feedback=down --context="Never run DROP on production tables"
```

The `capture` command creates a prevention rule. The next time any AI agent tries to run `DROP` on production:

```
⛔ Check blocked: "Never run DROP on production tables"
Pattern: DROP.*production
Verdict: BLOCK
```
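
To make the pattern-match step concrete, here is a minimal sketch of a deterministic check using the `DROP.*production` pattern from the output above. The `Rule` shape and the `check` function are hypothetical illustrations, not ThumbGate's internal API.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: the lesson text plus the regex derived from it."""
    lesson: str
    pattern: str


def check(command: str, rules: list[Rule]) -> tuple[str, Rule | None]:
    """Return ("BLOCK", rule) for the first matching rule, else ("ALLOW", None)."""
    for rule in rules:
        if re.search(rule.pattern, command):
            return "BLOCK", rule
    return "ALLOW", None


rules = [Rule("Never run DROP on production tables", r"DROP.*production")]

verdict, matched = check("psql -c 'DROP TABLE production.users'", rules)
print(verdict, "-", matched.lesson if matched else "no rule matched")
```

The same input always produces the same verdict, which is the property the "deterministic" claim rests on.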

---

## Architecture

ThumbGate operates as a 4-layer enforcement stack between your AI agent and your codebase:

![ThumbGate Architecture](docs/diagrams/thumbgate_architecture.png)

### Layer 1: Feedback Capture
Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.

### Layer 2: Check Engine
The check engine converts lessons into enforceable rules. **The runtime gate decision is deterministic**: literal pattern match → AST match → scoped rule lookup. No LLM call on the enforcement path.

Where retrieval is needed (an agent is about to run a destructive command not on the literal block list, but semantically similar to one we've blocked before), ThumbGate uses local CPU-only `bge-small` embeddings via LanceDB's built-in pipeline. No external API call, no inference cost beyond CPU. So **"no LLM in enforcement"** holds: the gate decision uses no LLM; the rule corpus is just searchable via local embeddings.
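
A rough sketch of that retrieval step, under loud assumptions: the three-dimensional vectors below are toy stand-ins for `bge-small` embeddings, the corpus is a plain list rather than a LanceDB table, and the `0.8` threshold is an arbitrary illustrative value.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_rule(query_vec, corpus, threshold=0.8):
    """Return (rule_text, score) for the closest rule at or above threshold, else None."""
    best = max(corpus, key=lambda item: cosine(query_vec, item[1]), default=None)
    if best is None:
        return None
    score = cosine(query_vec, best[1])
    return (best[0], score) if score >= threshold else None


# Toy corpus: (rule text, made-up embedding).
corpus = [
    ("Never delete test directories", [0.9, 0.1, 0.0]),
    ("Never force-push to main", [0.0, 0.2, 0.9]),
]

hit = nearest_rule([0.8, 0.2, 0.1], corpus)
print(hit[0] if hit else "no similar rule")
```

The search only surfaces a candidate rule; whether that rule then blocks is still a deterministic decision.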

**Thompson Sampling tunes per-rule confidence weights** for soft-gating rules so high-noise rules quiet down and high-signal rules sharpen. It never decides *whether* a rule fires: a hard rule like "block `git push --force` on main" always fires deterministically. Bandit exploration would be terrifying for hard rules; we don't do it.
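
For readers unfamiliar with the technique, here is a minimal Beta-Bernoulli Thompson Sampling sketch for soft-rule weights. The `SoftRule` class, its field names, and the use of the weight for ranking are assumptions made for illustration; the source does not describe how ThumbGate stores or applies these weights.

```python
import random
from dataclasses import dataclass


@dataclass
class SoftRule:
    """Hypothetical soft-gating rule with a Beta(helpful, noisy) posterior."""
    name: str
    helpful: int = 1  # prior pseudo-count of useful firings
    noisy: int = 1    # prior pseudo-count of false alarms

    def record(self, was_helpful: bool) -> None:
        """Update the posterior from one observed outcome."""
        if was_helpful:
            self.helpful += 1
        else:
            self.noisy += 1

    def mean_weight(self) -> float:
        """Posterior mean of the rule's confidence."""
        return self.helpful / (self.helpful + self.noisy)

    def sample_weight(self, rng: random.Random) -> float:
        """Thompson Sampling step: draw from the posterior instead of using the mean."""
        return rng.betavariate(self.helpful, self.noisy)


def rank_soft_rules(rules, rng):
    """Order soft rules by a sampled confidence weight, highest first."""
    return sorted(rules, key=lambda r: r.sample_weight(rng), reverse=True)


sharp = SoftRule("warn-on-large-diff")
noisy = SoftRule("warn-on-any-rename")
for _ in range(20):
    sharp.record(True)
    noisy.record(False)

rng = random.Random(0)
print([r.name for r in rank_soft_rules([noisy, sharp], rng)])
```

Hard rules never pass through this path in the sketch, matching the statement that sampling does not decide whether a hard rule fires.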

Rules stay in local ThumbGate runtime state.

### Layer 3: Pre-Action Interception
Before any agent action executes, ThumbGate's `PreToolUse` hook intercepts the command and evaluates it against all active checks. This happens at the MCP protocol level, so the agent physically cannot bypass it.
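
The interception step can be sketched as a function from a hook payload to an exit code. This assumes the Claude Code hook convention, in which the hook receives a JSON payload with `tool_name` and `tool_input` and a blocking hook exits with code 2; the `BLOCK_PATTERNS` store and `evaluate` function are hypothetical and not ThumbGate's actual hook.

```python
from __future__ import annotations

import json
import re

# Hypothetical rule store: regex -> lesson text.
BLOCK_PATTERNS = {
    r"git\s+push\s+.*--force": "Never force-push",
}

ALLOW, BLOCK = 0, 2  # assumed exit codes: 0 lets the call through, 2 blocks it


def evaluate(payload: dict) -> tuple[int, str]:
    """Map a PreToolUse-style payload to (exit_code, message)."""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern, lesson in BLOCK_PATTERNS.items():
        if re.search(pattern, command):
            return BLOCK, f"BLOCKED: {lesson}"
    return ALLOW, ""


# A real hook would read this JSON from stdin; a literal is used here.
sample = json.loads(
    '{"tool_name": "Bash", "tool_input": {"command": "git push --force origin main"}}'
)
exit_code, message = evaluate(sample)
print(exit_code, message)
```

In a real hook script the message would go to stderr and the process would exit with `exit_code`.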

### Layer 4: Multi-Agent Distribution (the actual moat vs hand-rolled hooks)
Claude Code already ships `permissions.deny` and `PreToolUse` hooks. Cursor and Codex have their own. So why ThumbGate over a hand-written hook?

Two things hand-written hooks structurally cannot do:

1. **Cross-agent propagation.** A `permissions.deny` pattern lives in one agent's config and stays there. ThumbGate's checks distribute across every connected agent over MCP stdio: thumbs-down once in Cursor, and the same pattern blocks on Claude Code, Codex, Gemini CLI, Cline, OpenCode, and Amp in the next session, with no copy-paste between configs.
2. **Learning loop.** A hand-written hook covers exactly the patterns you wrote. ThumbGate promotes every thumbs-down into a fresh rule, tunes existing rules' confidence weights from outcomes (Thompson Sampling, see Layer 2), and pulls semantically-near patterns into scope via local embeddings. The rule corpus sharpens without an operator hand-writing a regex for every new mistake shape.

Hand-rolled hooks are the right tool for a small, static denylist you maintain by hand. ThumbGate is the right tool when you want corrections from any agent to harden every agent automatically.

Prompt engineering still matters, but it is only the starting point. ThumbGate adds prompt evaluation on top: proof lanes, benchmarks, and self-heal checks tell you whether your prompt and workflow actually held up under execution instead of leaving you to guess from vibes. Run `npx thumbgate eval --from-feedback --write-report=.thumbgate/prompt-eval-proof.md` to turn real thumbs-up/down feedback into reusable eval cases and a buyer-ready proof report.

### Managed model benchmark lane

When a new managed model drops, do not swap ThumbGate over on vendor claims alone. Rank it against the actual ThumbGate workload first:

```bash
npx thumbgate model-candidates --workload=pretool-gating --json
npx thumbgate model-candidates --workload=long-trace-review --provider=openai-compatible --gateway=tinker --json
```

The catalog currently includes the April 23, 2026 Tinker additions:

- `tinker/qwen3.6-35b-a3b` for pre-action gating, agentic coding, and tool-use
- `tinker/qwen3.6-27b` for the cheap fast-path
- `tinker/kimi-k2.6-128k` for long-trace review and multi-agent sessions

Each recommendation ships with the benchmark commands to run next: feedback-derived prompt eval, `gate-eval`, and `thumbgate bench`. For whole-repo clone claims, add `npx thumbgate bench --programbench-smoke` to generate a ProgramBench-style cleanroom proof report without claiming an official ProgramBench score. That keeps model selection evidence-backed instead of hype-driven.

![Feedback Pipeline](docs/diagrams/feedback_pipeline.png)

![Agent Integration](docs/diagrams/agent_integration.png)

---

## Install for Your Agent

| Agent | Command |
|-------|---------|
| **Claude Code** | `npx thumbgate init --agent claude-code` |
| **Cursor** | `npx thumbgate init --agent cursor` |
| **VS Code / Open VSX** | [plugins/vscode-extension/README.md](plugins/vscode-extension/README.md) |
| **Antigravity-compatible** | [plugins/antigravity-extension/INSTALL.md](plugins/antigravity-extension/INSTALL.md) |
| **JetBrains** | [plugins/jetbrains-plugin/README.md](plugins/jetbrains-plugin/README.md) |
| **Codex** | `npx thumbgate init --agent codex` |
| **Gemini CLI** | `npx thumbgate init --agent gemini` |
| **Amp** | `npx thumbgate init --agent amp` |
| **Cline** (Roo Code successor) | `npx thumbgate init --agent cline` |
| **Claude Desktop** | [Download extension bundle](https://github.com/IgorGanapolsky/ThumbGate/releases/latest/download/thumbgate-claude-desktop.mcpb) |
| **Any MCP agent** | `npx thumbgate serve` |

Works with **Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode**, and any MCP-compatible agent. Migrating from Roo Code (sunsetting 2026-05-15)? See [`adapters/cline/INSTALL.md`](./adapters/cline/INSTALL.md).

### Install scope: machine-wide vs per-project

ThumbGate supports two install scopes. Pick once when you install; you can switch later by re-running with the other flag.

| Scope | Command | Settings file | Lesson DB + dashboard live in | When to use |
|-------|---------|---------------|--------------------------------|-------------|
| **Machine-wide** (default) | `npx thumbgate init` | `~/.claude/settings.json` | `~/.claude/memory/feedback/` | Solo dev: **one shared dashboard across every repo on this machine**. A lesson learned in `repo-A` blocks the same mistake in `repo-B` automatically. |
| **Per-project** | `npx thumbgate init --project` (in the repo root) | `<repo>/.claude/settings.json` | `<repo>/.claude/memory/feedback/` | Client work, compliance, or multi-tenant: **separate dashboard per repo**, lessons stay isolated, audit trail belongs to the repo. |

Both scopes write `mcpServers.thumbgate` + the PreToolUse / UserPromptSubmit / PostToolUse / SessionStart hooks; the only difference is *where*. Machine-wide is the right default for most developers. Switch to `--project` only when you have a reason to keep lessons from bleeding between repos.

> Per-project lesson DBs live under each repo's `.claude/memory/feedback/` and **must stay gitignored**: they're a runtime store, not source. ThumbGate's bundled `.gitignore` template handles this.

### Status bar proof

![Claude Code ThumbGate footer](public/assets/claude-thumbgate-statusbar.svg)

![Codex ThumbGate test lane](public/assets/codex-thumbgate-statusbar-test.svg)

Claude renders the live ThumbGate footer today. `npx thumbgate init --agent codex` now installs the full Codex hook bundle and writes the ThumbGate `statusLine` target into `~/.codex/config.json` so you can test it on your local Codex build immediately.

### Install Codex Plugin

Open the Codex plugin install page or download the standalone bundle from GitHub Releases. The Codex launcher resolves `thumbgate@latest` when MCP and hooks start, so published npm fixes reach active Codex installs without hand-editing `~/.codex/config.toml`.

1. Install page: [thumbgate.ai/codex-plugin](https://thumbgate.ai/codex-plugin)
2. Direct zip: [thumbgate-codex-plugin.zip](https://github.com/IgorGanapolsky/ThumbGate/releases/latest/download/thumbgate-codex-plugin.zip)
3. Follow: [plugins/codex-profile/INSTALL.md](plugins/codex-profile/INSTALL.md)

---

## How It Works

```
STEP 1                 STEP 2                    STEP 3
────────               ────────                  ────────

You react              ThumbGate learns          The check holds

👎 on a bad      ──►   Feedback becomes    ──►   Next time the
agent action           a saved lesson            agent tries the
                       and a block rule          same thing:
👍 on a good     ──►   Good pattern gets         ⛔ BLOCKED
agent action           reinforced                (or ✅ allowed)
```

No manual rule-writing. No config files. Your reactions teach the agent what your team actually wants.

---

ThumbGate sells four concrete outcomes:

- **Prevent expensive AI mistakes:** catch bad commands, destructive database actions, unsafe publishes, and risky API calls before they run.
- **Make AI stop repeating mistakes:** fix it once, turn the lesson into a rule, and block the repeat before the next tool call lands.
- **Turn AI into a reliable operator:** move from a smart assistant that apologizes after damage to a production-ready operator with checkpoints, proof, and enforcement.
- **Measure prompts instead of rewriting them blindly:** use `thumbgate eval --from-feedback`, proof lanes, ThumbGate Bench, and `self-heal:check` to evaluate whether prompts and workflows actually improved behavior.

---

## Use Cases

### Developer Workflows
- **Stop force-push to main:** Check blocks `git push --force` on protected branches before it runs
- **Prevent repeated migration failures:** Each mistake becomes a searchable lesson that fires before the next attempt
- **Block unauthorized file edits:** Control which files agents can touch with path-based rules
- **Memory across sessions:** The agent remembers your feedback from yesterday
- **Shared team safety:** One developer's thumbs-down protects the whole team
- **Auto-improving without feedback:** Self-improvement mode evaluates outcomes and generates rules automatically

### Enterprise & Regulated Industries
- **Legal AI intake governance:** Block unauthorized practice of law (ABA Rule 5.5), require conflict-of-interest clearance before fact collection (Rules 1.7/1.9/1.10), prevent privileged content from leaving firm boundaries (Rule 1.6)
- **Financial compliance:** Gate AI-generated trade recommendations, block unauthorized disclosures, enforce approval chains before customer-facing outputs
- **Healthcare:** Prevent AI agents from providing medical diagnoses, enforce HIPAA-compliant data routing, require clinician review before patient-facing content
- **Audit trail:** Every gate decision (block, allow, reroute) is preserved with rule version, timestamp, and reviewer path for compliance review

[See the legal-intake demo →](https://thumbgate.ai/dashboard)
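
As an illustration of what one audit-trail entry might carry, the sketch below serializes a gate decision as a single JSON line. The field names and the `GateDecision` class are hypothetical; only the listed contents (verdict, rule version, timestamp, reviewer path) come from the description above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone

VERDICTS = ("block", "allow", "reroute")


@dataclass(frozen=True)
class GateDecision:
    """Hypothetical audit record for one gate decision."""
    verdict: str
    rule_id: str
    rule_version: int
    reviewer_path: str
    timestamp: str

    def __post_init__(self):
        # Reject anything other than the three documented verdicts.
        if self.verdict not in VERDICTS:
            raise ValueError(f"unknown verdict: {self.verdict}")


def audit_line(decision: GateDecision) -> str:
    """Serialize a decision as one JSON line for an append-only log."""
    return json.dumps(asdict(decision), sort_keys=True)


decision = GateDecision(
    verdict="block",
    rule_id="no-privileged-export",       # illustrative rule name
    rule_version=3,
    reviewer_path="intake/partner-review",  # illustrative reviewer path
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(audit_line(decision))
```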

---

## Built-in Checks

```
⛔ force-push → blocks git push --force
⛔ protected-branch → blocks direct push to main
⛔ unresolved-threads → blocks push with open reviews
⛔ package-lock-reset → blocks destructive lock edits
⛔ env-file-edit → blocks .env secret exposure

+ custom prevention rules for project-specific failures
```

---

## CLI Reference

```bash
npx thumbgate init # detect agent, wire hooks
npx thumbgate doctor # health check
npx thumbgate capture --feedback=up|down --context="" # capture a signal as a stored lesson
npx thumbgate lessons # see what's been learned
npx thumbgate explore # terminal explorer for lessons, checks, stats
npx thumbgate background-governance # review background-agent run risk
npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json # evaluate GPT-5.5 routing
npx thumbgate native-messaging-audit # inspect local browser bridges and extension hosts
npx thumbgate dashboard # open local dashboard
npx thumbgate serve # start MCP server on stdio
npx thumbgate bench # run reliability benchmark
npx thumbgate bench --programbench-smoke # include cleanroom whole-repo proof lane
```

---

## Pricing

| | Free | Pro ($19/mo) | Team ($49/seat/mo) | Enterprise |
|---|---|---|---|---|
| Local CLI + enforced checks | ✅ | ✅ | ✅ | ✅ |
| Feedback captures | 5/day, 25 total | Unlimited | Unlimited | Unlimited |
| Auto-promoted prevention rules | 3 active | Unlimited | Unlimited | Unlimited |
| MCP agent integrations | All | All | All | All |
| Personal dashboard | - | ✅ | ✅ | ✅ |
| DPO export (model fine-tuning) | - | ✅ | ✅ | ✅ |
| Team lesson export/import | - | ✅ | ✅ | ✅ |
| Shared hosted lesson DB | - | - | ✅ | ✅ |
| Org-wide dashboard | - | - | ✅ | ✅ |
| Approval + audit proof | - | - | ✅ | ✅ |
| Regulatory gate templates | - | - | - | ✅ |
| Custom policy layers (firm/practice-area) | - | - | - | ✅ |
| Compliance audit export | - | - | - | ✅ |
| Dedicated onboarding + SLA | - | - | - | ✅ |

The free tier gives you 5 feedback captures/day, 25 total captures, and up to 3 active auto-promoted prevention rules: enough to prove value without replacing Pro for daily operators. MCP integrations for all agents (Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode) ship free.

Pro ($19/mo or $149/yr) removes the capture/rule caps and adds history-aware lesson recall, lesson search, DPO export, hosted sync, and a personal dashboard. Team ($49/seat/mo) adds a shared hosted lesson DB, org dashboard, and shared enforcement across the org. Enterprise adds regulatory gate templates (legal intake, financial compliance, healthcare), custom policy layers scoped to firm/practice-area, compliance audit export, and dedicated onboarding with SLA.

**Best first paid motion for teams:** the **Workflow Hardening Sprint**, which qualifies one repeated failure before you commit to a full rollout. **[Start intake →](https://thumbgate.ai/?utm_source=github&utm_medium=readme&utm_campaign=team_rollout#workflow-sprint-intake)**

**Best first technical motion:** install the CLI first and let `init` wire hooks for the agent you already use.

**Paid path for individual operators:** [ThumbGate Pro](https://thumbgate.ai/checkout/pro?utm_source=github&utm_medium=readme&utm_campaign=pro_page) is the self-serve side lane for a personal dashboard and export-ready evidence.

**[Start free](https://thumbgate.ai/?utm_source=github&utm_medium=readme)** · **[See Pro](https://thumbgate.ai/checkout/pro?utm_source=github&utm_medium=readme)** · **[Team Sprint intake](https://thumbgate.ai/?utm_source=github&utm_medium=readme#workflow-sprint-intake)**

---

## Team Lesson Sharing (Pro + Team)

One team's hard-won lessons shouldn't stay trapped on one laptop. ThumbGate Pro and Team can export lessons as portable bundles and import them into any other ThumbGate instance, so a mistake caught by Team A becomes a prevention rule for Team B.

**Export lessons from one project:**

```bash
curl -X POST http://localhost:3456/v1/lessons/export \
-H "Authorization: Bearer $THUMBGATE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"outputPath": "./lessons-export.json"}'
```

Filter by signal or tags:

```bash
curl -X POST http://localhost:3456/v1/lessons/export \
-H "Authorization: Bearer $THUMBGATE_API_KEY" \
-H "Content-Type: application/json" \
-d '{"signal": "down", "tags": ["push-notifications", "ci"]}'
```

**Import into another team's ThumbGate:**

```bash
curl -X POST http://localhost:3456/v1/lessons/import \
-H "Authorization: Bearer $THUMBGATE_API_KEY" \
-H "Content-Type: application/json" \
-d @lessons-export.json
```

What happens on import:
- **Deduplication:** lessons with the same ID or title+signal are skipped
- **Provenance tracking:** every imported lesson is tagged `team-import` with original source project, export timestamp, and original ID
- **No overwrite:** import is additive; existing lessons are never modified

The export bundle includes full lesson metadata: signal, title, context, tags, failure type, skill, structured rules, and diagnosis. It's the same data you see in the lesson detail dashboard, portable as JSON.
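
The three import guarantees (deduplication, provenance, no overwrite) can be sketched as a pure merge function. The lesson field names (`id`, `title`, `signal`, `tags`, `provenance`) are assumptions about the bundle shape made for illustration, not the documented schema.

```python
def import_lessons(existing, bundle, source_project, exported_at):
    """Merge a bundle into existing lessons; return (merged, skipped_count).

    Existing lessons are never modified, duplicates (same ID or same
    title+signal) are skipped, and each imported lesson gets a
    `team-import` tag plus provenance metadata.
    """
    seen_ids = {lesson["id"] for lesson in existing}
    seen_keys = {(lesson["title"], lesson["signal"]) for lesson in existing}
    merged = list(existing)
    skipped = 0
    for lesson in bundle:
        key = (lesson["title"], lesson["signal"])
        if lesson["id"] in seen_ids or key in seen_keys:
            skipped += 1
            continue
        imported = dict(lesson)  # copy so the input bundle is left untouched
        imported["tags"] = list(lesson.get("tags", [])) + ["team-import"]
        imported["provenance"] = {
            "source_project": source_project,
            "exported_at": exported_at,
            "original_id": lesson["id"],
        }
        merged.append(imported)
        seen_ids.add(lesson["id"])
        seen_keys.add(key)
    return merged, skipped


existing = [{"id": "a1", "title": "No force-push", "signal": "down", "tags": ["git"]}]
bundle = [
    {"id": "a1", "title": "No force-push", "signal": "down"},        # same ID
    {"id": "b2", "title": "No force-push", "signal": "down"},        # same title+signal
    {"id": "c3", "title": "Keep tests", "signal": "up", "tags": ["ci"]},
]
merged, skipped = import_lessons(existing, bundle, "team-a/api", "2026-01-01T00:00:00Z")
print(len(merged), "lessons after import;", skipped, "skipped")
```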

**Use cases:**
- Share enforcement patterns across repos in the same org
- Onboard a new team with pre-built lessons from a mature project
- Export lessons before a project handoff so institutional knowledge transfers
- Feed lessons from multiple teams into a centralized DPO training pipeline

---

## DPO Export for Fine-Tuning (Pro + Team)

Every thumbs-up and thumbs-down becomes a training signal. ThumbGate Pro exports your captured feedback as DPO (Direct Preference Optimization) pairs, ready to feed into a LoRA fine-tune so your model stops repeating known mistakes at the weight level, not just the check level.

**Export DPO pairs:**

```bash
curl -X POST http://localhost:3456/v1/dpo/export \
-H "Authorization: Bearer $THUMBGATE_API_KEY" \
-o dpo-pairs.jsonl
```

**What you get:** JSONL where each line is a preference pair:
- `chosen`: the agent action you thumbed up
- `rejected`: the action you thumbed down for the same task context
- `prompt`: the originating user intent
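
Before feeding an export into a training pipeline, it can help to validate each line. This is a small sketch that assumes only the three keys listed above; the sample records are invented for illustration and any extra fields in a real export are passed through unchanged.

```python
import json

REQUIRED_KEYS = ("prompt", "chosen", "rejected")


def load_pairs(lines):
    """Parse JSONL preference pairs, raising ValueError on a malformed record."""
    pairs = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        missing = [key for key in REQUIRED_KEYS if key not in record]
        if missing:
            raise ValueError(f"line {number}: missing keys {missing}")
        pairs.append(record)
    return pairs


# Invented sample data standing in for the contents of dpo-pairs.jsonl.
sample_lines = [
    '{"prompt": "Deploy the fix", "chosen": "open a PR", "rejected": "git push --force"}',
    "",
    '{"prompt": "Clean the repo", "chosen": "git clean -n", "rejected": "rm -rf tests/"}',
]
pairs = load_pairs(sample_lines)
print(len(pairs), "preference pairs loaded")
```

With a real export, `load_pairs(open("dpo-pairs.jsonl"))` would apply the same checks line by line.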

**Use cases:**
- Fine-tune Llama 3 / Mistral / local models with a LoRA adapter trained on your real mistakes
- Feed into RLAIF or KTO pipelines (KTO export also available via `/v1/kto/export`)
- Build a model that natively avoids your team's known failure patterns, with no check needed at inference time

**Why this matters:** Checks block mistakes. Fine-tuning prevents them from being attempted. Combine both for belt-and-suspenders governance.

---

## Tech Stack

| Layer | Technology |
|-------|-----------|
| **Storage** | SQLite + FTS5, LanceDB vectors, JSONL logs |
| **Capture** | 5/day (25 total) on Free; unlimited on Pro/Team |
| **Intelligence** | MemAlign dual recall, Thompson Sampling |
| **Enforcement** | PreToolUse hook engine, Checks config |
| **Interfaces** | MCP stdio, HTTP API, CLI (Node.js >=18) |
| **Billing** | Stripe |
| **Execution** | Railway, Cloudflare Workers, Docker Sandboxes |
| **Governance** | Workflow Sentinel, control plane, Docker Sandboxes |

Every Changeset is tied to the exact `main` merge commit and generates Verification Evidence for Release Confidence.

---

**Popular buyer questions:** **[AI search topical presence](https://thumbgate.ai/guides/ai-search-topical-presence?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Relational knowledge and AI recommendations](https://thumbgate.ai/guides/relational-knowledge-ai-recommendations?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Background agent governance](https://thumbgate.ai/guides/background-agent-governance?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[GPT-5.5 model evaluation](https://thumbgate.ai/guides/gpt-5-5-model-evaluation?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Stop repeated AI agent mistakes](https://thumbgate.ai/guides/stop-repeated-ai-agent-mistakes?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Browser automation safety](https://thumbgate.ai/guides/browser-automation-safety?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Native messaging host security](https://thumbgate.ai/guides/native-messaging-host-security?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Autoresearch agent safety](https://thumbgate.ai/guides/autoresearch-agent-safety?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Cursor guardrails](https://thumbgate.ai/guides/cursor-agent-guardrails?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Codex CLI guardrails](https://thumbgate.ai/guides/codex-cli-guardrails?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Gemini CLI memory + enforcement](https://thumbgate.ai/guides/gemini-cli-feedback-memory?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Google Cloud MCP guardrails](https://thumbgate.ai/guides/gcp-mcp-guardrails?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[Roo Code alternative: migrate to Cline](https://thumbgate.ai/guides/roo-code-alternative-cline?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)**

**Conversational ad / AI-search answer assets:** **[AI Mode ads for agent governance](https://thumbgate.ai/guides/ai-mode-ads-agent-governance?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[MCP tool governance](https://thumbgate.ai/guides/mcp-tool-governance?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)** · **[AI agent pre-action approval gates](https://thumbgate.ai/guides/ai-agent-pre-action-approval-gates?utm_source=github&utm_medium=readme&utm_campaign=buyer_questions)**

**[Workflow Hardening Sprint](https://thumbgate.ai/?utm_source=github&utm_medium=readme&utm_campaign=top_cta#workflow-sprint-intake)** · **[Live Dashboard](https://thumbgate.ai/dashboard?utm_source=github&utm_medium=readme&utm_campaign=top_cta)**

---

## Integrations

- **[Open ThumbGate GPT](https://thumbgate.ai/go/gpt?utm_source=github&utm_medium=readme&utm_campaign=readme_gpt)**: start here. Paste agent actions and get advice plus checkpointing. You do not have to keep chatting inside the ThumbGate GPT to use ThumbGate; the hard enforcement layer still runs where the work happens.
- **[Claude Desktop Extension](https://github.com/IgorGanapolsky/ThumbGate/releases/latest/download/thumbgate-claude-desktop.mcpb)**: One-click install for Claude Desktop
- **[Codex Plugin](https://thumbgate.ai/codex-plugin)**: Auto-updating standalone bundle and install page for Codex CLI
- **[VS Code / Open VSX Extension](plugins/vscode-extension/README.md)**: Marketplace-ready MCP provider and `.vscode/mcp.json` fallback for VS Code-compatible IDEs
- **[Antigravity-compatible VSIX](plugins/antigravity-extension/INSTALL.md)**: Open VSX/direct VSIX install path while Antigravity-specific marketplace support is still unproven
- **[JetBrains Plugin Scaffold](plugins/jetbrains-plugin/README.md)**: IntelliJ/PyCharm Marketplace path for the same `thumbgate@latest` runtime
- **[Perplexity Command Center](docs/PERPLEXITY_MAX_COMMAND_CENTER.md)**: AI-search visibility + lead discovery
- **[ThumbGate Bench](docs/THUMBGATE_BENCH.md)**: Reliability benchmark and ProgramBench-style cleanroom proof lane
- **[Manus AI Skill](skills/thumbgate/SKILL.md)**: ThumbGate integration for Manus AI agents

---

## Feedback Sessions

Give the agent more context when a thumbs-down isn't enough:

```
👎 thumbs down
  └─► open_feedback_session
        └─► "you lied about deployment" (append_feedback_context)
        └─► "tests were actually failing" (append_feedback_context)
        └─► finalize_feedback_session
              └─► lesson inferred from full conversation
```

Free and self-hosted users can invoke `search_lessons` directly through MCP, and via the CLI with `npx thumbgate lessons`. History-aware feedback sessions give the agent full context for each lesson.
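
The session flow in the diagram can be modeled as a tiny state machine. The three function names mirror the tool names shown above, but this in-memory implementation is a hypothetical illustration: the real tools run over MCP, and the real lesson inference is not specified here, so the sketch simply returns the collected context.

```python
class FeedbackSession:
    """Hypothetical in-memory stand-in for a feedback session."""

    def __init__(self, signal):
        self.signal = signal
        self.context = []
        self.finalized = False


def open_feedback_session(signal="down"):
    """Start a session tied to one thumbs-up or thumbs-down."""
    return FeedbackSession(signal)


def append_feedback_context(session, text):
    """Attach one more sentence of context; not allowed after finalization."""
    if session.finalized:
        raise RuntimeError("session already finalized")
    session.context.append(text)


def finalize_feedback_session(session):
    """Close the session and return the material a lesson would be inferred from."""
    session.finalized = True
    return {"signal": session.signal, "context": list(session.context)}


session = open_feedback_session("down")
append_feedback_context(session, "you lied about deployment")
append_feedback_context(session, "tests were actually failing")
lesson_input = finalize_feedback_session(session)
print(lesson_input)
```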

---

## FAQ

**Is ThumbGate a model fine-tuning tool?**
No. ThumbGate does not update model weights. It captures feedback, stores lessons, injects context at runtime, and blocks bad actions before they execute.

**How is this different from CLAUDE.md or .cursorrules?**
Those are suggestions the agent can ignore. ThumbGate checks are enforced: they physically block the action before it runs. They also auto-generate from feedback instead of requiring manual writing.

**Does it work with my agent?**
If it supports MCP or pre-action hooks, yes. Claude Code, Claude Desktop, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode all work out of the box.

**Is it free?**
The free tier gives you 5 feedback captures/day, 25 total captures, and up to 3 active auto-promoted prevention rules. MCP integrations ship free for every agent.

Pro ($19/mo or $149/yr) removes the capture/rule caps and adds history-aware lesson recall, lesson search, hosted sync, and a personal dashboard. Team ($49/seat/mo) adds a shared hosted lesson DB, org dashboard, and shared enforcement.

---

## Docs

- [**ThumbGate for Federal Agencies**](docs/FEDERAL.md): pilot-ready posture, NIST 800-53 control mapping, OMB M-24-10 / EO 14110 alignment, ThumbGate-Core gov deployment mode, public/Core boundary invariants. Landing page: [thumbgate.ai/federal](https://thumbgate.ai/federal).
- [First Dollar Playbook](docs/FIRST_DOLLAR_PLAYBOOK.md): turning one painful workflow into the next booked pilot
- [Commercial Truth](docs/COMMERCIAL_TRUTH.md): pricing, claims, what we don't say
- [Goal Contracts](docs/GOAL_CONTRACTS.md): evidence-before-done contracts for multi-agent handoffs
- [Changeset Strategy](docs/CHANGESET_STRATEGY.md): release notes and version bump enforcement
- [Release Confidence](docs/RELEASE_CONFIDENCE.md): changesets, version checks, proof lanes
- [Verification Evidence](docs/VERIFICATION_EVIDENCE.md): proof artifacts
- [Claude Desktop Extension Guide](docs/CLAUDE_DESKTOP_EXTENSION.md)
- [Agent Workflow Contract](WORKFLOW.md): the agent-run contract for all ThumbGate operations
- [Ready for Agent Intake](https://github.com/IgorGanapolsky/ThumbGate/issues/new?template=ready-for-agent.yml): ready-for-agent intake template
- [SEO Guide: Claude Code Guardrails](docs/learn/claude-code-guardrails.md)
- [Unsupervised Learning Signals](docs/UL.md): silent-failure clustering (experimental, behind `THUMBGATE_SILENT_FAILURE_CLUSTERING=1`; only useful on workspaces with ≥ 50 tool calls/day)
- [ThumbGate-Core](https://github.com/IgorGanapolsky/ThumbGate-Core): private core for hosted overlays, ranking, policy synthesis, billing intelligence, and org/team workflows
- [mac-yolo-safeguards](https://github.com/IgorGanapolsky/mac-yolo-safeguards?utm_source=thumbgate&utm_medium=readme&utm_campaign=companion_kit): OS-level companion kit (macOS). ThumbGate stops the agent from billing you for repeated mistakes (token-layer governance). mac-yolo-safeguards stops the agent from freezing your Mac when it spawns runaway processes (OS-layer blast-radius containment). Same author, MIT, no telemetry.

---

## License

MIT. See [LICENSE](LICENSE).