https://github.com/asklar/agentsafe-demos

A reusable lab for small, recordable, open-source agentic-security demos: one tiny tool-call interception shim, governed-vs-ungoverned side by side.
https://github.com/asklar/agentsafe-demos

agent-governance agentic-security ai-agents ai-safety ai-security guardrails llm llm-security mcp prompt-injection python

Last synced: about 13 hours ago
JSON representation

A reusable lab for small, recordable, open-source agentic-security demos: one tiny tool-call interception shim, governed-vs-ungoverned side by side.

Host: GitHub
URL: https://github.com/asklar/agentsafe-demos
Owner: asklar
License: apache-2.0
Created: 2026-07-01T07:37:51.000Z (2 days ago)
Default Branch: main
Last Pushed: 2026-07-01T09:03:17.000Z (2 days ago)
Last Synced: 2026-07-01T11:05:12.069Z (2 days ago)
Topics: agent-governance, agentic-security, ai-agents, ai-safety, ai-security, guardrails, llm, llm-security, mcp, prompt-injection, python
Language: Python
Size: 45.9 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Notice: NOTICE

Awesome Lists containing this project

README

          # agentsafe-demos — a lab for agentic-security demos

[![CI](https://github.com/asklar/agentsafe-demos/actions/workflows/ci.yml/badge.svg)](https://github.com/asklar/agentsafe-demos/actions/workflows/ci.yml)

[![Python 3.10+](https://img.shields.io/badge/python-3.10%2B-blue.svg)](https://www.python.org/downloads/)

[![License: Apache-2.0](https://img.shields.io/badge/license-Apache--2.0-green.svg)](./LICENSE)

**A reusable foundation for building small, recordable, open-source demos about

agent governance.** Every demo is middleware over one tiny tool-call

interception shim, so the second demo costs a fraction of the first and each one

becomes shareable content with a single capture command.

> Autonomous agents are only as safe as the tools they can reach. This lab makes

> that concrete: it puts a governable seam between an agent and its tools, and

> shows — side by side — what happens with and without it.

## See it in 10 seconds

A support agent reads a booby-trapped ticket that tells it to mail your Stripe

key to an attacker. Same agent, same plan, run three ways — a *reasonable*

allow/deny policy still lets the secret out; a taint tripwire stops it at the

tool boundary and still lets the legitimate reply through:

```text

=== POLICY-ONLY agent (allow/deny gate) ===

  EXEC  read_file   — read credentials  <-- injected step

  EXEC  http_post   — POST creds off-box  <-- injected exfil

  -> 6 executed, 0 blocked; !!! SECRET EXFILTRATED

=== TRIPWIRE agent (policy + taint tripwire) ===

  EXEC  read_file   — read credentials  <-- injected step

  BLOCK http_post   — POST creds off-box  <-- injected exfil   [taint-tripwire]

  EXEC  http_post   — reply on the ticket (legit)

  -> 5 executed, 1 blocked; OK  secret contained

```

Prompt injection is a *data-flow* problem: track where untrusted data goes, or

lose. Run it yourself below — ~250 lines, zero dependencies, one command.

## Why this exists

Shipping one demo is easy; shipping a demo *every week that survives public

scrutiny* needs scaffolding. `agentlab` provides three things so the marginal

demo is cheap:

1. **A shared shim** ([`shim/`](./shim)) — the tool-call interception substrate

   every demo builds on. See [`shim/README.md`](./shim/README.md).

2. **A demo template** ([`demos/_template`](./demos/_template)) — drop-in

   scaffold with a runnable `demo.py`, `run.sh`, `SCRIPT.md`, and a smoke test.

3. **A capture pipeline** ([`capture/`](./capture)) — one command turns a demo

   into a text log / terminal recording / GIF.

## Quick start

```bash

# Run the reference demo (no dependencies beyond Python 3.10+)

./demos/00-hello-gate/run.sh

# Run the whole test suite (stdlib only)

python3 run_tests.py

# Scaffold a new demo

./new_demo.sh injection-tripwire "Injection Tripwire"

# Capture a demo as shareable content

capture/record.sh 00-hello-gate

```

## The shared shim in one screen

Each demo wires the same `Interceptor` with different middleware. A middleware

can veto a call before it runs (`before`) and/or observe it after (`after`):

```python

from shim import Interceptor, make_toolbox

from shim.middleware import PolicyGate, InjectionTripwire, FlightRecorder

ix = Interceptor(

    make_toolbox("/tmp/agent-workspace"),

    middleware=[PolicyGate(rules), InjectionTripwire(), FlightRecorder()],

)

ix.call("delete_file", path="production.db")   # -> Blocked, tool never runs

```

That's the whole idea: **the three roadmap concepts are three middleware, not

three frameworks.**

| Roadmap concept | Middleware | Status |

| --- | --- | --- |

| A. Guardrail Proxy (policy gate) | `PolicyGate` | flagship demo (SKL-3) |

| B. Injection Tripwire (taint) | `InjectionTripwire` | [demo #2](./demos/01-injection-tripwire) (SKL-6) ✅ |

| C. Agent Flight Recorder (ledger) | `FlightRecorder` + `AuditSink` | demo #3 |

The reference demo [`demos/00-hello-gate`](./demos/00-hello-gate) composes all

three at once to prove the substrate generalizes;

[`demos/01-injection-tripwire`](./demos/01-injection-tripwire) is the focused

Concept B demo — it shows an allow/deny policy is *not enough* to stop a

prompt-injection exfil, and a taint tripwire is.

> **Standalone primitive.** Concept A also has its own focused home — the

> **guardrail** repo — the single-idea policy-gate primitive with its own README

> and quick start, for readers who just want that one control. This lab is where

> the three concepts compose on a shared shim and become repeatable content.

## Repo layout

```

agentlab/

├── shim/                 # shared tool-call interception substrate + tests

│   ├── interceptor.py    #   Interceptor + Middleware chain + Blocked

│   ├── middleware.py     #   PolicyGate / InjectionTripwire / FlightRecorder

│   ├── audit.py          #   hash-chained JSONL audit sink

│   └── toolbox.py        #   shared mock tools

├── demos/

│   ├── _template/        # drop-in scaffold for a new demo

│   ├── 00-hello-gate/    # reference demo that exercises the whole substrate

│   └── 01-injection-tripwire/  # Concept B: taint tracking beats allow/deny policy

├── capture/              # one-command demo-capture pipeline

├── new_demo.sh           # scaffold a new demo

├── run_tests.py          # dependency-free test runner

├── HOWTO.md              # how to add a new demo (the short version)

├── CONTRIBUTING.md

└── LICENSE               # Apache-2.0

├── NOTICE                # Apache-2.0 NOTICE

```

## Adding a demo

See [`HOWTO.md`](./HOWTO.md). The short version: `./new_demo.sh `, edit the

plan and middleware, run it, capture it.

## License

Apache-2.0 licensed — see [`LICENSE`](./LICENSE) and [`NOTICE`](./NOTICE).

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/asklar/agentsafe-demos

Awesome Lists containing this project

README