https://github.com/kosli-dev/agentic-sdlc-demo

Reference implementation of an agentic SDLC with AI-driven controls for regulated financial services. Demonstrates the control points needed when AI writes production code.

Host: GitHub
URL: https://github.com/kosli-dev/agentic-sdlc-demo
Owner: kosli-dev
Created: 2026-03-24T19:26:45.000Z (3 months ago)
Default Branch: main
Last Pushed: 2026-03-26T18:55:19.000Z (3 months ago)
Last Synced: 2026-03-26T20:07:03.992Z (3 months ago)
Language: Python
Size: 516 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 13
Metadata Files:
- Readme: README.md
- Codeowners: .github/CODEOWNERS

# Agentic SDLC Demo

A reference implementation of an **agentic SDLC** — what happens when AI agents
write production code and you need to prove that proper controls were in place.

The application (a payment processing service) exists to give the agents
something realistic to work on. The interesting part is the workflow: how
changes flow from ticket to production, what evidence is collected at each
step, and how that evidence is evaluated.

All control evidence is attested to [Kosli](https://www.kosli.com) for
governance, audit, and compliance.

## How It Works

A change flows through two Kosli flows:

```
GitHub Issue
     │
     ▼
┌─────────────────────────────────────────────────────┐
│  Code Review Flow (per issue, multiple trails)      │
│                                                     │
│  Loop 1 ──▶ Loop 2 ──▶ ... ──▶ Final                │
│                                                     │
│  Each loop:                                         │
│    1. Pre-review CI gate (lint, tests, Docker)      │
│    2. Ticket integrity check (locked, untampered)   │
│    3. Injection scan (regex, before AI sees diff)   │
│    4. Change classification + persona selection     │
│    5. 10 AI review agents (5 personas × 2 models)   │
│    6. Finding dedup + cross-model severity          │
│    7. Moderator debate + resolution                 │
│    8. Resolver: triage findings, commit fixes       │
│    9. Cost tracking                                 │
│                                                     │
│  97 attestations per trail                          │
└─────────────────────────────────────────────────────┘
     │
     │  PR merged
     ▼
┌─────────────────────────────────────────────────────┐
│  Build Flow (per issue, one trail per PR)           │
│                                                     │
│  1. Source directory fingerprinted (artifact)       │
│  2. Docker build + health check                     │
│  3. 10 Rego policy control gates                    │
│     (evaluate evidence from Code Review flow)       │
│                                                     │
│  All controls must pass for trail to be COMPLIANT   │
└─────────────────────────────────────────────────────┘
```

## Build Flow Controls

These are the control gates that run at build time. Each one uses
`kosli evaluate trails` with a Rego policy to check evidence that was
collected during the code review flow. The Rego policy source, evaluation
result, and violations are embedded in each attestation's `user_data` for
full auditability.

| Control | What It Checks | Why It Matters |
|---|---|---|
| **code-review-control** | All review trails compliant, Final trail exists | Proves a substantive multi-model review happened |
| **ticket-integrity-control** | `ticket-integrity` attestation exists and is compliant in every loop | Proves the ticket was locked and untampered throughout |
| **lint-control** | `pre-review-lint` passed every loop | Proves code met style/quality standards before review |
| **unit-test-control** | `pre-review-unit-tests` passed every loop | Proves unit tests passed before review |
| **integration-test-control** | `pre-review-integration-tests` passed every loop | Proves integration tests passed before review |
| **cost-control** | `loop-cost` within budget in every loop | Proves AI compute stayed within budget |
| **artifact-integrity-control** | Source dir SHA256 from review matches CI build | Proves "code reviewed = code built" |
| **injection-scan-control** | `diff-injection-scan` ran, 0 candidates, 0 payloads | Proves diff was scanned for prompt injection before AI agents saw it |
| **resolver-completeness-control** | `resolver-threads-resolved` shows all findings accounted for, 0 open threads | Proves the resolver didn't drop or fabricate findings |
| **review-quality-control** | All 5 orchestration slots present and compliant (classification, severity, dedup, debate, resolution) | Proves the review process was substantive, not a rubber stamp |
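
Several of the controls above share the same shape: a named attestation must exist and be compliant in every review loop. The real checks are Rego policies; the Python below is only a minimal sketch of that logic for illustration. The trail layout (a dict of slot name to compliance flag), the `loop-1`/`loop-2` trail names, and the function name `check_slot_in_every_loop` are assumptions, not the actual Kosli trail schema.

```python
from typing import Dict, List

# Hypothetical shape: each review-loop trail maps an attestation slot name
# to its compliance status. The real Kosli trail JSON is richer than this.
Trail = Dict[str, bool]


def check_slot_in_every_loop(trails: Dict[str, Trail], slot: str) -> List[str]:
    """Return human-readable violations; an empty list means the check passes."""
    if not trails:
        # No evidence at all is a failure, not a pass.
        return [f"no review trails found, cannot verify '{slot}'"]
    violations = []
    for name, attestations in sorted(trails.items()):
        if slot not in attestations:
            # A missing attestation is reported explicitly.
            violations.append(f"{name}: '{slot}' attestation is missing")
        elif not attestations[slot]:
            violations.append(f"{name}: '{slot}' is non-compliant")
    return violations


# Illustrative data only.
trails = {
    "loop-1": {"pre-review-lint": True, "ticket-integrity": True},
    "loop-2": {"pre-review-lint": False},
}
lint_violations = check_slot_in_every_loop(trails, "pre-review-lint")
ticket_violations = check_slot_in_every_loop(trails, "ticket-integrity")
print(lint_violations)
print(ticket_violations)
```

Treating a missing slot as a violation is the point of the sketch: a loop that never produced the evidence should fail the control rather than be skipped.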

### How controls are evaluated

Each control follows the same pattern:

1. `kosli evaluate trails` runs the Rego policy against all trails in the code review flow
2. The policy returns `allow: true/false` and a list of `violations` (human-readable strings)
3. The result is attested via `kosli attest generic --compliant=$ALLOWED` with the Rego source, evaluation result, and violations as `--user-data`
4. If any control is non-compliant, the build trail is non-compliant

Rego policies live in `kosli/policies/`. Every policy has a corresponding
`_test.rego` file with OPA test cases.
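
The aggregation in steps 2 to 4 can be sketched in a few lines of Python. This is an illustration under assumptions, not the repository's CI script: it takes the `allow` and `violations` fields described above as a plain dict, and the helper names `normalize` and `build_trail_compliant` are hypothetical. It also treats `violations: null` as an empty list.

```python
from typing import Any, Dict, List, Optional


def normalize(result: Dict[str, Any]) -> Dict[str, Any]:
    """Coerce one policy result into {'allow': bool, 'violations': list}."""
    violations: Optional[List[str]] = result.get("violations")
    return {
        # Anything other than an explicit true counts as a failure.
        "allow": result.get("allow") is True,
        # Treat a null/absent violations field as an empty list.
        "violations": violations if violations is not None else [],
    }


def build_trail_compliant(results: Dict[str, Dict[str, Any]]) -> bool:
    """The build trail is compliant only if every control allowed."""
    return bool(results) and all(normalize(r)["allow"] for r in results.values())


# Illustrative results keyed by control name.
results = {
    "lint-control": {"allow": True, "violations": None},
    "cost-control": {"allow": False, "violations": ["loop-2: over budget"]},
}
print(normalize(results["lint-control"]))
print(build_trail_compliant(results))
```

Requiring an explicit `allow: true` keeps the default on the failing side, so a malformed or empty result cannot mark the trail compliant.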

### Strengths

- **Separation of evidence and evaluation** — the code review flow produces
evidence (attestations with structured payloads). The build flow evaluates
that evidence with Rego policies. The two concerns are decoupled.
- **Policy-as-code** — controls are Rego files in version control, not
configuration in a UI. They're testable, reviewable, and auditable.
- **Evidence embedded in attestations** — each control attestation includes
the Rego policy source, the evaluation result, and the violations. An auditor
can see exactly what was checked and why it passed or failed.
- **Deterministic artifact identity** — source directory fingerprinting
(`--artifact-type dir`) produces the same SHA256 regardless of which machine
computes it. This lets us prove "code reviewed = code built".
- **No silent passes** — controls use `kosli attest generic` with an explicit
`--compliant` flag. There's no path where a control silently passes because
a jq rule is missing.
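
To make the idea of a deterministic directory fingerprint concrete, here is a minimal Python sketch. It is not the algorithm the Kosli CLI uses for `--artifact-type dir`; it only shows one way to get a machine-independent digest, by hashing relative paths and file contents in sorted order. The function name `fingerprint_dir` is hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint_dir(root: str) -> str:
    """SHA256 over sorted relative paths and the contents of each file."""
    base = Path(root)
    # Sort POSIX-style relative paths so traversal order never matters.
    rel_paths = sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
    digest = hashlib.sha256()
    for rel in rel_paths:
        digest.update(rel.encode("utf-8"))
        digest.update(b"\0")  # separator between path and content hash
        digest.update(hashlib.sha256((base / rel).read_bytes()).digest())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    # Same files, written in a different order in two locations.
    for name in ("a.py", "b.py"):
        Path(d1, name).write_text(name)
    for name in ("b.py", "a.py"):
        Path(d2, name).write_text(name)
    same = fingerprint_dir(d1) == fingerprint_dir(d2)
    # Changing one file must change the fingerprint.
    Path(d2, "a.py").write_text("tampered")
    changed = fingerprint_dir(d1) != fingerprint_dir(d2)

print(same, changed)
```

Because absolute paths, timestamps, and creation order never enter the digest, two checkouts of the same source produce the same value.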

### Known Limitations

- **`kosli-dev/setup-cli-action@v2`** is still on Node.js 20 (no v3 release).
The deprecation warning will persist until Kosli ships a Node 24 build.
- **Rego policy source in `user_data`** renders as raw JSON with `\n` and `\t`
escape characters in the Kosli UI. The data is correct but hard to read
inline. This is a Kosli UI limitation — multiline strings aren't rendered
as formatted text. Attachments via `--attachments` are an alternative
(stored in Evidence Vault, downloadable) but not visible inline.
- **No external configuration for Rego policies** — `kosli evaluate` doesn't
support `--data` for external values. Budget limits and expected counts are
hardcoded in the Rego files.
- **`kosli evaluate` output returns `violations: null`** instead of
`violations: []` when there are no violations. Handled with `// []` in jq.
- **Custom attestation types can't be used for control gates** —
`kosli attest custom` has no `--compliant` flag, and custom types without
jq evaluator rules are silently always compliant. This means control gates
must use `generic` type, which has less structured rendering in the UI.

## Code Review Flow Evidence

Each review loop trail contains 97 attestation slots:

| Category | Slots | Count |
|---|---|---|
| Pre-review CI | lint, unit tests, integration tests, Docker build | 4 |
| Ticket integrity | locked, snapshot, verified | 1 |
| Injection scan | regex-based, pre-review | 1 |
| Orchestration | classification, prior context, dedup, severity, debate, resolution | 6 |
| Agent reviews | 8 steps × 5 personas × 2 models (bound to source artifact) | 80 |
| Resolver | threads fetched, triaged, fixes committed, threads resolved | 4 |
| Cost tracking | per-loop AI compute budget | 1 |
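
As a quick sanity check, the counts in the table add up to the 97 slots per trail. The dictionary keys below are informal labels for the table rows, not identifiers from the flow template.

```python
# Slot counts taken from the table above.
slots = {
    "pre_review_ci": 4,
    "ticket_integrity": 1,
    "injection_scan": 1,
    "orchestration": 6,
    "agent_reviews": 8 * 5 * 2,  # steps x personas x models
    "resolver": 4,
    "cost_tracking": 1,
}
total = sum(slots.values())
print(total)
```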

### Review personas (5)
- **Security & Compliance** — financial logic, decimal precision, PII, audit trails
- **Architecture & Patterns** — layer violations, dependency direction, `.standards/` compliance
- **Reliability & Infrastructure** — Docker, CI, health checks, deployment config
- **Test Quality** — assertion quality, edge cases, coverage theatre
- **API Surface** — breaking changes, validation, error consistency

### Review models (2)
- **Claude** (Anthropic)
- **Google Gemini** (gemini-3-flash-preview)

Each persona runs on both models independently. Cross-model consensus identifies
high-confidence findings (both models agree) and model-specific findings.
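
The consensus split can be illustrated with plain set operations. This is a sketch under a strong assumption: that each finding has already been reduced to a comparable key (here a hypothetical `file:rule` string). The real pipeline has a dedicated dedup step, so its matching is likely more involved. The name `split_by_consensus` is also hypothetical.

```python
from typing import Dict, Set


def split_by_consensus(by_model: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Separate findings every model reported from model-specific ones."""
    consensus = set.intersection(*by_model.values()) if by_model else set()
    result = {"consensus": consensus}
    for model, keys in by_model.items():
        # Whatever a model reported beyond the shared set is model-specific.
        result[model] = keys - consensus
    return result


# Illustrative finding keys only.
findings = {
    "claude": {"fees.py:float-money", "api.py:missing-validation"},
    "gemini": {"fees.py:float-money", "Dockerfile:root-user"},
}
split = split_by_consensus(findings)
print(split["consensus"])
```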

## Security

See [SECURITY.md](SECURITY.md) for the full security architecture, contributor
guidelines, and vulnerability reporting process.

## The Application

A **Payment Transaction Processing Service** built with FastAPI. Accounts,
transactions, fraud detection, fee calculation — representative of regulated
financial services.

```bash
# Run tests
PYTHONPATH=src pytest tests/unit -m unit -v
PYTHONPATH=src pytest tests/integration -m integration -v

# Run locally
PYTHONPATH=src uvicorn payments.app:app --reload --port 8080

# Docker
docker compose up --build
```

## Project Layout

```
.github/workflows/
├── agentic-code.yml        # Coding agent → review → merge
├── ci.yml                  # Build flow: Docker + 10 Rego control gates
├── pipeline-tests.yml      # Test suite for pipeline infrastructure + security lint
├── pr-loop.yml             # Review loop orchestration
├── pr-resolve.yml          # Finding resolver
├── pr-review.yml           # Review agent dispatch
├── kosli-setup-flows.yml   # Flow template management
└── kosli-setup-types.yml   # Attestation type management

kosli/
├── flows/
│   ├── build-template.yml          # Build flow: 1 trail + 11 artifact attestations
│   └── code-review-template.yml    # Review flow: 97 attestations per trail
├── policies/                       # Rego policies + OPA test files
│   ├── code-review-control.rego
│   ├── ticket-integrity-control.rego
│   ├── lint-control.rego
│   ├── unit-test-control.rego
│   ├── integration-test-control.rego
│   ├── cost-control.rego
│   ├── artifact-integrity-control.rego
│   ├── injection-scan-control.rego
│   ├── resolver-completeness-control.rego
│   └── review-quality-control.rego
└── attestation-types/              # Custom type schemas + jq evaluator rules

scripts/
├── coding/     # Coding agent infrastructure
├── review/     # Review pipeline (orchestrator, agents, resolver)
└── ci/         # Build pipeline scripts

src/payments/   # The application
tests/          # Unit + integration tests
.standards/     # Architecture, Python, Docker, testing, CI standards
```

## Links

- [Kosli](https://www.kosli.com) — change management and compliance platform
- [SECURITY.md](SECURITY.md) — security architecture, contributor guidelines, vulnerability reporting
- [The controls that nobody wrote down...](https://www.linkedin.com/pulse/controls-nobody-wrote-down-alex-kantor-o2zte/) — the article that started this
