https://github.com/zer0contextlost/graft


# GRAFT

Detect foreign code grafted into your supply chain.

GRAFT is a CI tool that trains a byte-level compression model on your repo's
own source history, then scores every PR-changed file for byte patterns that
deviate from the learned baseline. No signature database. No rule engine.
No labeled attack examples.

---

## How it works

A lossless compression model assigns short codes to sequences it has learned
to predict and long codes to sequences it has not. GRAFT trains an autoregressive
next-byte predictor on your repo's benign-only source files, then measures
**bits-per-byte (BPB)** on each changed file at PR time:

- **Low BPB** — model is not surprised — byte patterns are consistent with your codebase
- **High BPB** — model is surprised — byte patterns deviate from your codebase
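
To make the BPB measure concrete, here is a minimal sketch that swaps GRAFT's neural predictor for a toy order-1 byte-frequency model with add-one smoothing. The function names and the toy model are assumptions for illustration only; what carries over is the definition of bits-per-byte as the average negative log2 probability the model assigns to each byte.

```python
import math
from collections import defaultdict

def train_order1(corpus: bytes):
    """Count byte -> next-byte transitions (toy stand-in for a learned predictor)."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1
    return counts

def bits_per_byte(counts, data: bytes) -> float:
    """Average -log2 p(next byte | previous byte), with add-one smoothing."""
    if len(data) < 2:
        return 0.0
    total_bits = 0.0
    for prev, nxt in zip(data, data[1:]):
        row = counts.get(prev, {})
        # Smoothing keeps unseen transitions at a small but non-zero probability.
        p = (row.get(nxt, 0) + 1) / (sum(row.values()) + 256)
        total_bits -= math.log2(p)
    return total_bits / (len(data) - 1)

# Hypothetical "repo history": one small function repeated many times.
benign = b"def add(a, b):\n    return a + b\n" * 200
model = train_order1(benign)

familiar = bits_per_byte(model, b"def add(a, b):\n    return a + b\n")
foreign = bits_per_byte(model, b"aW1wb3J0IG9zO29zLnN5c3RlbSgnaWQnKQ==")
print(f"familiar={familiar:.2f} BPB, foreign={foreign:.2f} BPB")
```

The base64 string is built almost entirely from transitions the toy model never saw, so it costs far more bits per byte than the familiar snippet. A real next-byte predictor conditions on much longer context, but the scoring principle is the same.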

Supply chain attacks graft foreign content into legitimate files: obfuscated
eval chains, base64-encoded payloads, binary blobs stuffed into comments,
minified implants in build scripts. These all look radically different from
normal source code at the byte level. GRAFT catches them because they score
high BPB against a model trained on nothing but your own code.

The underlying architecture is ByteFlow Net (Deng et al., ICLR 2026), implemented
in [SUBSTRATE](https://github.com/zer0contextlost/substrate). GRAFT adapts it as
a per-file anomaly scorer rather than a language model.

---

## Detection results

Trained on ~46 KB of Python and YAML source (3000 steps, ~3 min on `ubuntu-latest`):

| Payload type | Max BPB | Sigma above baseline | Verdict |
|---|---|---|---|
| Normal Python source | 2.31 | +2.6 | MEDIUM |
| Normal YAML workflow | 1.97 | +1.8 | CLEAN |
| Base64 blob embedded in .py | 9.64 | **+17.9** | CRITICAL |
| Eval obfuscation chain | 11.89 | **+22.6** | CRITICAL |
| Null-byte stuffed config | 13.33 | **+25.6** | CRITICAL |
| Minified / obfuscated JS | 9.04 | **+16.6** | CRITICAL |

Baseline: mean=1.09 BPB, std=0.48, threshold=2.52 (3σ).
All four attack types score roughly 17–26σ above the benign baseline, far past the 3σ threshold and with no overlap with the benign files.

---

## Quick start

### GitHub Actions

Add to `.github/workflows/graft.yml` in any repo you want to protect:

```yaml
name: GRAFT Supply Chain Scan

on:
  pull_request:
    types: [opened, synchronize, reopened]

permissions:
  pull-requests: write
  contents: read

jobs:
  graft-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - uses: zer0contextlost/graft@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
```

The first PR after setup trains the baseline model (~3 min) and caches it.
Subsequent PRs reuse the cached model, which is retrained automatically each week.

### Gitea Actions

Copy `ci/gitea-workflow.yml` to `.gitea/workflows/graft.yml`.
Requires Gitea 1.21+ with Actions enabled.

### GitLab CI

Copy `ci/gitlab-ci.yml` into `.gitlab-ci.yml` (or include from it).
Set `GITLAB_TOKEN` as a CI/CD variable with `api` scope.

### Generic CI (Jenkins, Drone, Woodpecker, Forgejo, pre-push hook)

```bash
export BASE_SHA="$(git merge-base origin/main HEAD)"
export HEAD_SHA="$(git rev-parse HEAD)"
bash ci/generic.sh
```
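
For readers wiring this into their own tooling, the sketch below shows one way the changed-file list between `BASE_SHA` and `HEAD_SHA` could be derived. It is not taken from `ci/generic.sh`; the function names are hypothetical, and only the `git diff` flags are standard git.

```python
import os
import subprocess

def diff_command(base_sha: str, head_sha: str) -> list[str]:
    # --diff-filter=d excludes deleted files, which have no content to score.
    return ["git", "diff", "--name-only", "--diff-filter=d", base_sha, head_sha]

def parse_name_only(output: str) -> list[str]:
    # One path per line; drop blank lines.
    return [line for line in output.splitlines() if line.strip()]

def changed_files(base_sha: str, head_sha: str) -> list[str]:
    result = subprocess.run(
        diff_command(base_sha, head_sha),
        check=True, capture_output=True, text=True,
    )
    return parse_name_only(result.stdout)

if __name__ == "__main__":
    base, head = os.environ.get("BASE_SHA"), os.environ.get("HEAD_SHA")
    if base and head:  # only touch git when both SHAs are provided
        for path in changed_files(base, head):
            print(path)
```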

To post comments on a Gitea-compatible forge (Forgejo, Codeberg, etc.):

```bash
export PR_NUMBER=42
export REPO="owner/myrepo"
export API_TOKEN="your-token"
export API_URL="https://forgejo.example.com/api/v1/repos/{repo}/issues/{pr}/comments"
bash ci/generic.sh
```
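
The `{repo}` and `{pr}` placeholders in `API_URL` are filled in from `REPO` and `PR_NUMBER`. As a rough illustration of what a comment request against a Gitea-style API looks like, the sketch below expands the template and builds (but does not send) the POST. How `ci/generic.sh` actually performs this step is not shown here, so `build_comment_request` is a hypothetical helper; the `token` authorization scheme and the `{"body": ...}` payload follow the Gitea issue-comment API.

```python
import json
import urllib.request

def build_comment_request(api_url_template, repo, pr_number, token, body):
    # Expand the two placeholders used in the API_URL example.
    url = api_url_template.replace("{repo}", repo).replace("{pr}", str(pr_number))
    payload = json.dumps({"body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )

req = build_comment_request(
    "https://forgejo.example.com/api/v1/repos/{repo}/issues/{pr}/comments",
    "owner/myrepo", 42, "your-token", "## GRAFT Supply Chain Scan",
)
print(req.get_method(), req.full_url)
# Sending is left out on purpose; urllib.request.urlopen(req) would perform the POST.
```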

---

## Configuration

| Input | Default | Description |
|---|---|---|
| `github-token` | required | `secrets.GITHUB_TOKEN` — for posting PR comments |
| `threshold-sigma` | `3.0` | Flag files with BPB > `mean + sigma * std`. Lower = more sensitive |
| `train-steps` | `3000` | Training steps when building baseline. 3000 is sufficient for most repos |
| `max-files` | `50` | Max changed files to scan per PR |
| `fail-on-anomaly` | `false` | Set to `true` to fail the check when HIGH/CRITICAL files are found |

Severity tiers:

| Tier | Condition |
|---|---|
| CRITICAL | BPB > baseline + 5σ |
| HIGH | BPB > baseline + 3σ |
| MEDIUM | BPB > baseline + 2σ |
| CLEAN | within 2σ |
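
The tier table can be read as a simple function of how many standard deviations a score sits above the baseline mean. The sketch below is an illustration under that reading, not GRAFT's source; the function names are hypothetical, and the inputs are the rounded baseline (mean=1.09, std=0.48) and Max BPB values from the detection results table.

```python
def sigma_above(bpb: float, mean: float, std: float) -> float:
    """Distance of a score from the baseline mean, in standard deviations."""
    return (bpb - mean) / std

def severity(bpb: float, mean: float, std: float) -> str:
    """Map a BPB score to the tiers in the table above."""
    z = sigma_above(bpb, mean, std)
    if z > 5:
        return "CRITICAL"
    if z > 3:
        return "HIGH"
    if z > 2:
        return "MEDIUM"
    return "CLEAN"

MEAN, STD = 1.09, 0.48  # rounded baseline from the detection results section
for label, bpb in [
    ("Normal YAML workflow", 1.97),
    ("Normal Python source", 2.31),
    ("Base64 blob embedded in .py", 9.64),
]:
    print(f"{label}: {sigma_above(bpb, MEAN, STD):+.1f} sigma -> {severity(bpb, MEAN, STD)}")
```

Because the baseline is rounded here, the printed sigma values can differ in the last digit from the ones GRAFT reports.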

---

## What GRAFT scans

Changed files with these extensions are scored:

**Source code** — `.py` `.js` `.ts` `.mjs` `.jsx` `.tsx` `.go` `.rs` `.java` `.kt` `.scala` `.c` `.cpp` `.h` `.rb` `.php` `.lua` `.swift`

**Build / CI / infra** — `.sh` `.bash` `.ps1` `.yml` `.yaml` `.mk` `.cmake` `.tf` `.hcl` `.dockerfile` and named files `Makefile` `Dockerfile` `Gemfile` `Pipfile`

**Package manifests and lockfiles** — `.toml` `.lock` `.json` `.xml` `.gradle` `.gemspec` `requirements.txt`

Binary files, files under 64 bytes, and files over 512 KB are skipped automatically.
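
The selection rules above amount to an extension or filename match followed by size and binary checks. The sketch below is one possible rendering of those rules, not GRAFT's code. In particular, the binary test (a file that does not decode as UTF-8) is an assumption, as is treating `requirements.txt` as a named file rather than an extension.

```python
import tempfile
from pathlib import Path

SCANNED_EXTENSIONS = {
    ".py", ".js", ".ts", ".mjs", ".jsx", ".tsx", ".go", ".rs", ".java", ".kt",
    ".scala", ".c", ".cpp", ".h", ".rb", ".php", ".lua", ".swift",
    ".sh", ".bash", ".ps1", ".yml", ".yaml", ".mk", ".cmake", ".tf", ".hcl",
    ".dockerfile", ".toml", ".lock", ".json", ".xml", ".gradle", ".gemspec",
}
NAMED_FILES = {"Makefile", "Dockerfile", "Gemfile", "Pipfile", "requirements.txt"}
MIN_BYTES = 64
MAX_BYTES = 512 * 1024

def looks_binary(data: bytes) -> bool:
    # Assumption: anything that is not valid UTF-8 counts as binary.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False

def should_scan(path: Path) -> bool:
    if path.name not in NAMED_FILES and path.suffix.lower() not in SCANNED_EXTENSIONS:
        return False
    size = path.stat().st_size
    if size < MIN_BYTES or size > MAX_BYTES:
        return False
    return not looks_binary(path.read_bytes())

# Small demonstration on throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "app.py": b"print('hello')\n" * 10,      # scanned
        "tiny.py": b"x = 1\n",                   # under 64 bytes
        "Makefile": b"all:\n\techo build\n" * 10,  # named file, scanned
        "logo.png": b"\x89PNG" + b"\x00" * 100,  # extension not listed
        "blob.json": b"\xff\xfe" * 64,           # not valid UTF-8
    }
    results = {}
    for name, content in samples.items():
        file_path = Path(tmp) / name
        file_path.write_bytes(content)
        results[name] = should_scan(file_path)
print(results)
```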

---

## PR comment format

GRAFT posts a comment on each scanned PR with a ranked table of findings:

```
## GRAFT Supply Chain Scan

Baseline: mean=1.086 std=0.478 threshold=2.520 BPB (3.0s) | 8 file(s) scanned

3 file(s) flagged:

| Severity | File | Max BPB | Mean BPB | Sigma | Windows |
|----------|------|---------|---------|-------|---------|
| CRITICAL | src/utils.py | 9.641 | 8.804 | +17.9 | 4 |
| HIGH | setup.py | 3.812 | 3.104 | +5.7 | 2 |
| MEDIUM | Makefile | 2.683 | 2.401 | +3.3 | 1 |

5 file(s) within normal range.
```

---

## How the baseline is built

1. `git ls-files` enumerates all source files tracked at HEAD
2. Binary files and files under 64 bytes are excluded
3. Files are packed into a binary corpus (SUBSTRATE format)
4. A MICRO model (~3M params, T=512, K=64) trains for `train-steps` steps on CPU
5. A random 20% holdout calibrates `baseline_mean` and `baseline_std`
6. The checkpoint is cached with a weekly key; retrained weekly automatically

The model learns your repo's byte distribution — not generic source code patterns.
A repo that mixes Python, Rust, and YAML will have a baseline that reflects all three.
A repo that's all Go will have a Go-specific baseline.
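
Step 5 above can be sketched as follows: hold out a random 20% of the corpus, score it with the trained model, and take the mean and standard deviation of those scores as the baseline. This is a sketch under assumptions. Whether the holdout is drawn over files or over windows, and whether a population or sample standard deviation is used, is not stated here; the function names and the example scores are made up for illustration.

```python
import random
import statistics

def split_holdout(items, holdout_fraction=0.2, seed=0):
    """Shuffle and split into (train, holdout)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_holdout = max(1, round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

def calibrate(holdout_bpb, threshold_sigma=3.0):
    """Derive baseline statistics and the flagging threshold from holdout scores."""
    mean = statistics.fmean(holdout_bpb)
    std = statistics.pstdev(holdout_bpb)  # population std; an assumption
    return {
        "baseline_mean": mean,
        "baseline_std": std,
        "threshold": mean + threshold_sigma * std,
    }

files = [f"file_{i}.py" for i in range(10)]  # hypothetical corpus
train, holdout = split_holdout(files)
print(len(train), "training files,", len(holdout), "holdout files")

# Hypothetical BPB scores for held-out samples, as a trained model might produce.
baseline = calibrate([0.9, 1.1, 1.3])
print(baseline)
```

With the figures reported in the detection results (mean=1.09, std=0.48), the same formula gives a 3σ threshold of about 2.5 BPB.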

---

## Limitations

**False positives:** generated files (protobuf output, minified vendored JS, binary test
fixtures) will score high even when legitimate. Exclude such paths with a `.graftignore`,
or raise `threshold-sigma`. Note that `max-files` only caps how many changed files are
scanned per PR; it does not exclude specific paths.

**False negatives:** attacks that carefully mimic the repo's existing code style at the
byte level — e.g., a backdoor that uses the same variable naming, whitespace conventions,
and import patterns as the surrounding file — may score within the normal range. GRAFT
is a surprise-based detector, not a semantic one.

**Cold start:** the first PR after setup trains the model. For repos with fewer than
~4 KB of source, training may not produce a meaningful baseline.

**Threshold is per-repo:** the default 3σ threshold is calibrated against the training
corpus holdout. Repos with highly heterogeneous file types (mixing minified CSS with Go
source, for instance) may see elevated false positive rates until the model has enough
training data to learn both distributions.

---

## Architecture

GRAFT uses the SUBSTRATE framework, which implements ByteFlow Net
(Deng et al., ICLR 2026 — [arXiv:2603.03583](https://arxiv.org/abs/2603.03583))
as an anomaly scoring engine. The MICRO config used here:

| Parameter | Value |
|---|---|
| Params | ~3.2M |
| Window (T) | 512 bytes |
| Chunk positions (K) | 64 |
| Local dim | 128 |
| Global dim | 256 |
| Layers | 1 / 2 / 1 (enc / global / dec) |
| Training | CPU, AdamW, cosine LR, ~3 min |

Full architecture documentation: [SUBSTRATE](https://github.com/zer0contextlost/substrate)

---

*zer0contextlost — 2026*