{"id":44073196,"url":"https://github.com/cyntrisec/ephemeralml","last_synced_at":"2026-02-18T13:01:02.226Z","repository":{"id":335882655,"uuid":"1147310305","full_name":"cyntrisec/EphemeralML","owner":"cyntrisec","description":"Confidential AI inference with cryptographic proof of ephemeral execution. Loads models inside TEEs, returns embeddings + signed Attested Execution Receipts.","archived":false,"fork":false,"pushed_at":"2026-02-16T01:58:53.000Z","size":1733,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-16T08:50:40.218Z","etag":null,"topics":["aws","confidential-computing","encryption","hpke","machine-learning","nitro-enclaves","rust","tee"],"latest_commit_sha":null,"homepage":"https://ephemeralml.cyntrisec.com","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cyntrisec.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-02-01T14:52:24.000Z","updated_at":"2026-02-16T01:58:57.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/cyntrisec/EphemeralML","commit_stats":null,"previous_names":["cyntrisec/ephemeralml"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/cyntrisec/EphemeralML","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyntrisec%2FEphemeralML","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyntrisec%2FEphemeralML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyntrisec%2FEphemeralML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyntrisec%2FEphemeralML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cyntrisec","download_url":"https://codeload.github.com/cyntrisec/EphemeralML/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cyntrisec%2FEphemeralML/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29580625,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T08:38:15.585Z","status":"ssl_error","status_checked_at":"2026-02-18T08:38:14.917Z","response_time":162,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","confidential-computing","encryption","hpke","machine-learning","nitro-enclaves","rust","tee"],"created_at":"2026-02-08T05:45:51.960Z","updated_at":"2026-02-18T13:01:02.220Z","avatar_url":"https://github.com/cyntrisec.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"```\n ▄████▄    ███████╗██████╗ ██╗  ██╗███████╗███╗   ███╗███████╗██████╗  █████╗ ██╗     ███╗   ███╗██╗\n██▀██▀██   ██╔════╝██╔══██╗██║  ██║██╔════╝████╗ ████║██╔════╝██╔══██╗██╔══██╗██║     ████╗ ████║██║\n██ ██ ██   █████╗  ██████╔╝███████║█████╗  ██╔████╔██║█████╗  ██████╔╝███████║██║     ██╔████╔██║██║\n████████   ██╔══╝  ██╔═══╝ ██╔══██║██╔══╝  ██║╚██╔╝██║██╔══╝  ██╔══██╗██╔══██║██║     ██║╚██╔╝██║██║\n██▄██▄██   ███████╗██║     ██║  ██║███████╗██║ ╚═╝ ██║███████╗██║  ██║██║  ██║███████╗██║ ╚═╝ ██║███████╗\n ▀ ▀▀ ▀    ╚══════╝╚═╝     ╚═╝  ╚═╝╚══════╝╚═╝     ╚═╝╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝╚══════╝╚═╝     ╚═╝╚══════╝\n```\n\n[![CI](https://github.com/cyntrisec/EphemeralML/actions/workflows/ci.yml/badge.svg)](https://github.com/cyntrisec/EphemeralML/actions/workflows/ci.yml)\n[![Status](https://img.shields.io/badge/Status-v3.1%20GPU%20Confidential-brightgreen?style=for-the-badge)](https://github.com/cyntrisec/EphemeralML/releases/tag/v3.1.0)\n[![Tests](https://img.shields.io/badge/Tests-105%20Passing-success?style=for-the-badge)](https://github.com/cyntrisec/EphemeralML/actions/workflows/ci.yml)\n[![Platform](https://img.shields.io/badge/Platform-AWS%20Nitro%20|%20GCP%20TDX%20|%20GPU%20H100-orange?style=for-the-badge\u0026logo=amazon-aws)](https://aws.amazon.com/ec2/nitro/nitro-enclaves/)\n[![Language](https://img.shields.io/badge/Language-Rust-b7410e?style=for-the-badge\u0026logo=rust\u0026logoColor=white)](https://www.rust-lang.org/)\n[![License](https://img.shields.io/badge/Apache%202.0-blue?style=for-the-badge)](LICENSE)\n\n# EphemeralML\n\n**Confidential AI inference with hardware-backed attestation — multi-cloud**\n\n\u003e Run AI models where prompts and weights stay encrypted — even if the host is compromised. Deploys on AWS Nitro Enclaves, GCP Confidential Space (Intel TDX), and GPU TEEs (NVIDIA H100 CC-mode).\n\n---\n\n## Why EphemeralML?\n\n| Problem | Solution |\n|---------|----------|\n| Cloud hosts can see your data | **TEE isolation** — data decrypted only inside the enclave |\n| \"Trust me\" isn't enough | **Cryptographic attestation** — verify code before sending secrets |\n| No audit trail | **Execution receipts** — proof of what code processed your data |\n\n**Built for**: Defense, GovCloud, Finance, Healthcare — anywhere \"good enough\" security isn't.\n\n---\n\n## Architecture\n\n### AWS Nitro Enclaves\n\n```\n                        ┌──────────────────────────────────────────┐\n                        │           Pipeline Orchestrator           │\n┌─────────┐  HPKE      │  ┌─────────┐  SecureChannel  ┌────────┐ │\n│  Client │◄───────────►│  │  Host   │◄──────────────►│Enclave │ │\n└─────────┘  encrypted  │  │ (blind  │   attestation-  │Stage 0 │ │\n                        │  │  relay) │   bound AEAD    └────────┘ │\n                        │  └─────────┘                            │\n                        └──────────────────────────────────────────┘\n                               │                          │ NSM\n                               │ S3                       ▼\n                        ┌──────┴──────┐            ┌───────────────┐\n                        │  Encrypted  │            │    AWS KMS    │\n                        │   Models    │            │ (key release) │\n                        └─────────────┘            └───────────────┘\n```\n\n### GCP Confidential Space (Intel TDX)\n\n```\n┌─────────┐  TDX-attested   ┌─────────────────────────────────────────┐\n│  Client │◄────────────────►│  GCP Confidential Space CVM (TDX)      │\n└─────────┘  SecureChannel   │  ┌───────────────────────────────────┐  │\n                             │  │  EphemeralML Container             │  │\n                             │  │  - TDX attestation (configfs-tsm)  │  │\n                             │  │  - Inference + receipt signing      │  │\n                             │  │  - Direct HTTPS to GCS / Cloud KMS │  │\n                             │  └───────────────────────────────────┘  │\n                             └─────────────────────────────────────────┘\n                                     │                    │ TDX quote\n                                     │ GCS               ▼\n                              ┌──────┴──────┐     ┌──────────────────┐\n                              │  Encrypted  │     │ Cloud KMS (WIP)  │\n                              │   Models    │     │ (key release)    │\n                              └─────────────┘     └──────────────────┘\n```\n\n### GCP Confidential Space — GPU (a3-highgpu-1g + H100 CC)\n\n```\n┌─────────┐  TDX-attested   ┌──────────────────────────────────────────────┐\n│  Client │◄────────────────►│  GCP Confidential Space CVM (TDX + H100 CC) │\n└─────────┘  SecureChannel   │  ┌────────────────────────────────────────┐  │\n                             │  │  EphemeralML Container (CUDA 12.2)     │  │\n                             │  │  - TDX attestation (configfs-tsm)      │  │\n                             │  │  - GGUF model loaded from GCS          │  │\n                             │  │  - GPU inference (candle-cuda, H100)   │  │\n                             │  │  - Receipt signing (Ed25519)           │  │\n                             │  └────────────────────────────────────────┘  │\n                             └──────────────────────────────────────────────┘\n                                     │                    │ TDX quote\n                                     │ GCS               ▼\n                              ┌──────┴──────┐     ┌──────────────────┐\n                              │  GGUF Model │     │ Cloud KMS (WIP)  │\n                              │  (≤16 GB)   │     │ (key release)    │\n                              └─────────────┘     └──────────────────┘\n```\n\n**Key insight**: Host never has keys. On AWS, it just forwards ciphertext. On GCP, the entire CVM is the trust boundary — no host/enclave split, no VSock. GPU deployments use NVIDIA H100 in CC-mode (attestation confirms `nvidia_gpu.cc_mode: ON`). The pipeline layer (`confidential-ml-pipeline`) orchestrates multi-stage inference with per-stage attestation.\n\n---\n\n## Security Model\n\n### What's Protected\n- ✅ **Model weights** (IP protection)\n- ✅ **Prompts \u0026 outputs** (PII / classified data)\n- ✅ **Execution integrity** (verified code)\n\n### How\n1. **Attestation-gated key release** — KMS releases DEK only if enclave measurements match policy (PCRs on Nitro, MRTD/RTMRs on TDX)\n2. **HPKE encrypted sessions** — end-to-end encryption, host sees only ciphertext\n3. **Ed25519 signed receipts** — cryptographic proof of execution\n4. **Cross-platform transport** — `confidential-ml-transport` handles attestation-bound channels on both VSock (Nitro) and TCP (TDX)\n\n### Threat Model\n- ✓ Compromised host OS → **Protected** (enclave isolation)\n- ✓ Malicious cloud admin → **Protected** (can't decrypt)\n- ✓ Supply chain attack → **Detected** (PCR verification)\n- ✓ Model swap attack → **Prevented** (signed manifests)\n\n---\n\n## Features\n\n### Core (Production Ready)\n- **AWS Nitro Enclave integration** with real NSM attestation and PCR-bound KMS key release\n- **GCP Confidential Space integration** with Intel TDX attestation, MRTD/RTMR measurement pinning, and Cloud KMS key release (`GcpKmsClient` implemented, not yet wired into runtime model-loading path)\n- **Pipeline orchestration** via `confidential-ml-pipeline` — multi-stage inference with per-stage attestation, health checks, and graceful shutdown\n- **Cross-platform transport** via `confidential-ml-transport` — attestation-bound SecureChannel with pluggable TCP/VSock backends\n- **S3 model storage** (AWS) and **GCS model storage** (GCP) with client-side encryption\n\n### Inference Engine\n- **Candle-based** transformer inference (MiniLM, BERT, Llama)\n- **GGUF support** for quantized models (int4, int8) — used for GPU inference (Llama 3 8B Q4_K_M)\n- **CUDA 12.2 GPU inference** via candle-cuda on NVIDIA H100 CC-mode (a3-highgpu-1g)\n- **BF16/safetensors** format enforcement (CPU path)\n- Memory-optimized for TEE constraints\n\n### Security \u0026 Compliance\n- **Attested Execution Receipts** (AER) — Ed25519-signed, CBOR-canonical, binding input/output hashes to enclave attestation\n- **Policy update system** with signature verification and hot-reload\n- **Model format validation** (safetensors, dtype enforcement)\n- **105 tests** across 4 workspace crates (including pipeline integration and GCP tests)\n- **Deterministic builds** for reproducibility\n\n---\n\n## Performance\n\nMeasured on AWS EC2 m6i.xlarge (4 vCPU, 16GB RAM) with MiniLM-L6-v2 (22.7M params), 3 independent runs of 100 iterations each. Commit `b00bab1`. Paper (\\S7) uses canonical release-gate data from commit `057a85a`. Raw JSON available in [GitHub Releases](https://github.com/cyntrisec/EphemeralML/releases).\n\n### Inference Overhead\n\n| Metric | Bare Metal | Nitro Enclave | Overhead |\n|--------|-----------|---------------|----------|\n| Mean latency | 78.55ms | 88.45ms | **+12.6%** |\n| P95 latency | 79.09ms | 89.58ms | +13.3% |\n| Throughput | 12.73 inf/s | 11.31 inf/s | -11.2% |\n\n### Cold Start Breakdown\n\n| Stage | Time |\n|-------|------|\n| NSM Attestation | 88ms |\n| KMS Key Release | 76ms |\n| Model Fetch (S3→VSock) | 6,716ms |\n| Model Decrypt + Load | 139ms |\n| **Total** | **7,052ms** |\n\n### Security Primitives\n\n| Operation | Latency | Frequency |\n|-----------|---------|-----------|\n| COSE attestation verification | 3.012ms | Once per session |\n| HPKE session setup | 0.10ms | Once per session |\n| HPKE encrypt + decrypt (1KB) | 0.006ms | Per inference |\n| Receipt sign (CBOR + Ed25519) | 0.022ms | Per inference |\n| **Total per-inference crypto** | **0.028ms** | Per inference |\n\n### E2E Encrypted Request Overhead\n\n| Component | Latency |\n|-----------|---------|\n| Per-request crypto (encrypt+decrypt+receipt) | 0.164ms |\n| Session setup (keygen+HPKE) | 0.138ms |\n| TCP handshake (ClientHello→ServerHello→HPKE) | 0.153ms |\n\n### Concurrency Scaling (bare metal, m6i.xlarge)\n\n| Threads | Throughput | Mean Latency | Scaling Efficiency |\n|---------|-----------|-------------|-------------------|\n| 1 | 12.75 inf/s | 78ms | 100% |\n| 2 | 14.73 inf/s | 136ms | 57.8% |\n| 4 | 14.66 inf/s | 270ms | 28.8% |\n| 8 | 14.57 inf/s | 546ms | 14.3% |\n\n### Cost Analysis (m6i.xlarge @ $0.192/hr)\n\n| Metric | Bare Metal | Enclave |\n|--------|-----------|---------|\n| Cost per 1M inferences | $4.19 | $4.72 |\n| Enclave cost multiplier | — | 1.13x |\n\n### Key Findings\n\n- **~12.6% inference overhead** — on par with AMD SEV-SNP BERT numbers (~16%), competitive with SGX/TDX\n- **Latest 3-model campaign (2026-02-05)** — weighted mean overhead **+12.9%** (MiniLM-L6 +14.0%, MiniLM-L12 +12.9%, BERT-base +11.9%)\n- **Embedding quality preserved** — near-identical embeddings (cosine similarity ≈ 1.0; tiny FP-level differences expected across CPU allocations)\n- **Per-inference crypto cost negligible** — 0.028ms vs 88ms inference (0.03%)\n- **E2E crypto overhead** — 0.164ms per request (0.19% of inference time)\n- **Throughput plateaus at ~14.7 inf/s** — CPU-bound on 2 vCPUs; latency scales linearly with concurrency\n- **$4.72 per 1M inferences** in enclave (1.13x bare metal cost)\n- **First published per-inference latency benchmark on AWS Nitro Enclaves**\n\n### GPU Performance (GCP Confidential Space, H100 CC-mode)\n\nMeasured on GCP a3-highgpu-1g (1x NVIDIA H100, TDX CC-mode ON) with Llama 3 8B Q4_K_M GGUF (4.6GB fetched from GCS at runtime).\n\n| Metric | Value |\n|--------|-------|\n| Model | Llama 3 8B Q4_K_M (GGUF, 4.6GB) |\n| Machine | a3-highgpu-1g (1x H100, TDX) |\n| Boot to ready | ~3.5 min |\n| 50 tokens generated | 12s (241ms/token) |\n| Attestation | TDX quote, `nvidia_gpu.cc_mode: ON` |\n| Receipt | Ed25519-signed, CBOR-canonical |\n\n**Critical**: GCP Confidential Space GPU uses cos-gpu-installer v2.5.3, which installs driver 535.247.01. This driver supports CUDA \u003c= 12.2 only. Using CUDA 12.6+ fails with `CUDA_ERROR_UNSUPPORTED_PTX_VERSION`. The `Dockerfile.gpu` must use `nvidia/cuda:12.2.2-devel-ubuntu22.04` as the base image.\n\nSee [`docs/benchmarks.md`](docs/benchmarks.md) for methodology, competitive analysis, and literature comparison.\n\n### KMS Attestation Audit Results\n\nVerified on real Nitro hardware (m6i.xlarge, Feb 2026) using a KMS key with `kms:RecipientAttestation:ImageSha384` condition and key-policy-only evaluation (no root account statement, no IAM bypass path).\n\n**Debug vs non-debug mode:** Enclaves launched with `--debug-mode` have all PCR values zeroed in their attestation documents. PCR-conditioned KMS policies cannot match in debug mode — the condition compares the policy's PCR0 hash against all-zeros, which never matches. Production (non-debug) enclaves carry real PCR values derived from the EIF contents.\n\n**PCR0 enforcement evidence (non-debug mode):**\n\n| Scenario | Result |\n|----------|--------|\n| Correct PCR0, valid attestation | Success (key released) |\n| Wrong PCR0, valid attestation | `AccessDeniedException` |\n| No attestation (recipient absent) | `AccessDeniedException` |\n| Malformed attestation (random bytes) | `ValidationException` |\n| Bit-flipped attestation (1 byte changed) | `ValidationException` |\n\nCloudTrail confirms non-zero `attestationDocumentEnclaveImageDigest` for successful calls and no recipient data for denied calls.\n\n**Replay semantics:** KMS accepts replayed attestation documents — resubmitting a previously successful attestation doc produces another successful key release. KMS validates the COSE_Sign1 signature and PCR values but does not enforce freshness (no nonce binding or timestamp check on the attestation document itself).\n\n### Final Benchmark Release Gate (KMS-Enforced)\n\nUse the single-command gate on your Nitro EC2 instance:\n\n```bash\n./scripts/final_release_gate.sh --runs 3 --model-id minilm-l6\n```\n\nThis chains:\n1. `scripts/run_final_kms_validation.sh` with `--require-kms`\n2. `scripts/check_kms_integrity.sh` against produced `run_*` directories\n3. Final manifest + summary output\n\nFor ad-hoc auditing of existing result directories:\n\n```bash\n./scripts/check_kms_integrity.sh benchmark_results_final/kms_validation_*/run_*\n```\n\n### Publish Public Artifact (Reader-Friendly)\n\nTo publish benchmark evidence without requiring reader AWS access:\n\n```bash\n# 1) Package + scan for sensitive markers\n./scripts/prepare_public_artifact.sh \\\n  --input-dir benchmark_results_final/kms_validation_20260205_234917 \\\n  --name kms_validation_20260205_234917.tar.gz\n\n# 2) Upload to a GitHub Release tag\n./scripts/publish_public_artifact.sh \\\n  --tag v1.0.0 \\\n  --artifact artifacts/public/kms_validation_20260205_234917.tar.gz\n```\n\nSee [`docs/ARTIFACT_PUBLICATION.md`](docs/ARTIFACT_PUBLICATION.md) for full details.\n\n---\n\n## Quick Start\n\n### Local Demo (Mock Mode)\n\nRun a working end-to-end demo locally — loads MiniLM-L6-v2, sends text, gets 384-dim embeddings + a signed Attested Execution Receipt:\n\n```bash\nbash scripts/demo.sh\n```\n\nOr manually:\n\n```bash\n# Terminal 1: Start enclave with model\ncargo run --release --features mock --bin ephemeral-ml-enclave -- \\\n    --model-dir test_assets/minilm --model-id stage-0\n\n# Terminal 2: Run host inference\ncargo run --release --features mock --bin ephemeral-ml-host\n```\n\n### Production (AWS Nitro Enclaves)\n\nPrerequisites: AWS account with Nitro Enclave support, Rust 1.75+, Terraform.\n\n```bash\n# 1. Provision infrastructure\ncd infra/hello-enclave\nterraform init \u0026\u0026 terraform apply\n\n# 2. Build enclave image\ndocker build -f enclave/Dockerfile.enclave -t ephemeral-ml-enclave .\nnitro-cli build-enclave --docker-uri ephemeral-ml-enclave:latest --output-file enclave.eif\n\n# 3. Run\nnitro-cli run-enclave --eif-path enclave.eif --cpu-count 2 --memory 4096\n```\n\n### Production (GCP Confidential Space — CPU)\n\nPrerequisites: GCP project with Confidential Computing API enabled, c3-standard-4 (TDX), Rust 1.75+.\n\n```bash\n# Build for GCP (no mock, no default features)\ncargo build --release --no-default-features --features gcp -p ephemeral-ml-enclave\n\n# Run on CVM (--gcp flag required to enter GCP code path)\n./target/release/ephemeral-ml-enclave \\\n    --gcp --model-dir /app/model --model-id stage-0\n```\n\n### Production (GCP Confidential Space — GPU)\n\nPrerequisites: GCP project with a3-highgpu-1g quota, NVIDIA H100 CC-mode. Requires CUDA 12.2 (not 12.6+).\n\n```bash\n# Build GPU container (CUDA 12.2 base — required for CS driver 535.x)\ndocker build -f Dockerfile.gpu -t ephemeral-ml-gpu .\n\n# Deploy to Confidential Space with GPU\nbash scripts/gcp/deploy.sh --gpu \\\n    --model-source gcs \\\n    --model-format gguf\n```\n\nExpected boot timeline: ~3.5 min (image pull + cos-gpu-installer + model fetch from GCS). Llama 3 8B Q4_K_M generates 50 tokens in 12s.\n\nSee [`QUICKSTART.md`](QUICKSTART.md) and [`docs/build-matrix.md`](docs/build-matrix.md) for detailed instructions.\n\n---\n\n## Project Status\n\n| Component | Status | Tests |\n|-----------|--------|-------|\n| Pipeline Orchestrator | ✅ Production | 10 |\n| Stage Executor | ✅ Production | 1 |\n| NSM Attestation (AWS) | ✅ Production | 11 |\n| TDX Attestation (GCP) | ✅ Production | — |\n| KMS Integration (AWS) | ✅ Production | — |\n| GCP KMS / WIP | ⚠ Code exists, not wired into runtime | — |\n| Inference Engine (Candle) | ✅ Production | 4 |\n| Receipt Signing (Ed25519) | ✅ Production | 6 |\n| Common / Types | ✅ Production | 42 |\n| Host / Client | ✅ Production | 4 |\n| Degradation Policies | ✅ Production | 3 |\n| GCS Model Loader | ✅ Implemented | — |\n| GPU Inference (H100 CC, CUDA 12.2) | ✅ Verified on hardware | — |\n| TDX Verifier Bridge (Client) | ✅ Implemented | — |\n\n**v3.1 GPU Confidential** — GPU inference on GCP Confidential Space (a3-highgpu-1g, NVIDIA H100 CC-mode) with Llama 3 8B Q4_K_M GGUF, CUDA 12.2, TDX attestation, and Ed25519-signed receipts. GCS loader supports up to 16GB models with Content-Length pre-check. 105 tests passing.\n\n---\n\n## Documentation\n\n- [`docs/design.md`](docs/design.md) — Architecture \u0026 threat model\n- [`docs/build-matrix.md`](docs/build-matrix.md) — Deployment modes, feature flags \u0026 build commands (AWS, GCP, mock)\n- [`docs/benchmarks.md`](docs/benchmarks.md) — Benchmark methodology, results \u0026 competitive analysis\n- [`docs/BENCHMARK_SPEC.md`](docs/BENCHMARK_SPEC.md) — Benchmark specification (11-paper literature review)\n- [`QUICKSTART.md`](QUICKSTART.md) — Deployment guide\n- [`SECURITY_DEMO.md`](SECURITY_DEMO.md) — Security walkthrough\n- [`scripts/run_final_kms_validation.sh`](scripts/run_final_kms_validation.sh) — Multi-run KMS-enforced benchmark validation\n- [`scripts/check_kms_integrity.sh`](scripts/check_kms_integrity.sh) — Post-run KMS/commit/hardware integrity audit\n- [`scripts/final_release_gate.sh`](scripts/final_release_gate.sh) — Single-command release gate for benchmark artifacts\n\n---\n\n## License\n\nApache 2.0 — see [LICENSE](LICENSE)\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n**Run inference like the host is already hacked.**\n\n[Documentation](docs/) • [Benchmarks](docs/benchmarks.md) • [Issues](https://github.com/cyntrisec/EphemeralML/issues)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyntrisec%2Fephemeralml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcyntrisec%2Fephemeralml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcyntrisec%2Fephemeralml/lists"}