https://github.com/kamb-code/sha256-r19-preimage
Oracle-free preimage attack on 19-round reduced SHA-256 — paper, solver, and independent verifier
- Host: GitHub
- URL: https://github.com/kamb-code/sha256-r19-preimage
- Owner: kamb-code
- Created: 2026-06-16T09:39:57.000Z (14 days ago)
- Default Branch: main
- Last Pushed: 2026-06-18T19:36:27.000Z (12 days ago)
- Last Synced: 2026-06-18T20:19:45.325Z (12 days ago)
- Topics: cryptanalysis, cryptography, cuda, gpu, hash-functions, preimage-attack, security-research, sha256
- Language: Python
- Size: 1.45 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Oracle-Free 19-Round SHA-256 Preimage — Reproducibility Package
This folder contains everything needed to verify and reproduce the results in:
> **"Oracle-Free Preimage Attack on 19-Round Reduced SHA-256"**
> `paper_r19_final.pdf` / `paper_r19_final.tex`
---
## What is claimed
An oracle-free preimage attack on the **19-round reduced SHA-256 compression function**
initialized with the standard IV. Given only a 32-byte target hash, the solver
finds sixteen 32-bit message words W[0..15], with no constraint on their form, such that:
```
SHA256_compress_19(IV, W[0..15]) + IV == target_hash
```
No padding constraint is imposed on W[0..15], and `+` denotes word-wise addition
modulo 2³² (the feed-forward of the IV). This is a claim about the compression
function, not standard padded SHA-256.
Three independently verified preimages are included (`verified_preimages.txt`).
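For readers who want to see what the equation above computes, here is a minimal sketch of a reduced-round compression function in pure Python. It assumes that "19-round" means running the first 19 rounds of the standard round function (standard constants and message schedule) followed by the usual feed-forward addition of the IV, and that the 64-character hash is the eight output words concatenated in big-endian hex. The names `compress` and `compress_hex` are illustrative and are not taken from `code/verify_r19.py`, which remains the reference. The constants are derived from prime roots, as in the SHA-256 specification, instead of being typed in by hand.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF

def _primes(n):
    """First n primes by trial division (n is tiny here)."""
    found, c = [], 2
    while len(found) < n:
        if all(c % p for p in found):
            found.append(c)
        c += 1
    return found

def _icbrt(n):
    """Floor of the integer cube root, by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

_P = _primes(64)
# Fractional bits of square roots (IV) and cube roots (K) of the first primes.
IV = [math.isqrt(p << 64) & MASK for p in _P[:8]]
K = [_icbrt(p << 96) & MASK for p in _P]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def compress(iv, w, rounds=19):
    """Run the first `rounds` rounds on 16 input words, then add the IV."""
    w = list(w)
    for t in range(16, rounds):  # extend the schedule only as far as needed
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = iv
    for t in range(rounds):
        big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
        ch = (e & f) ^ (~e & g)
        t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
        big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
        maj = (a & b) ^ (a & c) ^ (b & c)
        t2 = (big_s0 + maj) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip((a, b, c, d, e, f, g, h), iv)]

def compress_hex(w, rounds=19, iv=IV):
    return "".join(f"{x:08x}" for x in compress(iv, w, rounds))

# Sanity check: with all 64 rounds and a padded block this is ordinary SHA-256.
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
assert compress_hex(struct.unpack(">16I", block), rounds=64) == hashlib.sha256(b"abc").hexdigest()

# 19-round value for the words 0..15 (under the assumptions stated above).
print(compress_hex(list(range(16)), rounds=19))
```

The 64-round cross-check against `hashlib` only confirms that the round function and constants are the standard ones. Whether the 19-round output matches the repository's verifier depends on the conventions assumed above.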
---
## Quick verification (no GPU, no lookup table needed)
Requirements: Python 3.8+, no third-party packages.
```bash
# Verify P1
python3 code/verify_r19.py --rounds 19 \
--hash 1e65261c54255188604f5375091839733de63e966b5e4715658226bf03588447 \
--words "22f091af ec52d67b 74c33819 a280dc6a b001ff1a 1f2356a5 3eccf108 bd9a2333 \
abe611d1 6d1e5a20 8041df25 e43d31af aa895a2e 69106ad2 7479fa3a 2a9abb91"
# Verify P2
python3 code/verify_r19.py --rounds 19 \
--hash fb52f81baed24f8728faf5bbce82c67d510761172fb9876d9e3a72dda351b7ca \
--words "37e6702f bc20efea 2dd42a3e 501dfbe9 3cacc578 ea2de1c1 11c0f066 0f22be47 \
2a447d2d 13f0080f 1f33df6b d655d8e6 15730eaa 9bf64950 9f129973 5a964edf"
# Verify P3
python3 code/verify_r19.py --rounds 19 \
--hash 1bd7ebbdc4d938fb26d19b5dd5caf333de397bd1c745727bd5556baf38ccf977 \
--words "3ce8fba4 e2fb9661 44730c59 e1cf4bc0 e1a18d93 97658983 67efe2a7 ef260ecb \
d4c6dbe0 13e9388e 95664a59 4d9e248b 74137862 664815ac 89eae95a cd7dbef5"
```
All three should print `result: OK`.
The verifier (`code/verify_r19.py`) is self-contained and deliberately does not
import any attack code.
---
## Running the solver on your own target hash (requires NVIDIA GPU + CUDA)
Requirements: Python 3.10+, NumPy, tqdm, a CUDA-enabled PyTorch build, NVIDIA GPU
(H100 recommended). A CPU-only PyTorch install will not run the solver.
```bash
pip install numpy tqdm
# Install PyTorch using the CUDA wheel appropriate for your driver/CUDA setup.
# Attack a specific 19-round target hash of your choice:
python3 code/h100_extended.py --hash <64-hex-char-hash>
# Example (reproduces P1):
python3 code/h100_extended.py \
--hash 1e65261c54255188604f5375091839733de63e966b5e4715658226bf03588447
# Run with default random targets (benchmark / statistical mode):
python3 code/h100_extended.py
# Full options:
python3 code/h100_extended.py --help
```
The solver:
1. Builds the σ₀(u)−u representative table in GPU memory (~16 GB, ~1.8 s on H100).
2. Runs the backward chain on the target hash to fix the high state words.
3. Samples random contexts for the free state words a[4..10].
4. Sweeps a[0] over 2³² values per context using the C0/C1/C2 cancellation chain.
5. Prints any found preimage to stdout and saves it to a `.txt` file.
No precomputed table file is required — the table is rebuilt from scratch each run.
Time to the first hit on an H100 varies from run to run because the search is
randomized: expect anywhere from a few minutes to about 15 minutes.
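To make step 1 more concrete, the sketch below builds a scaled-down version of a σ₀(u)−u lookup: a dictionary from d = σ₀(u) − u (mod 2³²) to one u that produces it, over only 2¹⁶ values of u instead of all 2³². This is an illustration under assumptions, not the solver's code. The names `diff` and `build_table` and the dictionary layout are hypothetical, and how `h100_extended.py` picks representatives or lays out the table in GPU memory is not shown here. If the full table stored one 32-bit value per possible d, it would occupy 2³² × 4 bytes = 16 GiB, which is consistent with the figure quoted in step 1.

```python
MASK = 0xFFFFFFFF

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def sigma0(x):
    """SHA-256 small sigma0: ROTR7 ^ ROTR18 ^ SHR3."""
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def diff(u):
    """The quantity indexed by the table: sigma0(u) - u mod 2^32."""
    return (sigma0(u) - u) & MASK

def build_table(us):
    """Map each observed difference to the first u that produced it."""
    table = {}
    for u in us:
        table.setdefault(diff(u), u)
    return table

# Toy range only; the real table covers every 32-bit u.
table = build_table(range(1 << 16))

d = diff(0x1234)
u = table[d]
assert diff(u) == d  # the representative reproduces the difference
print(f"d = {d:08x} -> representative u = {u:08x}")
print(f"full-size table at 4 bytes per entry: {(1 << 32) * 4 // 2**30} GiB")
```

A dictionary is used here only for readability. A dense array indexed by d is the more natural layout on a GPU, but that choice is an assumption about the solver, not something stated in this README.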
**Generate your own target to attack:**
Choose any 16 input words and run the verifier without `--hash`; it will print
the corresponding 19-round target hash. Then pass that hash to the solver.
```bash
python3 code/verify_r19.py --rounds 19 \
--words "00000000 00000001 00000002 00000003 00000004 00000005 00000006 00000007 \
00000008 00000009 0000000a 0000000b 0000000c 0000000d 0000000e 0000000f"
```
---
## Does padding matter? Why W[0..15] are unconstrained here
Standard SHA-256 appends a specific padding to each message before hashing:
a `0x80` byte, then zeros, then the 8-byte message bit-length. For a message
shorter than 56 bytes, this all fits in one 512-bit block, so several of the
16 input words are fixed by the padding format.
**This attack does not use standard padding.** W[0..15] are treated as 16
arbitrary 32-bit words — no structure is required. This gives the solver the
maximum possible freedom to satisfy the 19-round equations.
**What this means in practice:**
| Scenario | Applies? |
|---|---|
| "Find W[0..15] s.t. 19-round compress(IV,W)+IV = T" (this paper) | ✅ Yes |
| "Find a padded message s.t. standard SHA-256(msg) = T" (real preimage) | ❌ Not directly |
To attack a padded message preimage you would additionally need to satisfy
the padding constraints: for a short single-block message, W[14] = 0,
W[15] = the message bit length, and a `0x80` byte directly after the message bytes.
That reduces the attacker's free words from 16 to roughly 13–14, and
propagates constraints into the schedule words (W[16], W[17], ...) that the
attack depends on. A padded-message variant would require extending the method
to handle those fixed values. This is not done here and remains an open problem.
**The bottom line:** a padded-message variant would start from the same
ingredients (backward chain + σ₀-differential table + C0/C1/C2 cancellations),
but accommodating the padding constraints requires additional work. The security
of full 64-round padded SHA-256 is not affected by this result.
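To see which input words standard padding pins down, the following sketch pads a short message into a single 512-bit block and prints the resulting 16 words. The helper name `pad_single_block` is illustrative and is not part of this repository.

```python
import struct

def pad_single_block(msg: bytes):
    """Return the 16 big-endian words of the one padded block for msg.

    Only valid for len(msg) <= 55, so that the 0x80 byte and the 8-byte
    length field fit in the same block as the message.
    """
    if len(msg) > 55:
        raise ValueError("message does not fit in one padded block")
    bit_length = (8 * len(msg)).to_bytes(8, "big")
    block = msg + b"\x80" + b"\x00" * (55 - len(msg)) + bit_length
    return list(struct.unpack(">16I", block))

words = pad_single_block(b"abc")
for i, w in enumerate(words):
    print(f"W[{i:2d}] = {w:08x}")
```

For `b"abc"` the message and the `0x80` byte share W[0], every word from W[1] to W[14] is zero, and W[15] holds the bit length 24. A solver that treats all 16 words as free, as this one does, will generally not produce blocks of this shape.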
---
## File listing
```
publish/
README.md — this file
paper_r19_final.pdf — compiled paper (21 pages)
paper_r19_final.tex — LaTeX source
verified_preimages.txt — the three verified preimage examples
code/ — core reproducibility files:
verify_r19.py — standalone verifier (no dependencies beyond stdlib)
h100_extended.py — production GPU solver (PyTorch + CUDA)
extended_solver.py — backward chain + W recovery utilities
sha256_core.py — SHA-256 full-trace reference implementation
utils.py — SHA-256 primitives (ROTR, Σ, σ, Ch, Maj, H0, K)
auxiliary research scripts (not needed to reproduce):
absorption_analysis.py — multi-block coordinate descent experiments
alt_differential.py — alternate differential experiments
angle_analysis.py — differential angle analysis
block1_coord.py — single-block coordinate descent
block2_coord.py — two-block coordinate descent
cuda_sweep.py — CUDA birthday sweep for near-collisions
deep_search.py — extended birthday search
differential_trace.py — differential trace logging
final_results.py — result aggregation
gpu_sa.py — GPU simulated annealing (experimental)
near_collision_result.py — near-collision result logging
schedule_differential.py — schedule differential analysis
sensitivity_matrix.py — sensitivity matrix computation
threeblock_coord.py — three-block coordinate descent
twobit_search.py — two-bit differential search
twoblock_sweep.py — two-block birthday sweep
zero_window_lemma.py — zero-window lemma verification
```
Note: the auxiliary scripts import from `/home/administrator/sha/sha256` (the original
development tree) and will not run on a fresh clone. They are included for
transparency only — all results claimed in the paper use the five core files above.
---
## Key algebraic identities (see paper §4–§5)
**Lemma 1 (W9 differential).**
Define Ŵ₉ = W₉_sched(a₁=0). Then Ŵ₉ − W₉_real = a₁.
**Proposition 1 (C0 cancellation).**
F₀ = W₁₆_bc − σ₁(W₁₄) − g(a₀) − Ŵ₉ − a₀ − C₀_const = σ₀(W₁) − W₁.
So a₁ cancels and the σ₀-differential table recovers W₁, hence a₁ = W₁ − g(a₀).
**Proposition 2 (C1/C2 cancellations).**
In C1 the target unknown a₂ cancels; in C2 the target unknown a₃ cancels.
Both reduce to σ₀-differential table lookups.
These three identities together make the full 2³² sweep tractable on a single GPU.
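The terms in Proposition 1 appear to come from the standard message-schedule equation for W₁₆, namely W₁₆ = σ₁(W₁₄) + W₉ + σ₀(W₁) + W₀ (mod 2³²). The short check below confirms only that rearrangement on random words. It is a sketch: it does not model a₀, a₁, g, Ŵ₉ or C₀_const, whose definitions are in the paper, and the function names are illustrative.

```python
import random

MASK = 0xFFFFFFFF

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def sigma0(x):
    return rotr(x, 7) ^ rotr(x, 18) ^ (x >> 3)

def sigma1(x):
    return rotr(x, 17) ^ rotr(x, 19) ^ (x >> 10)

def schedule_w16(w):
    """Standard SHA-256 schedule word W[16] from W[0..15]."""
    return (sigma1(w[14]) + w[9] + sigma0(w[1]) + w[0]) & MASK

rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(1000):
    w = [rng.getrandbits(32) for _ in range(16)]
    w16 = schedule_w16(w)
    # Move every known term to the left: what remains is sigma0(W1).
    residue = (w16 - sigma1(w[14]) - w[9] - w[0]) & MASK
    assert residue == sigma0(w[1])
    # Subtracting W1 as well leaves the sigma0(u) - u form used as a table index.
    assert (residue - w[1]) & MASK == (sigma0(w[1]) - w[1]) & MASK
print("schedule rearrangement holds on 1000 random inputs")
```

The point of the σ₀(W₁) − W₁ form is that it depends on W₁ alone, so a single lookup table indexed by that value can propose candidates for W₁.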
---
## Attack model disclaimer
This result concerns only the one-block reduced-round SHA-256 **compression function**
(19 of 64 rounds), initialized with the standard IV and with no padding constraint on
the 16 input words. It is **not** a preimage attack on standard padded SHA-256.
The oracle-free 20-round case remains open (see paper §7).