{"id":44656088,"url":"https://github.com/shrec/ultrafastsecp256k1","last_synced_at":"2026-06-04T01:00:50.304Z","repository":{"id":337883067,"uuid":"1148013745","full_name":"shrec/UltrafastSecp256k1","owner":"shrec","description":"Ultra high-performance secp256k1 ECC engine | Python, Node.js, Rust, Go, C#, Swift, Java bindings | CUDA, Metal, OpenCL GPU | ECDSA, Schnorr, FROST, MuSig2, BIP-352 | 15+ platforms","archived":false,"fork":false,"pushed_at":"2026-05-31T15:13:30.000Z","size":482478,"stargazers_count":39,"open_issues_count":1,"forks_count":16,"subscribers_count":2,"default_branch":"main","last_synced_at":"2026-05-31T15:23:43.498Z","etag":null,"topics":["android","arm64","bitcoin","constant-time","cryptography","cuda","ecdsa","embedded","ethereum","ffi","gpu-cryptography","ios","nodejs","opencl","python","riscv","rust","schnorr-signatures","secp256k1","webassembly"],"latest_commit_sha":null,"homepage":"https://shrec.github.io/UltrafastSecp256k1/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shrec.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":"audit/AUDIT_TEST_PLAN.md","citation":"CITATION.cff","codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":"docs/GOVERNANCE.md","roadmap":"docs/ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":".zenodo.json","notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["shrec"],"custom":["https://paypal.me/IChkheidze","https://stacker.news/shrec"]}},"created_at":"2026-02-02T13:29:44.000Z","updated_at":"2026-05-31T08:58:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"
528e899d-174a-4fa4-9fa2-27a2f521789e","html_url":"https://github.com/shrec/UltrafastSecp256k1","commit_stats":null,"previous_names":["shrec/ultrafastsecp256k1"],"tags_count":43,"template":false,"template_full_name":null,"purl":"pkg:github/shrec/UltrafastSecp256k1","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrec%2FUltrafastSecp256k1","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrec%2FUltrafastSecp256k1/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrec%2FUltrafastSecp256k1/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrec%2FUltrafastSecp256k1/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shrec","download_url":"https://codeload.github.com/shrec/UltrafastSecp256k1/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shrec%2FUltrafastSecp256k1/sbom","scorecard":{"id":1243930,"data":{"date":"2026-02-24T09:10:15Z","repo":{"name":"github.com/shrec/UltrafastSecp256k1","commit":"646cd844d1519bc828c22ac34bdab22d015ad83d"},"scorecard":{"version":"v5.3.0","commit":"c22063e786c11f9dd714d777a687ff7c4599b600"},"score":7.3,"checks":[{"name":"CI-Tests","score":-1,"reason":"no pull request found","details":null,"documentation":{"short":"Determines if the project runs tests before pull requests are merged.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#ci-tests"}},{"name":"Dependency-Update-Tool","score":10,"reason":"update tool detected","details":["Info: detected update tool: Dependabot: .github/dependabot.yml:1"],"documentation":{"short":"Determines if the project uses a dependency update tool.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#dependency-update-tool"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 
0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#code-review"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#binary-artifacts"}},{"name":"Maintained","score":0,"reason":"project was created within the last 90 days. Please review its contents carefully","details":["Warn: Repository was created within the last 90 days."],"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#maintained"}},{"name":"Security-Policy","score":10,"reason":"security policy file detected","details":["Info: security policy file detected: SECURITY.md:1","Info: Found linked content: SECURITY.md:1","Info: Found disclosure, vulnerability, and/or timelines in security policy: SECURITY.md:1","Info: Found text in security policy: SECURITY.md:1"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#security-policy"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":10,"reason":"GitHub workflow tokens follow principle of least privilege","details":["Warn: 
jobLevel 'contents' permission set to 'write': .github/workflows/benchmark.yml:28","Warn: jobLevel 'deployments' permission set to 'write': .github/workflows/benchmark.yml:29","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:24","Info: jobLevel 'actions' permission set to 'read': .github/workflows/codeql.yml:25","Warn: jobLevel 'contents' permission set to 'write': .github/workflows/release.yml:1517","Info: jobLevel 'actions' permission set to 'read': .github/workflows/scorecard.yml:20","Info: jobLevel 'contents' permission set to 'read': .github/workflows/scorecard.yml:21","Info: topLevel 'contents' permission set to 'read': .github/workflows/benchmark.yml:21","Info: topLevel 'contents' permission set to 'read': .github/workflows/bindings.yml:30","Info: topLevel 'contents' permission set to 'read': .github/workflows/ci.yml:10","Info: topLevel 'contents' permission set to 'read': .github/workflows/clang-tidy.yml:20","Info: topLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:16","Info: topLevel 'contents' permission set to 'read': .github/workflows/dependency-review.yml:13","Info: topLevel 'contents' permission set to 'read': .github/workflows/docs.yml:18","Info: topLevel 'contents' permission set to 'read': .github/workflows/nightly.yml:19","Info: topLevel 'contents' permission set to 'read': .github/workflows/packaging.yml:23","Info: topLevel permissions set to 'read-all': .github/workflows/release.yml:13","Info: topLevel permissions set to 'read-all': .github/workflows/scorecard.yml:11","Info: topLevel 'contents' permission set to 'read': .github/workflows/security-audit.yml:13","Info: topLevel 'contents' permission set to 'read': .github/workflows/sonarcloud.yml:15"],"documentation":{"short":"Determines if the project's workflows follow the principle of least 
privilege.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#token-permissions"}},{"name":"SAST","score":10,"reason":"SAST tool detected: CodeQL","details":["Info: SAST configuration detected: CodeQL","Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#sast"}},{"name":"CII-Best-Practices","score":7,"reason":"badge detected: Silver","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#cii-best-practices"}},{"name":"Pinned-Dependencies","score":10,"reason":"all dependencies are pinned","details":["Info: Possibly incomplete results: error parsing shell code: invalid parameter name: .github/workflows/release.yml:88","Info: 116 out of 116 GitHub-owned GitHubAction dependencies pinned","Info:  77 out of  77 third-party GitHubAction dependencies pinned","Info:   2 out of   2 pipCommand dependencies pinned","Info:   2 out of   2 containerImage dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#pinned-dependencies"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact v3.13.0 not signed: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289689414","Warn: release artifact v3.12.3 not signed: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289682145","Warn: release artifact v3.12.2 not signed: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289675224","Warn: 
release artifact v3.12.1 not signed: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289498184","Warn: release artifact v3.12.0 not signed: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289482092","Warn: release artifact v3.13.0 does not have provenance: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289689414","Warn: release artifact v3.12.3 does not have provenance: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289682145","Warn: release artifact v3.12.2 does not have provenance: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289675224","Warn: release artifact v3.12.1 does not have provenance: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289498184","Warn: release artifact v3.12.0 does not have provenance: https://api.github.com/repos/shrec/UltrafastSecp256k1/releases/289482092"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#signed-releases"}},{"name":"Fuzzing","score":10,"reason":"project is fuzzed","details":["Info: CppLibFuzzer integration found: cpu/fuzz/fuzz_field.cpp:17","Info: CppLibFuzzer integration found: cpu/fuzz/fuzz_point.cpp:18","Info: CppLibFuzzer integration found: cpu/fuzz/fuzz_scalar.cpp:17"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: 
FSF or OSI recognized license: GNU Affero General Public License v3.0: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#license"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/release.yml:639"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#packaging"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#branch-protection"}},{"name":"Contributors","score":0,"reason":"project has 0 contributing companies or organizations -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project has a set of contributors from multiple organizations (e.g., 
companies).","url":"https://github.com/ossf/scorecard/blob/c22063e786c11f9dd714d777a687ff7c4599b600/docs/checks.md#contributors"}}]},"last_synced_at":"2026-02-24T09:13:09.175Z","repository_id":337883067,"created_at":"2026-02-24T09:13:09.212Z","updated_at":"2026-02-24T09:13:09.212Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33886153,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-03T02:00:06.370Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["android","arm64","bitcoin","constant-time","cryptography","cuda","ecdsa","embedded","ethereum","ffi","gpu-cryptography","ios","nodejs","opencl","python","riscv","rust","schnorr-signatures","secp256k1","webassembly"],"created_at":"2026-02-14T22:05:19.324Z","updated_at":"2026-06-04T01:00:50.295Z","avatar_url":"https://github.com/shrec.png","language":"C++","funding_links":["https://github.com/sponsors/shrec","https://paypal.me/IChkheidze","https://stacker.news/shrec"],"categories":[],"sub_categories":[],"readme":"# UltrafastSecp256k1\n\nUltrafastSecp256k1 is a high-performance, multi-backend secp256k1 engine with reproducible audit evidence, compatibility shims, and profile-based review scopes.\n\nIt is not a trust request. 
It is a verification package.\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"docs/repo_map_manifest.json\"\u003e\n    \u003cimg src=\"docs/assets/repo-map.svg\" alt=\"UltrafastSecp256k1 repository map: product profiles, scope boundaries, and primary paths\" width=\"100%\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"docs/ARCHITECTURE.md\"\u003e\n    \u003cimg src=\"docs/assets/ARCHITECTURE.svg\" alt=\"UltrafastSecp256k1 architecture: CPU engine, GPU engine, embedded, shim, bindings, and CAAS layers\" width=\"100%\"\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n[![Gate](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/gate.yml/badge.svg?branch=dev)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/gate.yml)\n[![CI](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/ci.yml/badge.svg?branch=dev)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/ci.yml)\n[![Security Audit](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/security-audit.yml/badge.svg?branch=dev)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/security-audit.yml)\n[![CAAS](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/caas.yml/badge.svg?branch=dev)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/caas.yml)\n[![SonarCloud](https://sonarcloud.io/api/project_badges/measure?project=shrec_UltrafastSecp256k1\u0026metric=alert_status)](https://sonarcloud.io/summary/new_code?id=shrec_UltrafastSecp256k1)\n[![CodeQL](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/codeql.yml/badge.svg?branch=dev)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/codeql.yml)\n[![OSSF 
Scorecard](https://api.securityscorecards.dev/projects/github.com/shrec/UltrafastSecp256k1/badge)](https://securityscorecards.dev/viewer/?uri=github.com/shrec/UltrafastSecp256k1)\n[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.19685027-blue.svg)](https://doi.org/10.5281/zenodo.19685027)\n\n\u003e All CI badges track the `dev` branch — the active development branch where all work happens. Releases are tagged on `main` only when explicitly authorized by the repository owner.\n\n---\n\n## What this repository contains\n\n- **Core engine** — CPU/GPU/embedded secp256k1 implementation (`src/cpu`, `src/cuda`, `src/opencl`, `src/metal`).\n- **Compatibility shims** — opt-in API-compatible paths for existing projects (`compat/libsecp256k1_shim`, `compat/libsecp256k1_bchn_shim`; not API-identical — see FAQ §drop-in for migration notes).\n- **Bindings/FFI** — language integration surfaces (`bindings/`): C, Python, Node.js, Go, Swift, Rust, Java, Dart, C#, WASM, Android.\n- **CAAS** — continuous audit and evidence system (`audit/`, `ci/`); not runtime code.\n- **Reviewer docs** — scoped evidence, known limitations, replay commands (`docs/`).\n\n\u003e **Bitcoin Core PR candidate — scope:** the proposed Bitcoin Core integration is the\n\u003e **CPU ECDSA / Schnorr (BIP-340) verify + sign backend only** — a compile-time *secondary*\n\u003e backend selected behind the existing libsecp256k1, **not a replacement**. Explicitly\n\u003e **out of scope** for that PR (and reviewed/used separately): the GPU backends\n\u003e (CUDA / Metal / OpenCL), WASM, embedded (ESP32/STM32) targets, the non-C++ bindings, and\n\u003e the protocol extensions (FROST, MuSig2, adaptor signatures, ECIES, BIP-352). 
Those are\n\u003e real features of the wider engine but are not part of the Core backend candidate.\n\n---\n\n## For Bitcoin Core Reviewers\n\n**Scope:** CPU secp256k1 backend only — ECDSA/Schnorr sign/verify, RFC 6979 nonce, DER parsing, constant-time signing, libsecp256k1-compatible shim. GPU, FFI, bindings, WASM, ZK, multi-coin, and wallet tooling are out of scope for this evaluation.\n\n**NOT A REPLACEMENT.** This PR adds an opt-in compile-time alternative backend (`-DSECP256K1_BACKEND=ultrafast`, default: `bundled`). When bundled, the build is byte-for-byte identical to today. The existing `src/secp256k1/` path and all existing behavior is unchanged.\n\n\u003e **No external third-party security audit has been performed.** All audit evidence is self-generated and independently reproducible via CAAS. See [SECURITY.md](SECURITY.md) §Audit Status.\n\n\u003e **Audit methodology:** CAAS (Continuous Automated Assurance System) — a multi-layer automated audit framework: LLVM ct-verif, Valgrind taint analysis, dudect statistical timing, 419-module unified runner with 270 exploit PoC tests.\n\n**Reproduce from patch (primary — stable):**\n```bash\n# Point UFSECP at an existing UltrafastSecp256k1 clone (absolute path).\n# Required because the patch sits inside docs/ of THIS repo, and on a\n# fresh Bitcoin Core clone src/ultrafast_secp256k1 does not exist yet —\n# `git -C src/ultrafast_secp256k1 ...` would fail before submodule init.\nUFSECP=/absolute/path/to/UltrafastSecp256k1\ngit clone https://github.com/bitcoin/bitcoin \u0026\u0026 cd bitcoin\ngit apply \"$UFSECP/docs/INTEGRATION_PATCH.patch\"\ngit submodule update --init src/ultrafast_secp256k1\ncmake --preset ultrafast-bench   # Release + LTO — required for accurate ConnectBlock numbers\ncmake --build out/build-ultrafast-lto -j$(nproc)\nctest --test-dir out/build-ultrafast-lto -j$(nproc)\n```\n\n**Reproduce from fork (alternative — may be rebased):**\n```bash\n# Fork branch may be rebased; prefer the patch path 
above for reproducibility.\ngit clone https://github.com/shrec/bitcoin -b feature/ultrafast-secp256k1-backend \u0026\u0026 cd bitcoin\ngit submodule update --init src/ultrafast_secp256k1\ncmake --preset ultrafast-bench   # Release + LTO\ncmake --build out/build-ultrafast-lto -j$(nproc)\nctest --test-dir out/build-ultrafast-lto -j$(nproc)\n```\n\n**CAAS evidence entry point:**\n```bash\npython3 ci/caas_runner.py --profile bitcoin-core-backend --json -o btc.json\n```\n\n→ [`docs/CAAS_REVIEWER_QUICKSTART.md`](docs/CAAS_REVIEWER_QUICKSTART.md) — start here  \n→ [`docs/BITCOIN_CORE_BACKEND_EVIDENCE.md`](docs/BITCOIN_CORE_BACKEND_EVIDENCE.md) — evidence package  \n→ [`docs/DER_PARITY_MATRIX.md`](docs/DER_PARITY_MATRIX.md) — DER/parser parity\n\n**CT signing (CT-vs-CT, production-equivalent, GCC 14.2.0, 2026-05-30):** **~1.33× ECDSA · ~1.26× Schnorr** vs libsecp256k1 (turbo lock CONFIRMED: intel_pstate/no_turbo=1, governor=performance, taskset -c 0 nice -20). Canonical data: [`docs/bench_unified_2026-05-30_gcc14_x86-64.json`](docs/bench_unified_2026-05-30_gcc14_x86-64.json). Full compiler breakdown: [docs/BITCOIN_CORE_BACKEND_EVIDENCE.md §CT Signing](docs/BITCOIN_CORE_BACKEND_EVIDENCE.md).\n\n\u003e **ConnectBlock (primary block-validation workload):** within ±1.5% of libsecp256k1 depending on build configuration.\n\u003e - With Release+LTO (GCC 14.2.0, **required for any positive result — without LTO the result is negative**): **+0.9–1.5%** across ConnectBlock aggregate profiles (AllEcdsa, AllSchnorr, Mixed)\n\u003e - VerifyScriptP2WPKH individual validation: **parity (Ultra ≤0.4% slower, within noise margin)**\n\u003e - Without LTO: **−0.5–1.0%** on all profiles. 
The earlier ~1.1% deficit was reduced after two\n\u003e   targeted fixes (PERF-002 redundant y²=x³+7 curve-check removal in commit `40697447`, and the\n\u003e   DER parser fast-path replacing the previous Scalar-construct round-trip); residual ~0.5–1.0%\n\u003e   no-LTO deficit is consistent with the size delta of the inlined hot-path\n\u003e   (**2,310 KB Ultra `.text` vs 1,261 KB libsecp256k1 `.text`, 1.83× — measured 2026-05-22**;\n\u003e   see [`docs/SHIM_FOOTPRINT_COMPARISON.md`](docs/SHIM_FOOTPRINT_COMPARISON.md)).\n\u003e   With LTO the cross-TU inliner co-optimises both sides and the deficit flips to a small\n\u003e   advantage. The `bitcoin-core` deployment profile (`cmake --preset bitcoin-core`) strips\n\u003e   FROST/ZK/ECIES/BIP-352/Adaptor/Wallet/Pippenger to save **359 KB `.text`** vs the full\n\u003e   profile; see [`ci/profiles.json`](ci/profiles.json) for the full module set.\n\u003e - Taproot key-path signing (wallet, not ConnectBlock): +10% faster (SignTransactionSchnorr)\n\u003e - Taproot script-path signing (wallet, not ConnectBlock): +35% faster (SignSchnorrWithMerkleRoot)\n\u003e - Canonical data: [`docs/BITCOIN_CORE_BENCH_RESULTS.json`](docs/BITCOIN_CORE_BENCH_RESULTS.json) (measured 2026-05-12, commit `48e7c02f`).\n\u003e - For reproducibility, use the commit SHA in `docs/BITCOIN_CORE_BENCH_RESULTS.json` field `\"backend_commit\"` — do not hardcode a SHA in prose.\n\n---\n\n## Review scope matters\n\nThe full repository is multi-platform and multi-product.\nThe Bitcoin Core evaluation profile is intentionally narrow:\n\n**CPU secp256k1 operations · libsecp256k1-compatible shim · parser/DER parity · nonce/RFC 6979 behavior · constant-time signing evidence · Core test and benchmark evidence.**\n\nGPU, FFI, bindings, WASM, ZK, wallet tooling, and alternate node shims are separate profiles.\n\n→ Scoped audit entry point: [`docs/CAAS_REVIEWER_QUICKSTART.md`](docs/CAAS_REVIEWER_QUICKSTART.md)  \n→ Profile definitions: 
[`docs/PRODUCT_PROFILES.md`](docs/PRODUCT_PROFILES.md)  \n→ Security claims: [`docs/SECURITY_CLAIMS.md`](docs/SECURITY_CLAIMS.md)\n\n---\n\n## Quick Start\n\n**Build from source**\n```bash\ngit clone https://github.com/shrec/UltrafastSecp256k1.git \u0026\u0026 cd UltrafastSecp256k1\npython3 ci/configure_build.py release\ncmake --build out/release -j\n./out/release/selftest    # Expected: \"ALL TESTS PASSED\"\n```\n\n**Package install**\n\nSee [docs/BUILDING.md](docs/BUILDING.md) for full install instructions.\nBuild from source (all platforms):\n```bash\ngit clone https://github.com/shrec/UltrafastSecp256k1 \u0026\u0026 cd UltrafastSecp256k1\ncmake -S . -B out/release -DCMAKE_BUILD_TYPE=Release \u0026\u0026 cmake --build out/release -j$(nproc)\n```\n\n→ [Full build guide](docs/BUILDING.md) · [API reference](docs/API_REFERENCE.md) · [Platform support](docs/CROSS_PLATFORM_TEST_MATRIX.md)\n\n---\n\n## Where to Start\n\n**New here? Start with one of these:**\n\n| Goal | Entry point |\n|---|---|\n| Independent reviewer / auditor | [docs/AUDITOR_QUICKSTART.md](docs/AUDITOR_QUICKSTART.md) |\n| Bitcoin Core evaluation | [docs/CAAS_REVIEWER_QUICKSTART.md](docs/CAAS_REVIEWER_QUICKSTART.md) |\n| Try to break the system | [docs/ATTACK_GUIDE.md](docs/ATTACK_GUIDE.md) |\n| Understand the security guarantees | [docs/SECURITY_CLAIMS.md](docs/SECURITY_CLAIMS.md) · [docs/AUDIT_TRACEABILITY.md](docs/AUDIT_TRACEABILITY.md) |\n| Replay the audit evidence locally | [docs/CAAS_PROTOCOL.md](docs/CAAS_PROTOCOL.md) |\n| Integrate into your project | [docs/API_REFERENCE.md](docs/API_REFERENCE.md) · [docs/BUILDING.md](docs/BUILDING.md) |\n\n**Full navigation:**\n\n| If you want to… | Go here |\n|---|---|\n| Run the audit | [docs/AUDIT_GUIDE.md](docs/AUDIT_GUIDE.md) |\n| Try to break the system | [docs/ATTACK_GUIDE.md](docs/ATTACK_GUIDE.md) |\n| Understand the guarantees | [docs/AUDIT_TRACEABILITY.md](docs/AUDIT_TRACEABILITY.md) |\n| Audit philosophy \u0026 design rationale | 
[docs/AUDIT_PHILOSOPHY.md](docs/AUDIT_PHILOSOPHY.md) |\n| Audit methodology specification (CAAS) | [docs/AUDIT_STANDARD.md](docs/AUDIT_STANDARD.md) |\n| Independent reviewer quick start | [docs/AUDITOR_QUICKSTART.md](docs/AUDITOR_QUICKSTART.md) |\n| Historical audit report (v3.9.0 baseline — ⚠ not current state) | [AUDIT_REPORT.md](docs/AUDIT_REPORT.md) |\n| Live audit dashboard | [docs/AUDIT_DASHBOARD.md](docs/AUDIT_DASHBOARD.md) |\n| Exploit PoC test catalog | [docs/EXPLOIT_TEST_CATALOG.md](docs/EXPLOIT_TEST_CATALOG.md) |\n| Exploit coverage map | [docs/EXPLOIT_COVERAGE_MAP.md](docs/EXPLOIT_COVERAGE_MAP.md) |\n| ECDSA edge-case coverage | [docs/ECDSA_EDGE_CASE_COVERAGE.md](docs/ECDSA_EDGE_CASE_COVERAGE.md) |\n| Interop matrix (cross-implementation) | [docs/INTEROP_MATRIX.md](docs/INTEROP_MATRIX.md) |\n| Threat model | [docs/THREAT_MODEL.md](docs/THREAT_MODEL.md) |\n| CAAS protocol (continuous audit) | [docs/CAAS_PROTOCOL.md](docs/CAAS_PROTOCOL.md) |\n| Multi-CI reproducible builds | [docs/MULTI_CI_REPRODUCIBLE_BUILD.md](docs/MULTI_CI_REPRODUCIBLE_BUILD.md) |\n| Supply-chain local parity | [docs/SUPPLY_CHAIN_LOCAL_PARITY.md](docs/SUPPLY_CHAIN_LOCAL_PARITY.md) |\n| Hardware side-channel methodology | [docs/HARDWARE_SIDE_CHANNEL_METHODOLOGY.md](docs/HARDWARE_SIDE_CHANNEL_METHODOLOGY.md) |\n| Compliance stance | [docs/COMPLIANCE_STANCE.md](docs/COMPLIANCE_STANCE.md) |\n| Security autonomy program | [docs/SECURITY_AUTONOMY_PLAN.md](docs/SECURITY_AUTONOMY_PLAN.md) |\n| Research monitor | [docs/RESEARCH_MONITOR.md](docs/RESEARCH_MONITOR.md) |\n| ⚖️ Reviewer role prompts | [docs/REVIEWER_PROMPTS/README.md](docs/REVIEWER_PROMPTS/README.md) |\n| Backend assurance matrix | [docs/BACKEND_ASSURANCE_MATRIX.md](docs/BACKEND_ASSURANCE_MATRIX.md) |\n| CI gating policy | [docs/CI_GATING_POLICY.md](docs/CI_GATING_POLICY.md) |\n| ABI layer routing matrix | [docs/LAYER_ROUTING_MATRIX.md](docs/LAYER_ROUTING_MATRIX.md) |\n| Build guide | [docs/BUILDING.md](docs/BUILDING.md) |\n| C ABI 
/ FFI reference | [docs/API_REFERENCE.md](docs/API_REFERENCE.md) |\n| Community benchmarks | [docs/COMMUNITY_BENCHMARKS.md](docs/COMMUNITY_BENCHMARKS.md) |\n| Architecture overview | [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) |\n| Security claims \u0026 contracts | [docs/SECURITY_CLAIMS.md](docs/SECURITY_CLAIMS.md) |\n| Secret lifecycle (zeroization, CT) | [docs/SECRET_LIFECYCLE.md](docs/SECRET_LIFECYCLE.md) |\n| Cryptographic invariants | [docs/CRYPTO_INVARIANTS.md](docs/CRYPTO_INVARIANTS.md) |\n| Thread-safety guarantees | [docs/THREAD_SAFETY.md](docs/THREAD_SAFETY.md) |\n| Safe defaults | [docs/SAFE_DEFAULTS.md](docs/SAFE_DEFAULTS.md) |\n| Differential testing | [docs/DIFFERENTIAL_TESTING.md](docs/DIFFERENTIAL_TESTING.md) |\n| Reproducible builds | [docs/REPRODUCIBLE_BUILDS.md](docs/REPRODUCIBLE_BUILDS.md) |\n| Incident response | [docs/INCIDENT_RESPONSE.md](docs/INCIDENT_RESPONSE.md) |\n| Install packages | [Installation](#installation) |\n| Why this library? | [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md) |\n| Cite this work | [CITATION.cff](CITATION.cff) |\n| Production adopters | [docs/ADOPTION.md](docs/ADOPTION.md) |\n| Funding \u0026 grant programmes | [docs/FUNDING_TARGETS.md](docs/FUNDING_TARGETS.md) |\n| Sponsor | [github.com/sponsors/shrec](https://github.com/sponsors/shrec) |\n\n\u003e **Claim map:** [docs/ASSURANCE_LEDGER.md](docs/ASSURANCE_LEDGER.md) · **Security policy:** [SECURITY.md](SECURITY.md) · **Discord:** [discord.gg/E4BK8SeMYU](https://discord.gg/E4BK8SeMYU)\n\n---\n\n## Review culture\n\nI welcome negative review.\n\nIf you find a real issue, please open it with a reproducer or a clear test case.\nValid findings are fixed, credited, and turned into permanent regression coverage.\n\nThe goal is not to defend the code.\nThe goal is to make the system stronger.\n\n→ Security policy: [SECURITY.md](SECURITY.md) · Exploit catalog: [docs/EXPLOIT_TEST_CATALOG.md](docs/EXPLOIT_TEST_CATALOG.md) · Residual risks: 
[docs/RESIDUAL_RISK_REGISTER.md](docs/RESIDUAL_RISK_REGISTER.md)\n\n---\n\n## Cite this work\n\nIf you use UltrafastSecp256k1 in academic work, please cite:\n\n- **Citation metadata** — [CITATION.cff](CITATION.cff) (also exposed via GitHub's \"Cite this repository\" button)\n- **Zenodo metadata** — [.zenodo.json](.zenodo.json)\n- **DOI (Zenodo release)** — [10.5281/zenodo.19685027](https://doi.org/10.5281/zenodo.19685027)\n\n[![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.19685027-blue.svg)](https://doi.org/10.5281/zenodo.19685027)\n\n---\n\n## Why This Exists\n\nTraditional model: `code → audit PDF → trust`\n\nThis project: `code → test → execution → evidence → continuous verification`\n\nWe do not rely on trust. We provide reproducible evidence.\n\n- Every exploit attempt becomes a permanent regression test\n- Every commit runs ≈600K explicitly itemized field/scalar/point/CT assertions (plus full-suite KAT/differential/fuzz checks, not individually counted) across 149 non-exploit audit modules and 270 exploit PoCs (419 modules total; count via `python3 ci/sync_module_count.py`; canonical data: `docs/canonical_data.json`)\n- Every claim maps to a test in [docs/AUDIT_TRACEABILITY.md](docs/AUDIT_TRACEABILITY.md)\n- Every performance number has pinned compiler/driver/toolkit versions and raw logs\n\n\u003e If a claim cannot be traced to a test, it is not valid.\n\nFor the full breakdown of the audit culture, CI/CD pipeline, formal verification layers, and supply-chain hardening, see [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md).\n\n---\n\n\u003c!-- Keywords (machine-readable, SEO):\nsecp256k1 · secp256k1 python · secp256k1 nodejs · secp256k1 rust · secp256k1 go · secp256k1 c# · secp256k1 java · secp256k1 swift · ecdsa python · schnorr python · bitcoin python library · ethereum python library · ECDSA batch verify · Schnorr BIP-340 · FROST threshold signatures · MuSig2 · Bitcoin cryptography · CUDA secp256k1 · OpenCL ECC · Metal GPU crypto · 
BIP-352 Silent Payments · constant-time cryptography · embedded ECC · WebAssembly crypto · elliptic curve cryptography · bitcoin wallet library · taproot signatures · BIP-32 HD keys · BIP-39 mnemonic · adaptor signatures · Pedersen commitment · Bulletproof range proof · react native crypto · python ecdsa signing · node.js secp256k1 · GPU elliptic curve · wasm crypto · mobile crypto library · threshold signature scheme · multi-party computation · pip install secp256k1 · npm secp256k1 · cargo secp256k1 · nuget secp256k1 · ufsecp\n--\u003e\n\n## The Audit Model\n\nMost libraries ship fast code and trust it's correct.\nThis library ships fast code — then systematically tries to break it, on every commit, permanently.\n\n- New CVE published → PoC written → CI gate added → runs forever\n- New ePrint attack → evaluated within 1 day → permanent regression test\n- Contributor finds exploit → pull request → built into the system\n\n[How it works](#engineering-quality--self-audit-culture) · [The standard](docs/AUDIT_STANDARD.md)\n\n---\n\n## Performance Snapshot\n\nBenchmark numbers and historical milestones are maintained in [`docs/BENCHMARKS.md`](docs/BENCHMARKS.md) with pinned compiler/driver/toolkit versions, raw logs, and methodology notes.\n\n\u003e All performance claims in this README link to that document. Do not rely on inline numbers without checking the corresponding benchmark entry for hardware, batch size, and measurement conditions.\n\u003e\n\u003e Canonical raw data (GCC 14.2.0, 2026-05-30): [`docs/bench_unified_2026-05-30_gcc14_x86-64.json`](docs/bench_unified_2026-05-30_gcc14_x86-64.json)\n\n## Why UltrafastSecp256k1? — Detail\n\n\u003e TL;DR is above. 
This section covers what differentiates this library in depth.\n\n- **Continuous adversarial audit system** -- every exploit attempt becomes a permanent regression test; ≈600K explicitly itemized field/scalar/point/CT assertions (plus full-suite KAT/differential/fuzz checks, not individually counted) per release evidence run, 270 exploit PoC runner modules in `unified_audit_runner.cpp` (some source files contain multiple registered test functions; all wired, verified by `ci/check_exploit_wiring.py`) across 200+ attack vectors, a block-based PR/push gate, release CAAS gate, and manual deep-assurance workflows — security hardens through executable evidence, not snapshot PDFs ([→ how it works](#engineering-quality--self-audit-culture))\n- **High-performance CPU secp256k1 engine** -- optimized generator multiply, scalar multiply, hashing, and serialization pipelines across x86-64, ARM64, RISC-V, and embedded targets ([see bench_unified ratio table](docs/BENCHMARKS.md))\n- **Built for modern secp256k1 workloads** -- signing, verification, wallet derivation, threshold protocols, adaptor signatures, ZK primitives, address generation, and large-scale public-key pipelines in one engine\n- **Dual-layer security** -- variable-time FAST path for throughput, constant-time CT path for secret-key operations\n- **Minimal dependencies** -- No runtime library dependencies for the CPU-only build (no Boost, no OpenSSL). GPU builds require CUDA toolkit, OpenCL runtime, or Metal SDK. 
Build requires CMake 3.18+ and a C++17 compiler (GCC 10+, Clang 12+, MSVC 2019+, arm-none-eabi, Emscripten)\n- **12+ platforms** -- x86-64, ARM64, RISC-V, WASM (experimental — CT evidence incomplete), iOS, Android, ESP32, STM32, CUDA, Metal, OpenCL, plus an early-development ROCm/HIP compatibility path slated for hardware-backed validation\n\n\u003e **The following capabilities are out of scope for the Bitcoin Core CPU backend evaluation profile:**\n\n- **Differentiated GPU secp256k1 surface** -- CUDA, OpenCL, and Metal all implement the stable 16-op GPU C ABI, while CUDA also carries the highest-throughput signing and verification kernels plus **GPU FROST partial verification** ([reproducible benchmark suite and raw logs](docs/BENCHMARKS.md))\n- **BIP-352 Silent Payments GPU pipeline** -- the full 7-stage GPU pipeline (k×P → hash → k×G → add → match) on CUDA; throughput and CPU comparison: [GPU bench](docs/BENCHMARKS.md), [standalone CPU benchmark by @craigraw](https://github.com/craigraw/bench_bip352)\n- **Field-tested GPU pipeline** -- the CUDA engine has been stress-tested in live high-throughput workflows over long-running sessions and very large point volumes, not only in short synthetic benchmarks\n- **Known production adoption** -- publicly disclosed production use includes [SparrowWallet Frigate](https://github.com/sparrowwallet/frigate), with permission to publish the adoption note from Craig Raw (adoption evidence as of 2026-03-29 — verify against current Frigate README for latest status)\n\n\u003e **Benchmark reproducibility:** All numbers come from pinned compiler/driver/toolkit versions with exact commands and raw logs. 
See [`docs/BENCHMARKS.md`](docs/BENCHMARKS.md) (methodology) and the [live dashboard](https://shrec.github.io/UltrafastSecp256k1/dev/bench/).\n\n\u003e **Why this library, in depth?** See [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md) for a full breakdown of the audit culture, block-based CI/CD pipeline, graph-assisted review model, formal verification layers, and supply-chain hardening that back these claims.\n\n\u003e **Evidence replay prep:** Run `bash ci/external_audit_prep.sh` to produce a reproducible reviewer-facing bundle with preflight outputs, assurance export, traceability artifacts, and an optional full evidence package.\n\n\u003e **Claim map:** Top-level trust claims are keyed in [docs/ASSURANCE_LEDGER.md](docs/ASSURANCE_LEDGER.md): CPU CT routing `A-001`, stable GPU ABI `A-002`, cross-backend GPU parity `A-003`, benchmark reproducibility `A-004`, exploit-audit surface `A-005`, graph-assisted review `A-006`, open self-audit transparency `A-007`, and ROCm/HIP status discipline `A-008`.\n\n**Quick links:** [Discord](https://discord.gg/E4BK8SeMYU) * [Benchmarks](docs/BENCHMARKS.md) * [Community Benchmarks](docs/COMMUNITY_BENCHMARKS.md) * [Adopters](docs/ADOPTION.md) * [Build Guide](docs/BUILDING.md) * [API Reference](docs/API_REFERENCE.md) * [Binding Usage Standard](docs/BINDINGS_USAGE_STANDARD.md) * [Security Policy](SECURITY.md) * [Threat Model](docs/THREAT_MODEL.md) * [Assurance Ledger](docs/ASSURANCE_LEDGER.md) * [AI Audit Protocol](docs/AI_AUDIT_PROTOCOL.md) * [Audit Standard (CAAS)](docs/AUDIT_STANDARD.md) * [**Why This Library?**](docs/WHY_ULTRAFASTSECP256K1.md) * [Porting Guide](docs/PORTING.md) * [**Sponsor**](https://github.com/sponsors/shrec)\n\n### Real-world Adoption\n\nUltrafastSecp256k1 is used by [Sparrow Wallet's Frigate](https://github.com/sparrowwallet/frigate).\n\nFrigate 1.4.0 switched its DuckDB extension to `ufsecp.duckdb_extension` using UltrafastSecp256k1, and its README documents a custom DuckDB extension wrapping 
UltrafastSecp256k1 for `ufsecp_scan(...)`-based Silent Payments scanning with CUDA, OpenCL and Metal backend support. *(as of Frigate 1.4.0, 2026-03-29 — verify against current Frigate README for latest status)*\n\nSee: [Frigate 1.4.0 release](https://github.com/sparrowwallet/frigate/releases/tag/1.4.0) · [Frigate README](https://github.com/sparrowwallet/frigate/blob/master/README.md) · [Details →](docs/ADOPTION.md)\n\nPackage traction *(snapshot 2026-03-29 — see linked package pages for current figures)*: [`ufsecp`](https://www.npmjs.com/package/ufsecp) 1,192 npm downloads/30d · [`react-native-ufsecp`](https://www.npmjs.com/package/react-native-ufsecp) 1,295/30d · [`Ufsecp`](https://www.nuget.org/packages/Ufsecp) 1,491 NuGet total. *(See [docs/ADOPTION.md](docs/ADOPTION.md) for full adoption evidence.)*\n\nFull adopter list: [ADOPTERS.md](docs/ADOPTERS.md)\n\n---\n\n[![GitHub stars](https://img.shields.io/github/stars/shrec/UltrafastSecp256k1?style=flat-square\u0026logo=github\u0026label=Stars)](https://github.com/shrec/UltrafastSecp256k1/stargazers)\n[![GitHub forks](https://img.shields.io/github/forks/shrec/UltrafastSecp256k1?style=flat-square\u0026logo=github\u0026label=Forks)](https://github.com/shrec/UltrafastSecp256k1/network/members)\n[![Gate](https://img.shields.io/github/actions/workflow/status/shrec/UltrafastSecp256k1/gate.yml?branch=dev\u0026label=Gate)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/gate.yml)\n[![Research](https://img.shields.io/github/actions/workflow/status/shrec/UltrafastSecp256k1/research-monitor.yml?branch=dev\u0026label=Research)](https://github.com/shrec/UltrafastSecp256k1/actions/workflows/research-monitor.yml)\n[![Release](https://img.shields.io/github/v/release/shrec/UltrafastSecp256k1?label=Release)](https://github.com/shrec/UltrafastSecp256k1/releases/latest)\n[![License: 
MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![C++17+](https://img.shields.io/badge/C%2B%2B-17%2B-blue.svg)](https://en.cppreference.com/w/cpp/17)\n[![OpenSSF Scorecard](https://api.scorecard.dev/projects/github.com/shrec/UltrafastSecp256k1/badge)](https://scorecard.dev/viewer/?uri=github.com/shrec/UltrafastSecp256k1)\n[![OpenSSF Best Practices](https://www.bestpractices.dev/projects/12011/badge)](https://www.bestpractices.dev/projects/12011)\n[![SonarCloud](https://sonarcloud.io/api/project_badges/measure?project=shrec_UltrafastSecp256k1\u0026metric=security_rating)](https://sonarcloud.io/summary/overall?id=shrec_UltrafastSecp256k1)\n[![Discord](https://img.shields.io/badge/Discord-Join%20Us-5865F2?logo=discord\u0026logoColor=white)](https://discord.gg/E4BK8SeMYU)\n\n**Supported Blockchains (secp256k1-based):**\n\n[![Bitcoin](https://img.shields.io/badge/Bitcoin-BTC-F7931A.svg?logo=bitcoin\u0026logoColor=white)](https://bitcoin.org)\n[![Ethereum](https://img.shields.io/badge/Ethereum-ETH-3C3C3D.svg?logo=ethereum\u0026logoColor=white)](https://ethereum.org)\n[![Litecoin](https://img.shields.io/badge/Litecoin-LTC-A6A9AA.svg?logo=litecoin\u0026logoColor=white)](https://litecoin.org)\n[![Dogecoin](https://img.shields.io/badge/Dogecoin-DOGE-C2A633.svg?logo=dogecoin\u0026logoColor=white)](https://dogecoin.com)\n[![Bitcoin Cash](https://img.shields.io/badge/Bitcoin%20Cash-BCH-8DC351.svg?logo=bitcoincash\u0026logoColor=white)](https://bitcoincash.org)\n[![Zcash](https://img.shields.io/badge/Zcash-ZEC-F4B728.svg)](https://z.cash)\n[![Dash](https://img.shields.io/badge/Dash-DASH-008CE7.svg?logo=dash\u0026logoColor=white)](https://dash.org)\n[![BNB 
Chain](https://img.shields.io/badge/BNB%20Chain-BNB-F0B90B.svg?logo=binance\u0026logoColor=white)](https://www.bnbchain.org)\n[![Polygon](https://img.shields.io/badge/Polygon-MATIC-8247E5.svg?logo=polygon\u0026logoColor=white)](https://polygon.technology)\n[![Avalanche](https://img.shields.io/badge/Avalanche-AVAX-E84142.svg?logo=avalanche\u0026logoColor=white)](https://avax.network)\n[![Arbitrum](https://img.shields.io/badge/Arbitrum-ARB-28A0F0.svg)](https://arbitrum.io)\n[![Optimism](https://img.shields.io/badge/Optimism-OP-FF0420.svg)](https://optimism.io)\n[![+15 more](https://img.shields.io/badge/+15%20more-secp256k1%20coins-grey.svg)](#secp256k1-supported-coins-27-blockchains)\n\n**GPU \u0026 Platform Support:**\n\n[![CUDA](https://img.shields.io/badge/CUDA-12.0+-green.svg)](https://developer.nvidia.com/cuda-toolkit)\n[![OpenCL](https://img.shields.io/badge/OpenCL-3.0-green.svg)](https://www.khronos.org/opencl/)\n[![Apple Silicon](https://img.shields.io/badge/Apple%20Silicon-M1%2FM2%2FM3%2FM4-black.svg?logo=apple)](src/metal/)\n[![Metal](https://img.shields.io/badge/Metal-GPU%20Compute-silver.svg?logo=apple)](src/metal/)\n[![ROCm](https://img.shields.io/badge/ROCm-6.3%20HIP-red.svg)](src/cuda/README.md)\n[![WebAssembly](https://img.shields.io/badge/WebAssembly-Emscripten-purple.svg)](bindings/wasm/)\n[![ARM64](https://img.shields.io/badge/ARM64-Cortex--A55%2FA76-orange.svg)](https://developer.android.com/ndk)\n[![RISC-V](https://img.shields.io/badge/RISC--V-RV64GC-orange.svg)](https://riscv.org/)\n[![Android](https://img.shields.io/badge/Android-NDK%20r27-brightgreen.svg)](bindings/android/)\n[![iOS](https://img.shields.io/badge/iOS-17%2B%20XCFramework-lightgrey.svg)](cmake/ios.toolchain.cmake)\n[![ESP32-S3](https://img.shields.io/badge/ESP32--S3-Xtensa%20LX7-orange.svg)](https://www.espressif.com/en/products/socs/esp32-s3)\n[![ESP32](https://img.shields.io/badge/ESP32-Xtensa%20LX6-orange.svg)](https://www.espressif.com/en/products/socs/esp32)\n[![ESP32-C6](htt
ps://img.shields.io/badge/ESP32--C6-RISC--V%20RV32-orange.svg)](https://www.espressif.com/en/products/socs/esp32-c6)\n[![ESP32-P4](https://img.shields.io/badge/ESP32--P4-RISC--V%20HP-orange.svg)](https://www.espressif.com/en/products/socs/esp32-p4)\n[![STM32](https://img.shields.io/badge/STM32-Cortex--M3-orange.svg)](https://www.st.com/en/microcontrollers-microprocessors/stm32f103ze.html)\n\n---\n\n## Highlights\n\n- **BIP-352 GPU pipeline** -- full silent payment scanning pipeline on CUDA; benchmark and CPU comparison in [docs/BENCHMARKS.md](docs/BENCHMARKS.md)\n- **GPU-accelerated secp256k1** -- high-throughput CUDA verification kernels, batch ECDH, BIP-352 scanning, and BIP-324 encryption on CUDA/OpenCL/Metal; CT-sensitive signing always routes through the CPU CT layer; GPU operations that handle secret material (ECDH, BIP-352, BIP-324) require a trusted single-tenant environment (see [GPU Security Model](docs/BACKEND_ASSURANCE_MATRIX.md))\n- **GPU C ABI (`ufsecp_gpu`)** -- stable 16-op FFI for GPU batch ops across CUDA, OpenCL, and Metal, with full backend parity on the public surface\n- **Zero-Knowledge cryptographic layer** -- Pedersen commitments, DLEQ proofs, Bulletproof range proofs, Ethereum-compatible Keccak-256\n- **Batch operations** -- all-affine Pippenger with touched-bucket optimization; see [docs/BENCHMARKS.md](docs/BENCHMARKS.md) for measured throughput\n- **Multi-language bindings** -- Python (`pip install ufsecp`), Node.js (`npm i ufsecp`), Rust, Go, C#/.NET, Java, Swift, PHP, Ruby, Dart, React Native — all via the stable C ABI\n- **Embedded device support** -- ESP32-S3, ESP32-P4, ESP32-C6, STM32 Cortex-M\n- **Zero-dependency portable core** -- no Boost, no OpenSSL for the CPU-only build; GPU builds require CUDA toolkit, OpenCL runtime, or Metal SDK; compiles anywhere from server-class GPUs to bare-metal microcontrollers\n- **Massively parallel workloads** -- batch verification, key scanning, address generation at GPU scale\n\n---\n\n## 
Engineering Quality \u0026 Self-Audit Culture\n\n\u003e Most high-performance cryptographic libraries ship fast code and trust that it is correct.\n\u003e UltrafastSecp256k1 ships fast code **and then systematically tries to break it**.\n\u003e The internal self-audit system was designed in parallel with the cryptographic implementation as a first-class engineering artifact — not bolted on afterwards.\n\nThe governing idea is Bitcoin-style: **don't trust, verify**. The project does not treat assurance as a PDF milestone that must be waited on before the next improvement. Instead, it treats auditability as an always-on property of the repository: reproducible builds, rerunnable tests, structured artifacts, graph-backed code navigation, and continuous adversarial review that anyone can repeat.\n\nThis top-level narrative maps directly to the assurance ledger: CT secret-key routing (`A-001`), exploit-style audit coverage (`A-005`), graph-assisted review (`A-006`), and self-audit transparency (`A-007`).\n\n### By the Numbers\n\n| Metric | Value |\n|--------|-------|\n| Internal audit assertions per build | **≈600K explicitly itemized** field/scalar/point/CT (see [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md)), plus full-suite KAT/differential/fuzz checks (not individually counted) |\n| Audit modules (`unified_audit_runner`) | **149 non-exploit modules + 270 exploit PoCs across 10 sections, 0 mandatory failures** (see [docs/AUDIT_COVERAGE.md](docs/AUDIT_COVERAGE.md) for advisory cluster status) |\n| Exploit PoC test files | **270 exploit-PoC modules (258 source files), 20+ coverage areas, 0 mandatory failures** |\n| CI/CD workflows | **50+ GitHub Actions workflows** |\n| Build matrix (arch × config × OS) | **7 × 17 × 5 = 595 theoretical combinations** (actual CI matrix is a subset — see `.github/workflows/` for exact matrix) |\n| Differential tests (per push + manual) | **~1,300,000+ checks per deep-assurance run** |\n| Constant-time verification pipelines 
| **5 independent: 3 available as GitHub Actions workflows (`ct-verif.yml`, `valgrind-ct.yml`, `ct-prover.yml`) — triggered manually or on release tag push, not on every commit push; 2 manual/local: dudect statistical, ARM64 native** |\n| Fuzzing adversarial corpus | **libFuzzer + ClusterFuzz-Lite (see `.clusterfuzzlite/` and `src/cpu/fuzz/`; corpus count grows with CI runs and is not stored in-repo)** |\n| Static analysis tools | **7 (CodeQL, Clang-Tidy, CPPCheck, SonarCloud, Semgrep, Infer, Clang-SA)** |\n| Self-audit documents in repo | see [`docs/`](docs/) directory |\n| Self-tests passing (all backends) | **76/76** (reproduce: `./out/release/selftest` → \"ALL TESTS PASSED\"; observed on the 4.1.0 release build) |\n\n### CI/CD Pipeline Highlights\n\n| Workflow | Purpose | Trigger |\n|----------|---------|---------|\n| `gate.yml` | Block-based PR/push gate: impact detection, fast CAAS gates, selected profile checks, final verdict | Push / PR |\n| `release.yml` | Release CAAS gate before build/package fan-out, then full release packaging | Tags / manual |\n| `research-monitor.yml` | External research/CVE/paper intake; opens issues only for high-confidence signals | Scheduled / manual |\n| Manual deep-assurance workflows | CT-Verif, Valgrind CT, sanitizers, fuzzing, mutation, benchmarks, GPU, CodeQL, Scorecard | Manual / release policy |\n\n### What \"Self-Audit Culture\" Means in Practice\n\n- Every field arithmetic property is verified algebraically: commutativity, associativity, distributivity, carry propagation, canonical form\n- Every constant-time path is verified under **5 independent pipelines: LLVM ct-verif (`ct-verif.yml`), Valgrind taint (`valgrind-ct.yml`), ct-prover/sPIN (`ct-prover.yml`) — available as GitHub Actions workflows, triggered manually or on release tag push; dudect (statistical) and ARM64 native run locally/manually**\n- Every ECDSA/Schnorr implementation is cross-validated against **Wycheproof vectors, independent reference golden 
vectors, and BIP test vectors**\n- Performance evidence is tracked through manual/release deep-assurance workflows instead of every-push benchmark fan-out\n- Audit results are logged as **structured artifacts** (JSON reports, per-platform logs), not just pass/fail signals\n- Differential tests run on every push and via manual deep-assurance workflows; no separate nightly schedule\n- All 149 non-exploit audit modules and all 270 exploit PoCs return `AUDIT-READY (self-generated)` status as of the last CAAS gate run. Zero failures — see pinned evidence: [`docs/EXTERNAL_AUDIT_BUNDLE.json`](docs/EXTERNAL_AUDIT_BUNDLE.json).\n\n### Exploit PoC Test Suite (270 Tests, 20+ Coverage Areas)\n\nIn addition to the 419-module `unified_audit_runner`, UltrafastSecp256k1 ships **270 exploit-style PoC modules files** that actively try to break the library across its highest-risk surfaces. Each `audit/test_exploit_*.cpp` target builds and runs standalone so failures stay easy to attribute and reproduce.\n\n| Coverage Area | Representative attack focus |\n|---------------|-----------------------------|\n| ECDSA / Signature | malleability, RFC 6979 KATs, recovery edge cases |\n| Schnorr / BIP-340 / Batch | batch soundness, forged signatures, invalid identification paths |\n| GLV / ECC Math | endomorphism invariants, multiscalar correctness, Pippenger behavior |\n| BIP-32 / BIP-39 / HD Keys | path overflow, hardened isolation, mnemonic and derivation edge cases |\n| MuSig2 / FROST | nonce reuse, transcript fork equivocation, stale commitment replay, rogue-key aggregation, Byzantine participants, DKG and Lagrange edge cases |\n| Adaptor Signatures / ZK | adaptor parity attacks, Pedersen invariants, malformed ZK proofs |\n| Crypto Primitives / AEAD | ChaCha20-Poly1305 integrity, HKDF, SHA/Keccak/RIPEMD KATs |\n| ECIES | authentication forgery, encryption correctness, roundtrip safety |\n| Bitcoin / Protocol BIPs | BIP-143, BIP-144, BIP-324, SegWit, Taproot protocol edge cases |\n| Address 
/ Wallet / Signing | address encoding, wallet API misuse, Ethereum and Bitcoin signing flows |\n| Constant-Time / Security | CT divergence, key-recovery style probes, backend divergence detection |\n| ElligatorSwift | encoding correctness and ECDH roundtrips |\n| Self-Test / Recovery | self-test API behavior and recovery boundary cases |\n| Batch Verify | aggregate verification math correctness |\n\n\u003e All 270 registered exploit-PoC modules live in `audit/test_exploit_*.cpp` (258 source files; some files register multiple modules). Build with `python3 ci/configure_build.py audit` (or `cmake -S . -B out/audit -G Ninja -DCMAKE_BUILD_TYPE=Release`) and run them standalone or via `ctest`.\n\n### Self-Audit Document Index\n\n| Document | Contents |\n|----------|---------|\n| [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md) | Full audit infrastructure, CI pipeline index, formal verification evidence |\n| [docs/AUDIT_PHILOSOPHY.md](docs/AUDIT_PHILOSOPHY.md) | Audit philosophy, continuous evidence model, design rationale, common objections answered |\n| [AUDIT_REPORT.md](docs/AUDIT_REPORT.md) | Historical baseline audit (641,194 core checks). 
Live module count comes from `docs/canonical_data.json` (regenerated from `audit/unified_audit_runner.cpp` ALL_MODULES[]) |\n| [AUDIT_COVERAGE.md](docs/AUDIT_COVERAGE.md) | Per-module coverage matrix |\n| [THREAT_MODEL.md](docs/THREAT_MODEL.md) | Layer-by-layer risk analysis |\n| [SECURITY.md](SECURITY.md) | Vulnerability disclosure policy |\n| [docs/AUDIT_GUIDE.md](docs/AUDIT_GUIDE.md) | Navigation guide for independent reviewers |\n| [docs/CI_ENFORCEMENT.md](docs/CI_ENFORCEMENT.md) | Full CI enforcement policy |\n| [docs/BACKEND_ASSURANCE_MATRIX.md](docs/BACKEND_ASSURANCE_MATRIX.md) | Per-backend assurance matrix |\n| [docs/AUDIT_TRACEABILITY.md](docs/AUDIT_TRACEABILITY.md) | Requirement-to-test traceability map |\n\n\u003e The assurance model is open self-audit: reproducible tests, traceability, CI enforcement, and public review artifacts that anyone can rerun.\n\u003e The project hardens continuously through internal audit on every build and every commit.\n\n---\n\n## Performance\n\n\u003cdetails\u003e\n\u003csummary\u003eGPU Performance (diagnostic — out of scope for Bitcoin Core backend evaluation)\u003c/summary\u003e\n\n\u003e GPU throughput numbers are intentionally not published in this README.\n\u003e\n\u003e **Why:** CLAUDE.md ABSOLUTE rule — every benchmark number must come from a measurement on the current machine and current binary; \"diagnostic\" or \"not verified against current build\" annotations on concrete numbers violate that rule even with a label. The previous RTX 5060 Ti table mixed live diagnostic figures with explicitly-stale figures; replacing both with this single pointer keeps the README honest.\n\u003e\n\u003e **Where to find the current numbers if you need them:**\n\u003e\n\u003e 1. Build and run `bench_unified --gpu` on your own hardware (the binary covers the same kernel surface the old table reported on).\n\u003e 2. 
The benchmark methodology is documented in `docs/BENCHMARKS.md` (CPU section); the same controlled-run discipline applies to GPU runs (CPU pinning is not relevant for GPU, but turbo-state and PCIe-state pinning are — see the `--gpu-info` flag).\n\u003e 3. Canonical bench artifacts live under `docs/bench_unified_*.json`; no GPU artifact is currently checked in because no controlled GPU run has been committed since the v4.0.0 baseline. When one is, this section will be updated to reference it (and `canonical_numbers.json` will carry the ratios via the same sync pipeline that handles CPU numbers).\n\u003e\n\u003e GPU correctness coverage IS published — see the `BACKEND_ASSURANCE_MATRIX.md` for the CT-clean status of each kernel, and the unified runner's `gpu-*` advisory modules for kernel-level invariant checks.\n\n\u003c/details\u003e\n\n## Architecture\n\n```\n+-------------------------------------------------------+\n|              Language Bindings (FFI)                   |\n|  Python | Node | Rust | Go | C# | Java | Swift | PHP |\n+-------------------------------------------------------+\n                         |\n                  Bindings Layer\n                 (ctypes / koffi / cgo\n                  JNA / P/Invoke / FFI)\n                         |\n+-------------------------------------------------------+\n|          UltrafastSecp256k1 Core (C++20)               |\n|                                                       |\n|  ECDSA | Schnorr | ECDH | MuSig2 | FROST | Pedersen  |\n|  Taproot | BIP-32 HD | Adaptor Sigs | ZK Proofs       |\n|  [FAST layer]              [CT layer]                 |\n+-------------------------------------------------------+\n                         |\n+--------+---------+---------+---------+----------------+\n|  CPU   |  CUDA   | OpenCL  |  Metal  |   Embedded     |\n| x86_64 | NVIDIA  | AMD/NV  |  Apple  | ESP32 / STM32  |\n| ARM64  | sm_50+  | any GPU | Silicon | RISC-V / WASM  |\n| RISC-V |         |         |         | 
Cortex-M       |\n+--------+---------+---------+---------+----------------+\n```\n\n## Examples\n\n| Category | Description | Link |\n|----------|-------------|------|\n| **CPU** | Core ECC, ECDSA, Schnorr, BIP-32, Taproot, Pedersen | [examples/](examples/) |\n| **CUDA** | GPU benchmark signing kernels, batch verify, FROST, device management (production secret-key signing uses CPU CT layer) | [examples/](examples/) |\n| **OpenCL** | Cross-vendor GPU compute | [examples/](examples/) |\n| **Metal** | Apple Silicon GPU acceleration | [examples/](examples/) |\n| **Multi-language** | C, Python, Rust, Node.js, Go, Java binding examples | [examples/README.md](examples/README.md) |\n| **Embedded** | ESP32-S3, STM32 platform ports | [examples/esp32_test/](examples/esp32_test/) |\n\n## Use Cases\n\n- **Blockchain infrastructure** -- high-throughput transaction validation and signing pipelines (secret-key signing runs on the CPU CT layer; batch verification scales on GPU)\n- **Signature verification at scale** -- batch verify millions of signatures per second on GPU\n- **Cryptographic research** -- independent secp256k1 implementation with full source access\n- **Zero-knowledge pipelines** -- Pedersen commitments, Bulletproofs, DLEQ proofs\n- **Embedded cryptographic systems** -- hardware wallets, IoT devices, microcontrollers\n- **Key scanning \u0026 address generation** -- BIP-352 Silent Payments, vanity address mining\n\n\u003e Star the repository if you find it useful!\n\n---\n\n## Security \u0026 Vulnerability Reporting\n\n**Report vulnerabilities** via [GitHub Security Advisories](https://github.com/shrec/UltrafastSecp256k1/security/advisories/new) or email [payysoon@gmail.com](mailto:payysoon@gmail.com).\nFor production cryptographic systems, perform your own risk review, review the current guarantees in [SUPPORTED_GUARANTEES.md](include/ufsecp/SUPPORTED_GUARANTEES.md), and apply the assurance level appropriate to your deployment.\n\nFor the full audit infrastructure 
breakdown (≈600K itemized assertions, block-based CAAS gates, formal CT verification pipelines, self-audit document index), see the [Engineering Quality \u0026 Self-Audit Culture](#engineering-quality--self-audit-culture) section above and [WHY_ULTRAFASTSECP256K1.md](docs/WHY_ULTRAFASTSECP256K1.md).\n\n\u003e **Sponsors / funding partners:** see the \"Support the Project\" section at the bottom of this README.\n\n---\n\n## secp256k1 Feature Overview\n\nFeatures are organized into **maturity tiers** (see [SUPPORTED_GUARANTEES.md](include/ufsecp/SUPPORTED_GUARANTEES.md) for detailed guarantees):\n\n| Tier | Category | Component | Status |\n|------|----------|-----------|--------|\n| **1 -- Core** | Field / Scalar / Point | GLV, Precompute, Batch Inverse | [OK] |\n| **1 -- Core** | Assembly | x64 MASM/GAS, BMI2/ADX, ARM64, RISC-V RV64GC | [OK] |\n| **1 -- Core** | SIMD | AVX2/AVX-512 batch ops, Montgomery batch inverse | [OK] |\n| **1 -- Core** | Constant-Time | CT field/scalar/point -- no secret-dependent branches | [OK] |\n| **1 -- Core** | ECDSA | Sign/Verify, RFC 6979, DER/Compact, low-S, Recovery | [OK] |\n| **1 -- Core** | Schnorr | BIP-340 sign/verify, tagged hashing, x-only pubkeys | [OK] |\n| **1 -- Core** | ECDH | Key exchange (raw, xonly, SHA-256) | [OK] |\n| **1 -- Core** | Multi-scalar | Strauss/Shamir dual-scalar multiplication | [OK] |\n| **1 -- Core** | Batch verify | ECDSA + Schnorr batch verification | [OK] |\n| **1 -- Core** | Hashing | SHA-256 (SHA-NI), SHA-512, HMAC, Keccak-256 | [OK] |\n| **1 -- Core** | C ABI | `ufsecp` stable FFI (45 exports) | [OK] |\n| **2 -- Protocol** | BIP-32/44 | HD derivation, path parsing, xprv/xpub, coin-type | [OK] |\n| **2 -- Protocol** | Taproot | BIP-341/342, tweak, Merkle tree | [OK] |\n| **2 -- Protocol** | MuSig2 | BIP-327, key aggregation, 2-round signing | [EXPERIMENTAL] |\n| **2 -- Protocol** | FROST | Threshold signatures, t-of-n | [EXPERIMENTAL] |\n| **2 -- Protocol** | Adaptor | Schnorr + ECDSA adaptor 
signatures | [OK] |\n| **2 -- Protocol** | Pedersen | Commitments, homomorphic, switch commitments | [OK] |\n| **2 -- Protocol** | ZK Proofs | Schnorr sigma, DLEQ, Bulletproof range proofs (64-bit) | [OK] |\n| **3 -- Convenience** | Address | P2PKH, P2WPKH, P2TR, Base58, Bech32/m, EIP-55 | [OK] |\n| **3 -- Convenience** | Coins | 27 blockchains, auto-dispatch | [OK] |\n| **2 -- Protocol** | BIP-352 | Silent Payments scanning pipeline (CPU + GPU) | [OK] |\n| **2 -- Protocol** | ECIES | Elliptic curve integrated encryption | [OK] |\n| -- | GPU | CUDA, Metal, OpenCL kernels | [OK] · ROCm [EXPERIMENTAL] |\n| -- | GPU C ABI | `ufsecp_gpu` -- 19 functions (16 batch ops + 3 lifecycle: ctx/device/error), 3 backends, incl. FROST, BIP-324, BIP-352 | [OK] |\n| -- | Platforms | x64, ARM64, RISC-V, ESP32, STM32, WASM, iOS, Android | [OK] |\n\n\u003e **Tier 1** = battle-tested core crypto with stable API. **Tier 2** = protocol-level features, API may evolve. **Tier 3** = convenience utilities.\n\n### BIP-340 Strict Encoding\n\nAll public API functions enforce **canonical input encoding** as required by BIP-340 and Bitcoin consensus:\n- Signatures with `r \u003e= p` or `s \u003e= n` are **rejected, not reduced**\n- Public keys with `x \u003e= p` are **rejected, not reduced**\n- Private keys must satisfy `1 \u003c= sk \u003c n`\n\nThe C ABI (`ufsecp_*`) returns distinct error codes: `UFSECP_ERR_BAD_SIG` (non-canonical signature) vs `UFSECP_ERR_VERIFY_FAIL` (valid encoding, bad math). See [docs/COMPATIBILITY.md](docs/COMPATIBILITY.md) for details.\n\n---\n\n## BIP-352 Silent Payments Scanning Benchmark\n\n### GPU Pipeline (CUDA, RTX 5060 Ti)\n\nThe full 7-stage BIP-352 scanning pipeline runs entirely on-GPU with zero CPU round-trips:\n\n1. **k×P** -- scalar multiply tweak point by scan private key\n2. **Serialize** -- compress shared secret to 33-byte SEC1\n3. **Tagged SHA-256** -- `BIP0352/SharedSecret` tagged hash\n4. **k×G** -- generator multiply by hash scalar\n5. 
**Point add** -- `spend_pubkey + output_point`\n6. **Serialize + prefix** -- compress candidate, extract upper 64 bits\n7. **Prefix match** -- compare against output prefix list\n\n| Mode | ns/op | Throughput | Notes |\n|------|-------|------------|-------|\n| GPU pipeline (GLV, w=4) | 179.2 ns | 5.58 M/s | GLV wNAF decomposition |\n| **GPU pipeline (LUT)** | **91.0 ns** | **11.00 M/s** | 64 MB precomputed 16×64K generator table |\n| GPU pipeline (LUT + pretbl) | 102.1 ns | ~9.79 M/s | Precomputed per-tweak tables |\n\n*500K tweak points per batch, 11 passes, median. Near-optimal occupancy for RTX 5060 Ti (SM 12.0, 36 SMs). ~950 billion candidates/day.*\n\n### GPU vs CPU Comparison\n\n| Platform | Full Pipeline | vs GPU (LUT) |\n|----------|--------------|-------|\n| **CUDA GPU (RTX 5060 Ti)** | **91.0 ns/op** | **baseline** |\n| x86-64 CPU (i5-14400F, GCC 14) | 24,285 ns/op | 267× slower |\n| ARM64 CPU (Cortex-A55, Clang 18) | 153,385 ns/op | 1,686× slower |\n| RISC-V 64 (SiFive U74, GCC 13) | 257,996 ns/op | 2,835× slower |\n\n### Community \u0026 Contributor Benchmarks\n\nSee **[docs/COMMUNITY_BENCHMARKS.md](docs/COMMUNITY_BENCHMARKS.md)** for all hardware results submitted by community members — including RTX 5070 Ti (Blackwell) and a standalone BIP-352 CPU comparison vs libsecp256k1.  Want to add yours? Instructions are in that file.\n\n### Real-world scanning performance (Frigate / Sparrow Wallet)\n\nIndependent benchmarks from [Sparrow Wallet's Frigate](https://github.com/sparrowwallet/frigate) — a DuckDB-based Silent Payments scanning pipeline using UltrafastSecp256k1 via [`ufsecp_scan(...)`](https://github.com/sparrowwallet/duckdb-ufsecp-extension). 
Results produced by Frigate's `benchmark.py` scanning mainnet to block 914,000.\n\n**GPU scanning (full BIP-352 pipeline, 2-year scan, 133M tweaks):**\n\n| Hardware | Backend | Time | Throughput |\n|----------|---------|------|------------|\n| 2× NVIDIA RTX 5090 | CUDA | 3.2 s | ~41.5 M/s |\n| NVIDIA RTX 5080 | CUDA | 7.7 s | ~17.3 M/s |\n| Apple M1 Pro | Metal | 3m 47s | ~584 K/s |\n\n**CPU scanning (full BIP-352 pipeline, 2-year scan, 133M tweaks):**\n\n| Hardware | CPUs | Time | Throughput |\n|----------|------|------|------------|\n| Intel Core Ultra 9 285K | 24 | 3m 50s | ~577 K/s |\n| Apple M1 Pro | 10 | 7m 47s | ~284 K/s |\n\nSource: [Frigate README — Performance](https://github.com/sparrowwallet/frigate/blob/master/README.md#performance)\n\n### CPU vs libsecp256k1 (standalone external benchmark)\n\nStandalone single-threaded benchmark by [@craigraw](https://github.com/craigraw) ([bench_bip352](https://github.com/craigraw/bench_bip352)) — full results in [docs/COMMUNITY_BENCHMARKS.md](docs/COMMUNITY_BENCHMARKS.md). 
Thank you for the contribution!\n\n**Full pipeline** (10K points, 11 passes, median, GCC 12.4, `-O3 -march=native`, `USE_ASM_X86_64=1`):\n\n| Backend | Median | ns/op | Ratio |\n|---------|--------|-------|-------|\n| libsecp256k1 | 545.2 ms | 54,519 ns | 1.00x |\n| **UltrafastSecp256k1** | **456.1 ms** | **45,615 ns** | **1.20x faster** |\n\n**Per-operation breakdown** (1K points, 11 passes, median):\n\n| Operation | libsecp256k1 | UltrafastSecp256k1 | Ratio |\n|-----------|-------------|-------------------|-------|\n| k\\*P (scalar mul) | 37,975 ns | 26,460 ns | 1.44x faster |\n| Serialize compressed (1st) | 36 ns | 15 ns | 2.4x faster |\n| Tagged SHA-256 ‡ | 744 ns | 65 ns | 11.4x faster (diagnostic) |\n| k\\*G (generator mul) | 17,460 ns | 8,559 ns | 2.04x faster |\n| Point addition | 2,250 ns | 2,457 ns | 0.92x |\n| Serialize compressed (2nd) | 23 ns | 21 ns | 1.1x faster |\n\n\u003e **Note:** Point addition is slightly slower because both inputs have Z=1 (affine), so UltrafastSecp256k1 uses direct affine addition with a field inversion to return an affine result -- this eliminates the separate inversion in serialization.\n\u003e\n\u003e **‡ Tagged SHA-256 — diagnostic only:** this ratio is environment-dependent. libsecp256k1's SHA-256 throughput depends on whether the comparison build enables SHA-NI / hardware SHA extensions and its compiler flags; on a SHA-NI-enabled libsecp build the gap narrows substantially. 
Treat this row as a diagnostic of our tagged-hash path, not a portable \"faster than libsecp\" claim.\n\n---\n\n## 60-Second Quickstart\n\nGet a working selftest in under a minute:\n\n**Option A -- Linux (apt)**\n```bash\nsudo apt install libufsecp4\nufsecp_selftest          # Expected: \"OK (version 4.1.x, backend CPU)\"\n```\n\n**Option B -- npm (any OS)**\n```bash\nnpm i ufsecp\nnode -e \"require('ufsecp').selftest()\"   # Expected: \"OK\"\n```\n\n**Option C -- Python (any OS)**\n```bash\npip install ufsecp\npython -c \"import ufsecp; ufsecp.selftest()\"  # Expected: \"OK\"\n```\n\n**Option D -- Build from source**\n```bash\ngit clone https://github.com/shrec/UltrafastSecp256k1.git \u0026\u0026 cd UltrafastSecp256k1\n\n# Recommended: canonical build under out/release\npython3 ci/configure_build.py release\ncmake --build out/release -j\n\n# Or classic one-liner:\ncmake -S . -B out/release -G Ninja -DCMAKE_BUILD_TYPE=Release \u0026\u0026 cmake --build out/release -j\n./out/release/selftest    # Expected: \"ALL TESTS PASSED\"\n```\n\n---\n\n## Platform Support Matrix\n\n| Target | Backend | Install / Entry Point | Status |\n|--------|---------|----------------------|--------|\n| **Linux x64** | CPU | `apt install libufsecp4` | [OK] Stable |\n| **Windows x64** | CPU | NuGet `UltrafastSecp256k1` / [Release .zip](https://github.com/shrec/UltrafastSecp256k1/releases) | [OK] Stable |\n| **macOS (x64/ARM64)** | CPU + Metal | `brew install ufsecp` / build from source | [OK] Stable |\n| **Android ARM64** | CPU | `implementation 'io.github.shrec:ufsecp'` (Maven) | [OK] Stable |\n| **iOS ARM64** | CPU | Swift Package / CocoaPods / XCFramework | [OK] Stable — **⚠️ SPM/CocoaPods builds have CT guards disabled (verification only)** |\n| **Browser / Node.js** | WASM | `npm i ufsecp` | [~] Experimental — CT evidence incomplete |\n| **ESP32-S3 / ESP32** | CPU | PlatformIO / IDF component | [OK] Tested |\n| **ESP32-C6** | CPU (RISC-V RV32) | PlatformIO / IDF component | [OK] Tested 
|\n| **ESP32-P4** | CPU (RISC-V HP dual-core) | PlatformIO / IDF component | [OK] Tested |\n| **STM32 (Cortex-M)** | CPU | CMake cross-compile | [OK] Tested |\n| **NVIDIA GPU** | CUDA 12+ | Build with `-DSECP256K1_BUILD_CUDA=ON` | [OK] Stable |\n| **AMD GPU** | ROCm/HIP | Build with `-DSECP256K1_BUILD_ROCM=ON` | [!] Beta |\n| **Apple GPU** | Metal | Build with Metal backend | [..] Experimental (discovery only) |\n| **Any GPU** | OpenCL | Build with `-DSECP256K1_BUILD_OPENCL=ON` | [OK] Full (6/6 ops) |\n| **RISC-V (RV64GC)** | CPU | Cross-compile | [OK] Tested |\n\n---\n\n## Installation\n\n### Linux (APT -- Debian / Ubuntu)\n\n```bash\n# Add repository\ncurl -fsSL https://shrec.github.io/UltrafastSecp256k1/apt/KEY.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/ultrafastsecp256k1.gpg\necho \"deb [signed-by=/etc/apt/keyrings/ultrafastsecp256k1.gpg] https://shrec.github.io/UltrafastSecp256k1/apt stable main\" \\\n  | sudo tee /etc/apt/sources.list.d/ultrafastsecp256k1.list\nsudo apt update\n\n# Install (runtime only)\nsudo apt install libufsecp4\n\n# Install (development -- headers, static lib, cmake/pkgconfig)\nsudo apt install libufsecp-dev\n```\n\n### Linux (RPM -- Fedora / RHEL)\n\n```bash\n# Download from GitHub Releases\ncurl -LO https://github.com/shrec/UltrafastSecp256k1/releases/latest/download/UltrafastSecp256k1-*.rpm\nsudo dnf install ./UltrafastSecp256k1-*.rpm\n```\n\n### Arch Linux (AUR)\n\n```bash\n# Using yay\nyay -S libufsecp\n\n# Or manually\ngit clone https://aur.archlinux.org/libufsecp.git\ncd libufsecp \u0026\u0026 makepkg -si\n```\n\n### From source (any platform)\n\n```bash\n# For development/testing, use out/release instead of the bare 'build' dir:\n# python3 ci/configure_build.py release\ncmake -S . 
-B out/release -G Ninja \\\n    -DCMAKE_BUILD_TYPE=Release \\\n    -DCMAKE_INSTALL_PREFIX=/usr \\\n    -DSECP256K1_BUILD_SHARED=ON \\\n    -DSECP256K1_INSTALL=ON \\\n    -DSECP256K1_USE_ASM=ON\ncmake --build out/release -j$(nproc)\nsudo cmake --install out/release\nsudo ldconfig\n```\n\n### Use in your CMake project\n\n```cmake\nfind_package(secp256k1-fast REQUIRED)\ntarget_link_libraries(myapp PRIVATE secp256k1::fast)\n```\n\n### Use with pkg-config\n\n```bash\ng++ myapp.cpp $(pkg-config --cflags --libs ufsecp) -o myapp\n```\n\n---\n\n## secp256k1 GPU Acceleration (CUDA / OpenCL / Metal / ROCm)\n\n\u003e **Scope note:** The GPU backends are **not part of the Bitcoin Core secondary CPU backend PR**.\n\u003e The Bitcoin Core PR targets the CPU-only library as a compile-time drop-in secp256k1 replacement.\n\u003e GPU capabilities require opt-in build flags (`-DSECP256K1_BUILD_CUDA=ON` etc.) and are outside\n\u003e the scope of consensus-critical signing paths. See the Bitcoin Core PR description for the\n\u003e exact build configuration targeted.\n\nUltrafastSecp256k1 provides full secp256k1 ECDSA + Schnorr sign/verify on GPU across four backends (CUDA, OpenCL, Metal, ROCm). As of February 2026, no other open-source library was known to the authors to cover all four backends; corrections are welcome ([open an issue](https://github.com/shrec/UltrafastSecp256k1/issues)):\n\n\u003cdetails\u003e\n\u003csummary\u003eGPU Performance tables (diagnostic — unverified against current build, not release-grade evidence. 
See \u003ccode\u003ecanonical_numbers.json\u003c/code\u003e \u003ccode\u003egpu_throughput.status\u003c/code\u003e field.)\u003c/summary\u003e\n\n| Backend | Hardware | kG/s | ECDSA Sign | ECDSA Verify | Schnorr Sign | Schnorr Verify | FROST Verify |\n|---------|----------|------|------------|--------------|--------------|----------------|-------------|\n| **CUDA** | RTX 5060 Ti | 4.59 M/s | 4.88 M/s | **4.05 M/s** | 3.66 M/s | **5.38 M/s** | **1.34 M/s** |\n| **OpenCL** | RTX 5060 Ti | 3.86 M/s | -- | 2.44 M/s* | -- | 2.82 M/s* | -- |\n| **Metal** | Apple M3 Pro | 0.33 M/s | -- | -- | -- | -- | -- |\n| **ROCm (HIP)** | AMD GPUs | Portable | -- | -- | -- | -- | -- |\n\n*CUDA 12.0, sm_86;sm_89, batch=16K signatures, measured on RTX 5060 Ti. The CUDA path uses our own hybrid GPU execution model, which improved end-to-end throughput by more than 10% during optimization. Metal 2.4, 8x32-bit Comba limbs, 18 GPU cores. (\\*) OpenCL ECDSA/Schnorr verify uses extended kernel with lazy-loaded runtime compilation.*\n\n### CUDA Core ECC Operations (Kernel-Only Throughput)\n\n| Operation | Time/Op | Throughput |\n|-----------|---------|------------|\n| Field Mul | 0.2 ns | 4,142 M/s |\n| Field Add | 0.2 ns | 4,130 M/s |\n| Field Inv | 10.2 ns | 98.35 M/s |\n| Point Add | 1.6 ns | 619 M/s |\n| Point Double | 0.8 ns | 1,282 M/s |\n| Scalar Mul (Pxk) | 225.8 ns | 4.43 M/s |\n| Generator Mul (Gxk) | 217.7 ns | 4.59 M/s |\n| Batch Inv (Montgomery) | 2.9 ns | 340 M/s |\n| Jac-\u003eAffine (per-pt) | 14.9 ns | 66.9 M/s |\n\n### GPU Signature Operations (ECDSA + Schnorr)\n\n| Operation | Time/Op | Throughput | Protocol | Δ vs prev |\n|-----------|---------|------------|----------|----------|\n| **ECDSA Sign** | **204.8 ns** | **4.88 M/s** | RFC 6979 + low-S | -- |\n| **ECDSA Verify** | **246.7 ns** | **4.05 M/s** | Shamir + GLV | **+66%** |\n| **ECDSA Sign+Recid** | **311.5 ns** | **3.21 M/s** | Recoverable (EIP-155) | -- |\n| **Schnorr Sign** | **273.4 ns** | **3.66 M/s** | BIP-340 | -- |\n| 
**Schnorr Verify** | **185.9 ns** | **5.38 M/s** | BIP-340 + GLV | **+91%** |\n| **FROST Partial Verify** | **748.9 ns** | **1.34 M/s** | t-of-n threshold | New |\n\n### CUDA vs OpenCL Comparison (RTX 5060 Ti)\n\n| Operation | CUDA | OpenCL | Winner |\n|-----------|------|--------|--------|\n| Field Mul | 0.2 ns | 0.2 ns | Tie |\n| Field Inv | 10.2 ns | 14.3 ns | **CUDA 1.40x** |\n| Point Double | 0.8 ns | 0.9 ns | **CUDA 1.13x** |\n| Point Add | 1.6 ns | 1.6 ns | Tie |\n| kG (Generator Mul) | 217.7 ns | 258.9 ns | **CUDA 1.19x** |\n| BIP352 Pipeline | 91.0 ns | 93.6 ns | **CUDA 1.03x** |\n\n*Benchmarks: 2026-02-14, Linux x86_64, NVIDIA Driver 580.126.09. Both kernel-only (no buffer allocation/copy overhead).*\n\n### Apple Metal (M3 Pro) -- Kernel-Only\n\n| Operation | Time/Op | Throughput |\n|-----------|---------|------------|\n| Field Mul | 1.9 ns | 527 M/s |\n| Field Inv | 106.4 ns | 9.40 M/s |\n| Point Add | 10.1 ns | 98.6 M/s |\n| Point Double | 5.1 ns | 196 M/s |\n| Scalar Mul (Pxk) | 2.94 us | 0.34 M/s |\n| Generator Mul (Gxk) | 3.00 us | 0.33 M/s |\n\n*Metal 2.4, 8x32-bit Comba limbs, Apple M3 Pro (18 GPU cores, Unified Memory 18 GB)*\n\n\u003c/details\u003e\n\n---\n\n## secp256k1 ECDSA \u0026 Schnorr Signatures (BIP-340, RFC 6979)\n\nFull signature support across CPU and GPU:\n\n- **ECDSA**: RFC 6979 deterministic nonces, low-S normalization, DER/Compact encoding, public key recovery (recid)\n- **Schnorr**: BIP-340 compliant -- tagged hashing, x-only public keys\n- **Batch verification**: ECDSA and Schnorr batch verify\n- **Multi-scalar**: Shamir's trick (k_1xG + k_2xQ) for fast verification\n\n### CPU Signature Benchmarks (x86-64, Clang 19, AVX2, Release) [archived — see docs/bench_unified_2026-05-30_gcc14_x86-64.json for current GCC 14.2.0 numbers]\n\n| Operation | Time | Throughput |\n|-----------|------:|----------:|\n| ECDSA Sign (RFC 6979) | 8.5 us | 118,000 op/s |\n| ECDSA Verify | 23.6 us | 42,400 op/s |\n| Schnorr Sign (BIP-340) | 6.8 us | 
146,000 op/s |\n| Schnorr Verify (BIP-340) | 24.0 us | 41,600 op/s |\n| Key Generation (CT) | 9.5 us | 105,500 op/s |\n| Key Generation (fast) | 5.5 us | 182,000 op/s |\n| ECDH | 23.9 us | 41,800 op/s |\n\n*All rows above are the FAST (variable-time) path — NOT the production CT signing path. Schnorr sign is ~25% faster than ECDSA sign due to simpler nonce derivation. Measured single-core, pinned, Clang 19 — **archived 2026-02-21, NOT comparable to the current GCC 14.2.0 canonical data** below: [docs/bench_unified_2026-05-30_gcc14_x86-64.json](docs/bench_unified_2026-05-30_gcc14_x86-64.json).*\n\n---\n\n## Constant-Time secp256k1 (Side-Channel Resistance)\n\nThe `ct::` namespace provides constant-time operations for secret-key material -- no secret-dependent branches or memory access patterns:\n\n| Operation | FAST | CT | CT overhead |\n|-----------|-----:|---:|---:|\n| Scalar Mul (k×P) | 35,593 ns | 39,056 ns | 1.10× |\n| Generator Mul (k×G) | 9,200 ns | 15,347 ns | 1.67× |\n| Scalar Inverse | — | 2,503 ns | CT-only |\n| Point Add (complete) | — | 400 ns | CT-only |\n| ECDSA sign (end-to-end) | 22,316 ns | 22,501 ns | **0.83%** |\n| Schnorr sign (end-to-end) | 17,976 ns | 17,953 ns | **≈0.00%** |\n\n*GCC 14.2.0, Intel i5-14400F, turbo disabled, CPU-pinned. 
Source: [`docs/bench_unified_2026-05-30_gcc14_x86-64.json`](docs/bench_unified_2026-05-30_gcc14_x86-64.json)*\n\n**CT layer provides:** `ct::field_mul`, `ct::field_inv`, `ct::scalar_mul`, `ct::point_add_complete`, `ct::point_dbl`\n\n**Use the CT layer for**: private key operations, signing, nonce generation, ECDH.\n**Use the FAST layer for**: verification, public key derivation, batch processing, benchmarks.\n\nSee [THREAT_MODEL.md](docs/THREAT_MODEL.md) for a full layer-by-layer risk assessment.\n\n### CT Evidence \u0026 Methodology\n\n| Evidence | Scope | Status |\n|----------|-------|--------|\n| **No secret-dependent branches** | All `ct::` functions | [OK] Enforced by design, verified via Clang-Tidy checks |\n| **No secret-dependent memory access** | All `ct::` table lookups use constant-index cmov | [OK] |\n| **ASan + UBSan CI** | Every push -- catches undefined behavior in CT paths | [OK] CI |\n| **Timing tests (dudect)** | CPU field/scalar ops | [OK] Implemented in CI + manual deep-assurance + native ARM64 |\n| **Deterministic CT verification** | `ct-verif` LLVM + Valgrind CT | [OK] Implemented |\n\n**Assumptions:** CT guarantees depend on compiler not introducing secret-dependent branches during optimization. Builds use `-O2` with Clang; MSVC may require additional flags. 
Micro-architectural side channels (Spectre, power analysis) are outside current scope -- see [THREAT_MODEL.md](docs/THREAT_MODEL.md).\n\n---\n\n## Zero-Knowledge Proofs (Schnorr Sigma, DLEQ, Bulletproofs)\n\nUltrafastSecp256k1 provides ZK proof primitives over the secp256k1 curve:\n\n| Proof Type | Prove | Verify | Proof Size | Use Cases |\n|------------|-------|--------|------------|-----------|\n| **Knowledge Proof** | 20.3 us | 21.8 us | 64 bytes | Prove knowledge of discrete log (x: P = x*G) |\n| **DLEQ Proof** | 40.0 us | 56.4 us | 64 bytes | Prove log_G(P) == log_H(Q) -- VRFs, adaptor sigs, atomic swaps |\n| **Bulletproof Range** | 13,467 us | 2,634 us | ~620 bytes | Prove committed value in [0, 2^64) -- Confidential Transactions |\n\n**Security model:**\n- All proving operations use the **CT layer** (constant-time, side-channel resistant)\n- All verification uses the **FAST layer** (variable-time; public inputs only, no secret material)\n- Non-interactive via **Fiat-Shamir** (tagged SHA-256)\n- Nothing-up-my-sleeve generators for Bulletproofs (no trusted setup)\n\n**API:** `#include \u003csecp256k1/zk.hpp\u003e` -- namespace `secp256k1::zk`\n\n```cpp\n// Knowledge proof: prove you know x such that P = x*G\nauto proof = zk::knowledge_prove(secret, pubkey, msg, aux_rand);\nbool knowledge_ok = zk::knowledge_verify(proof, pubkey, msg);\n\n// DLEQ: prove log_G(P) == log_H(Q)\nauto dleq = zk::dleq_prove(secret, G, H, P, Q, aux_rand);\nbool dleq_ok = zk::dleq_verify(dleq, G, H, P, Q);\n\n// Bulletproof range proof: prove committed value in [0, 2^64)\nauto rp = zk::range_prove(value, blinding, commitment, aux_rand);\nbool range_ok = zk::range_verify(commitment, rp);\n```\n\n*Benchmarks: i5-14400F, 11 passes, pinned core, median. 
See [docs/BENCHMARKS.md](docs/BENCHMARKS.md).*\n\n---\n\n## secp256k1 Benchmarks -- Cross-Platform Comparison\n\n### CPU: x86-64 vs ARM64 vs RISC-V\n\n| Operation | x86-64 (Clang 19, AVX2) [archived] | ARM64 (Cortex-A76) | RISC-V (Milk-V Mars) |\n|-----------|-------------------------:|--------------------:|---------------------:|\n| Field Mul | 17 ns | 74 ns | 95 ns |\n| Field Square | 14 ns | 50 ns | 70 ns |\n| Field Add | 1 ns | 8 ns | 11 ns |\n| Field Inverse | 1 us | 2 us | 4 us |\n| Point Add | 159 ns | 992 ns | 1 us |\n| Generator Mul (kxG) | 5 us | 14 us | 33 us |\n| Scalar Mul (kxP) | 25 us | 131 us | 154 us |\n\n### GPU: CUDA vs OpenCL vs Metal\n\n| Operation | CUDA (RTX 5060 Ti) | OpenCL (RTX 5060 Ti) | Metal (M3 Pro) |\n|-----------|--------------------:|---------------------:|---------------:|\n| Field Mul | 0.2 ns | 0.2 ns | 1.9 ns |\n| Field Inv | 10.2 ns | 14.3 ns | 106.4 ns |\n| Point Add | 1.6 ns | 1.6 ns | 10.1 ns |\n| Generator Mul (Gxk) | 217.7 ns | 258.9 ns | 3.00 us |\n\n### Embedded: ESP32-S3 vs ESP32 vs STM32\n\n| Operation | ESP32-S3 LX7 (240 MHz) | ESP32 LX6 (240 MHz) | STM32F103 (72 MHz) |\n|-----------|-------------------:|-------------------:|-------------------:|\n| Field Mul | 6,105 ns | 6,993 ns | 15,331 ns |\n| Field Square | 5,020 ns | 6,247 ns | 12,083 ns |\n| Field Add | 850 ns | 985 ns | 4,139 ns |\n| Field Inv | 2,524 us | 609 us | 1,645 us |\n| **Fast** Scalar x G | 5,226 us | 6,203 us | 37,982 us |\n| **CT** Scalar x G | 15,527 us | -- | -- |\n| **CT** Generator x k | 4,951 us | -- | -- |\n\n### Field Representation: 5x52 vs 4x64\n\n| Operation | 4x64 | 5x52 | Speedup |\n|-----------|------:|------:|--------:|\n| Multiplication | 42 ns | 15 ns | **2.76x** |\n| Squaring | 31 ns | 13 ns | **2.44x** |\n| Addition | 4.3 ns | 1.6 ns | **2.69x** |\n| Add chain (32 ops) | 286 ns | 57 ns | **5.01x** |\n\n*5x52 uses `__int128` lazy reduction -- ideal for 64-bit platforms.*\n\nFor full benchmark results, see 
[docs/BENCHMARKS.md](docs/BENCHMARKS.md).\n\n---\n\n## secp256k1 on Embedded (ESP32 / STM32 / ARM Cortex-M)\n\nUltrafastSecp256k1 runs on resource-constrained microcontrollers with **portable C++ (no `__int128`, no assembly required)**:\n\n- **ESP32-S3** (Xtensa LX7 @ 240 MHz): Fast scalar x G in 5.2 ms, **CT generator x k in 4.9 ms**\n- **ESP32-PICO-D4** (Xtensa LX6 @ 240 MHz): Scalar x G in 6.2 ms, CT layer available (44.8 ms CT)\n- **ESP32-C6** (RISC-V RV32IMAC @ 160 MHz): Scalar x G in ~14 ms, CT layer available\n- **ESP32-P4** (RISC-V HP dual-core @ 400 MHz): Scalar x G in ~3 ms, CT layer available\n- **STM32F103** (ARM Cortex-M3 @ 72 MHz): Scalar x G in 38 ms with ARM inline assembly (UMULL/ADDS/ADCS)\n- **Android ARM64** (RK3588, Cortex-A76 @ 2.256 GHz): Scalar x G in 14 us, Scalar x P in 131 us, ECDSA Sign 30 us\n\nAll 37 library tests pass on every embedded target. See [examples/esp32_test/](examples/esp32_test/) and [examples/stm32_test/](examples/stm32_test/).\n\n### Porting to New Platforms\n\nSee [PORTING.md](docs/PORTING.md) for a step-by-step checklist to add new CPU architectures, embedded targets, or GPU backends.\n\n---\n\n## WASM secp256k1 (Browser \u0026 Node.js)\n\nWebAssembly build via Emscripten -- runs secp256k1 in any modern browser or Node.js:\n\n```bash\n./ci/build_wasm.sh        # -\u003e build/wasm/dist/\n```\n\nOutput: `secp256k1_wasm.wasm` + `secp256k1.mjs` (ES6 module with TypeScript declarations).\nSee [wasm/README.md](bindings/wasm/README.md) for JavaScript/TypeScript integration.\n\n---\n\n## secp256k1 Batch Modular Inverse (Montgomery Trick)\n\nAll backends include **batch modular inversion** -- a critical building block for Jacobian-\u003eAffine conversion:\n\n| Backend | Function | Notes |\n|---------|----------|-------|\n| **CPU** | `fe_batch_inverse(FieldElement*, size_t)` | Montgomery trick with scratch buffer |\n| **CUDA** | `batch_inverse_montgomery` / `batch_inverse_kernel` | GPU Montgomery trick kernel |\n| **Metal** | 
`batch_inverse` | Chunked parallel threadgroups |\n| **OpenCL** | Inline PTX inverse | Batch via host orchestration |\n\n**Algorithm**: Montgomery batch inverse computes N field inversions using only **1 modular inversion + 3(N-1) multiplications**, amortizing the expensive inversion across the entire batch.\n\nFor N=1024: ~500x cheaper than individual inversions. A single field inversion costs ~3.5 us (Fermat), while batch amortizes to ~7 ns per element.\n\n### Mixed Addition (Jacobian + Affine)\n\nBranchless mixed addition (`add_mixed_inplace`) uses the **madd-2007-bl** formula: **7M + 4S** (vs 11M + 5S for full Jacobian add).\n\n```cpp\n#include \u003csecp256k1/point.hpp\u003e\nusing namespace secp256k1::fast;\n\nPoint P = Point::generator();\nFieldElement gx = P.x(), gy = P.y();\n\n// Compute 2G using mixed add (7M + 4S)\nPoint Q = Point::generator();\nQ.add_mixed_inplace(gx, gy);  // Q = G + G = 2G\n\n// Batch walk: P, P+G, P+2G, ...\nPoint walker = P;\nfor (int i = 0; i \u003c 1000; ++i) {\n    walker.add_mixed_inplace(gx, gy);  // walker += G each step\n}\n```\n\n### GPU Pattern: H-Product Serial Inversion\n\nProduction GPU apps use a memory-efficient variant: instead of storing full Z coordinates, `jacobian_add_mixed_h` returns **H = U2 - X1** separately. 
Since Z_k = Z_0 * H_0 * H_1 * … * H_{k-1}, the entire Z chain is invertible from H values + initial Z_0.\n\n**Cost**: 1 Fermat inversion + 2N multiplications per thread (vs N Fermat inversions naively).\n\n\u003e See `apps/secp256k1_search_gpu_only/gpu_only.cu` (step kernel) + `unified_split.cuh` (batch inversion kernel)\n\n---\n\n## secp256k1 Stable C ABI (`ufsecp`) -- FFI Bindings\n\nStarting with **v3.4.0**, UltrafastSecp256k1 ships a stable C ABI -- `ufsecp` -- designed for FFI bindings (C#, Python, Rust, Go, Java, Node.js, Dart, React Native, PHP, Ruby, etc.):\n\n```\n+--------------------------------------------------+\n|                  Your Application                |\n|          (C, C#, Python, Go, Rust, …)            |\n+------------------+-------------------------------+\n                   |  ufsecp C ABI (45 functions)\n+------------------▼-------------------------------+\n|           ufsecp.dll / libufsecp.so              |\n|  Opaque ctx  |  Error model  |  ABI versioning   |\n+--------------+---------------+-------------------+\n|   FAST layer (variable-time public ops)          |\n+--------------------------------------------------+\n|   CT layer (constant-time secret-key ops)        |\n+--------------------------------------------------+\n```\n\n**Default behavior:**\n- **C ABI (`ufsecp`)**: Defaults to safe behavior -- all secret-key operations (sign, derive, ECDH) use CT internally. 
No configuration needed.\n- **C++ API**: Exposes both `fast::` and `ct::` namespaces -- the developer chooses explicitly per call site.\n\n### Quick Start (C)\n\n```c\n#include \"ufsecp.h\"\n\nufsecp_ctx* ctx = NULL;\nufsecp_ctx_create(\u0026ctx);\n\n// Generate keypair\nunsigned char seckey[32], pubkey[33];\nufsecp_keygen(ctx, seckey, pubkey);\n\n// ECDSA sign\nunsigned char msg[32] = { /* SHA-256 hash */ };\nunsigned char sig[64];\nufsecp_ecdsa_sign(ctx, seckey, msg, sig);\n\n// Verify\nint valid = 0;\nufsecp_ecdsa_verify(ctx, pubkey, 33, msg, sig, \u0026valid);\n\nufsecp_ctx_destroy(ctx);\n```\n\n### GPU C ABI (`ufsecp_gpu`)\n\nStarting with **v3.3.0**, the GPU layer is fully accessible from any FFI language via `ufsecp_gpu.h`:\n\n| Category | Functions |\n|----------|-----------|\n| **Discovery** | `gpu_backend_count`, `gpu_backend_name`, `gpu_is_available`, `gpu_device_count`, `gpu_device_info` |\n| **Lifecycle** | `gpu_ctx_create`, `gpu_ctx_destroy`, `gpu_last_error`, `gpu_last_error_msg`, `gpu_error_str` |\n| **Batch Ops** | `gpu_generator_mul_batch`, `gpu_ecdsa_verify_batch`, `gpu_schnorr_verify_batch`, `gpu_ecdh_batch`, `gpu_hash160_pubkey_batch`, `gpu_msm`, `gpu_frost_verify_partial_batch`, `gpu_ecrecover_batch` |\n\n| Batch Operation | CUDA | OpenCL | Metal |\n|----------------|------|--------|-------|\n| `generator_mul_batch` | [OK] | [OK] | [OK] |\n| `ecdsa_verify_batch` | [OK] | [OK] | [OK] |\n| `schnorr_verify_batch` | [OK] | [OK] | [OK] |\n| `ecdh_batch` | [OK] | [OK] | [OK] |\n| `hash160_pubkey_batch` | [OK] | [OK] | [OK] |\n| `msm` | [OK] | [OK] | [OK] |\n| `frost_verify_partial_batch` | [OK] | [OK] | [OK] |\n| `ecrecover_batch` | [OK] | [..] temporary stub | [..] 
temporary stub |\n\nSee [ufsecp_gpu.h](include/ufsecp/ufsecp_gpu.h) and [GPU Validation Matrix](docs/GPU_VALIDATION_MATRIX.md) for details.\n\n### CPU C ABI Coverage\n\n| Category | Functions |\n|----------|-----------|\n| **Context** | `ctx_create`, `ctx_destroy`, `selftest`, `last_error` |\n| **Keys** | `keygen`, `seckey_verify`, `pubkey_create`, `pubkey_parse`, `pubkey_serialize` |\n| **ECDSA** | `ecdsa_sign`, `ecdsa_sign_batch`, `ecdsa_verify`, `ecdsa_sign_der`, `ecdsa_verify_der`, `ecdsa_recover` |\n| **Schnorr** | `schnorr_sign`, `schnorr_sign_batch`, `schnorr_verify` |\n| **SHA-256** | `sha256` (SHA-NI accelerated) |\n| **ECDH** | `ecdh_compressed`, `ecdh_xonly`, `ecdh_raw` |\n| **BIP-32** | `bip32_from_seed`, `bip32_derive_child`, `bip32_serialize` |\n| **Address** | `address_p2pkh`, `address_p2wpkh`, `address_p2tr` |\n| **WIF** | `wif_encode`, `wif_decode` |\n| **Tweak** | `pubkey_tweak_add`, `pubkey_tweak_mul` |\n| **Version** | `version`, `abi_version`, `version_string` |\n\nSee [SUPPORTED_GUARANTEES.md](include/ufsecp/SUPPORTED_GUARANTEES.md) for Tier 1/2/3 stability guarantees.\n\n---\n\n## secp256k1 Use Cases\n\n- **Transaction Signing \u0026 Verification** -- CPU constant-time signing + GPU-accelerated batch verification across Bitcoin, Ethereum, and 25+ blockchains\n- **Batch Signature Verification** -- verify thousands of ECDSA/Schnorr signatures per second for block validation\n- **HD Wallet Key Derivation** -- BIP-32/44 hierarchical deterministic derivation with 27-coin address generation\n- **Embedded IoT Signing** -- ESP32 and STM32 on-device key generation and transaction signing\n- **High-Throughput Indexing** -- GPU-accelerated public key derivation for address indexing services\n- **Zero-Knowledge Proof Systems** -- Pedersen commitments, adaptor signatures for ZK protocols\n- **Multi-Party Computation** -- MuSig2 (BIP-327) and FROST threshold signing\n- **Cross-Platform Cryptographic Services** -- single codebase across server (CUDA), 
desktop (OpenCL/Metal), mobile (ARM64), browser (WASM), and embedded (ESP32/STM32)\n- **Cryptographic Research \u0026 Benchmarking** -- field/group operation microbenchmarks, algorithm variant comparison\n\n\u003e ### Testers Wanted\n\u003e We need community testers for platforms we cannot fully validate in CI:\n\u003e - **iOS** -- Build \u0026 run on real iPhone/iPad hardware with Xcode\n\u003e - **AMD GPU (ROCm/HIP)** -- Test on AMD Radeon RX / Instinct GPUs\n\u003e\n\u003e [Open an issue](https://github.com/shrec/UltrafastSecp256k1/issues) with your results!\n\n---\n\n## Building secp256k1 from Source (CMake)\n\n### Prerequisites\n\n- CMake 3.18+\n- C++20 compiler (GCC 11+, Clang/LLVM 15+, MSVC 2022+)\n- CUDA Toolkit 12.0+ (optional, for GPU)\n- Ninja (recommended)\n\n### CPU-Only Build\n\n```bash\ncmake -S . -B out/release -G Ninja -DCMAKE_BUILD_TYPE=Release\ncmake --build out/release -j\n```\n\n### With CUDA GPU Support\n\n```bash\ncmake -S . -B out/release -G Ninja \\\n  -DCMAKE_BUILD_TYPE=Release \\\n  -DSECP256K1_BUILD_CUDA=ON\ncmake --build out/release -j\n```\n\n### WebAssembly (Emscripten)\n\n```bash\n./ci/build_wasm.sh        # -\u003e build/wasm/dist/\n```\n\n### iOS (XCFramework)\n\n```bash\n./ci/build_xcframework.sh  # -\u003e build/xcframework/output/\n```\n\nUniversal XCFramework (arm64 device + arm64 simulator). 
Also available via **Swift Package Manager** and **CocoaPods**.\n\n### Local ARM64 / RISC-V QEMU Smoke\n\n```bash\n# ARM64 cross-build + QEMU smoke\nbash ./ci/run-qemu-smoke.sh arm64\n\n# RISC-V cross-build + QEMU smoke\nbash ./ci/run-qemu-smoke.sh riscv64\n\n# Both architectures\nbash ./ci/run-qemu-smoke.sh all\n```\n\nThis local helper runs the same cross-arch smoke surface now used in CI:\n`run_selftest smoke`, `test_bip324_standalone`, `bench_kP`, and `bench_bip324`.\nInstall the corresponding cross toolchain, libc sysroot, `qemu-user-static`, and `ninja-build` first.\n\nIf you prefer the existing local CI entry point, the same coverage is also available as:\n\n```bash\nbash ./ci/local-ci.sh --job qemu-smoke\n\n# Optional: limit to one architecture\nSECP256K1_QEMU_SMOKE_TARGET=arm64 bash ./ci/local-ci.sh --job qemu-smoke\nSECP256K1_QEMU_SMOKE_TARGET=riscv64 bash ./ci/local-ci.sh --job qemu-smoke\n```\n\n### Build Options\n\n| Option | Default | Description |\n|--------|---------|-------------|\n| `SECP256K1_USE_ASM` | ON | Assembly optimizations (x64/ARM64/RISC-V) |\n| `SECP256K1_BUILD_CUDA` | OFF | CUDA GPU support |\n| `SECP256K1_BUILD_OPENCL` | OFF | OpenCL GPU support |\n| `SECP256K1_BUILD_ROCM` | OFF | ROCm/HIP GPU support (AMD) |\n| `SECP256K1_BUILD_TESTS` | ON | Test suite |\n| `SECP256K1_BUILD_BENCH` | ON | Benchmarks |\n| `SECP256K1_GLV_WINDOW_WIDTH` | platform | GLV window width (4-7); default 5 on x86/ARM/RISC-V, 4 on ESP32/WASM |\n| `SECP256K1_RISCV_USE_VECTOR` | ON | RVV vector extension (RISC-V) |\n\nFor detailed build instructions, see [docs/BUILDING.md](docs/BUILDING.md).\n\n---\n\n## secp256k1 Quick Start (C++ Examples)\n\n### Basic Point Operations\n\n```cpp\n#include \u003csecp256k1/field.hpp\u003e\n#include \u003csecp256k1/point.hpp\u003e\n#include \u003csecp256k1/scalar.hpp\u003e\n#include \u003ciostream\u003e\n\nusing namespace secp256k1::fast;\n\nint main() {\n    // Public key derivation: private_key x G = public_key\n    auto generator 
= Point::generator();\n    auto private_key = Scalar::from_hex(\n        \"E9873D79C6D87DC0FB6A5778633389F4453213303DA61F20BD67FC233AA33262\"\n    );\n    auto public_key = generator * private_key;\n\n    std::cout \u003c\u003c \"Public Key X: \" \u003c\u003c public_key.x().to_hex() \u003c\u003c \"\\n\";\n    std::cout \u003c\u003c \"Public Key Y: \" \u003c\u003c public_key.y().to_hex() \u003c\u003c \"\\n\";\n    return 0;\n}\n```\n\n```bash\ng++ -std=c++20 example.cpp -lufsecp -o example \u0026\u0026 ./example\n```\n\n### GPU Batch Multiplication\n\n```cpp\n#include \u003csecp256k1_cuda/batch_operations.hpp\u003e\n#include \u003csecp256k1/point.hpp\u003e\n#include \u003csecp256k1/scalar.hpp\u003e\n#include \u003ciostream\u003e\n#include \u003cvector\u003e\n\nusing namespace secp256k1::fast;\n\nint main() {\n    // One million (base point, scalar) pairs; every base point is G here\n    std::vector\u003cPoint\u003e base_points(1'000'000, Point::generator());\n    std::vector\u003cScalar\u003e scalars(1'000'000);\n    for (auto\u0026 s : scalars) s = Scalar::random();\n\n    cuda::BatchConfig config{.device_id = 0, .threads_per_block = 256, .streams = 4};\n    auto results = cuda::batch_multiply(base_points, scalars, config);\n\n    std::cout \u003c\u003c \"Processed \" \u003c\u003c results.size() \u003c\u003c \" point multiplications\\n\";\n    return 0;\n}\n```\n\n---\n\n## secp256k1 Security Model (FAST vs CT)\n\nTwo security profiles are **always active** -- no flag-based selection:\n\n### FAST Profile (Default)\n\n- Maximum throughput, variable-time algorithms\n- Use for: verification, batch processing, public key derivation, benchmarking\n- [!] 
**Not safe for secret key operations** -- timing side-channels possible\n\n### CT / Hardened Profile (`ct::` namespace)\n\n- Constant-time arithmetic -- no secret-dependent branches or memory access\n- ~1.1–1.9× performance penalty vs FAST for primitive operations (see CT overhead table in docs/BENCHMARKS.md; release-grade measurement: `docs/bench_unified_2026-05-30_gcc14_x86-64.json`, CT overhead table, GCC 14.2.0)\n- Use for: signing, private key handling, nonce generation, ECDH\n\n**Choose the appropriate profile for your use case.** Using FAST with secret data is a security vulnerability.\nSee [THREAT_MODEL.md](docs/THREAT_MODEL.md) for full details.\n\n---\n\n## secp256k1 Supported Coins (27 Blockchains)\n\n\u003cdetails\u003e\n\u003csummary\u003eSupported Coins (out of scope for Bitcoin Core CPU backend review)\u003c/summary\u003e\n\n| # | Coin | Ticker | Address Types | BIP-44 |\n|---|------|--------|---------------|--------|\n| 1 | **Bitcoin** | BTC | P2PKH, P2WPKH (Bech32), P2TR (Bech32m) | m/86'/0' |\n| 2 | **Ethereum** | ETH | EIP-55 Checksum | m/44'/60' |\n| 3 | **Litecoin** | LTC | P2PKH, P2WPKH | m/84'/2' |\n| 4 | **Dogecoin** | DOGE | P2PKH | m/44'/3' |\n| 5 | **Bitcoin Cash** | BCH | P2PKH | m/44'/145' |\n| 6 | **Bitcoin SV** | BSV | P2PKH | m/44'/236' |\n| 7 | **Zcash** | ZEC | P2PKH (transparent) | m/44'/133' |\n| 8 | **Dash** | DASH | P2PKH | m/44'/5' |\n| 9 | **DigiByte** | DGB | P2PKH, P2WPKH | m/44'/20' |\n| 10 | **Namecoin** | NMC | P2PKH | m/44'/7' |\n| 11 | **Peercoin** | PPC | P2PKH | m/44'/6' |\n| 12 | **Vertcoin** | VTC | P2PKH, P2WPKH | m/44'/28' |\n| 13 | **Viacoin** | VIA | P2PKH | m/44'/14' |\n| 14 | **Groestlcoin** | GRS | P2PKH, P2WPKH | m/44'/17' |\n| 15 | **Syscoin** | SYS | P2PKH | m/44'/57' |\n| 16 | **BNB Smart Chain** | BNB | EIP-55 | m/44'/60' |\n| 17 | **Polygon** | MATIC | EIP-55 | m/44'/60' |\n| 18 | **Avalanche** | AVAX | EIP-55 (C-Chain) | m/44'/60' |\n| 19 | **Fantom** | FTM | EIP-55 | m/44'/60' |\n| 20 | **Arbitrum** 
| ARB | EIP-55 | m/44'/60' |\n| 21 | **Optimism** | OP | EIP-55 | m/44'/60' |\n| 22 | **Ravencoin** | RVN | P2PKH | m/44'/175' |\n| 23 | **Flux** | FLUX | P2PKH | m/44'/19167' |\n| 24 | **Qtum** | QTUM | P2PKH | m/44'/2301' |\n| 25 | **Horizen** | ZEN | P2PKH | m/44'/121' |\n| 26 | **Bitcoin Gold** | BTG | P2PKH | m/44'/156' |\n| 27 | **Komodo** | KMD | P2PKH | m/44'/141' |\n\nAll EVM chains (ETH, BNB, MATIC, AVAX, FTM, ARB, OP) share the same address format (EIP-55 checksummed hex).\n\n\u003c/details\u003e\n\n---\n\n## secp256k1 Architecture\n\n### Library Stack\n\n```\n+----------------------------------------------------------+\n|           Language Bindings (FFI / C ABI)                 |\n|  Python | Node.js | Rust | Go | C# | Java | Swift | PHP |\n+----------------------------------------------------------+\n                          |\n                   Bindings Layer\n                  (ctypes / koffi / cgo\n                   JNA / P/Invoke / FFI)\n                          |\n+----------------------------------------------------------+\n|            UltrafastSecp256k1 Core (C++20)                |\n|                                                          |\n|  Field Arithmetic | Scalar Ops | Point Ops | GLV/Endomo  |\n|  ECDSA | Schnorr BIP-340 | ECDH | MuSig2 | FROST       |\n|  Pedersen | Taproot | BIP-32 HD | Adaptor Sigs | ZK      |\n|                                                          |\n|  [FAST layer]              [CT layer]                    |\n|  Variable-time             Constant-time                 |\n|  Max throughput            Side-channel safe              |\n+----------------------------------------------------------+\n                          |\n+----------+----------+----------+----------+--------------+\n|   CPU    |   CUDA   |  OpenCL  |  Metal   |  Embedded    |\n|          |          |          |          |              |\n| x86_64   | NVIDIA   | AMD/NVIDIA| Apple   | ESP32-S3     |\n| ARM64    | sm_50+   | any GPU  | 
Silicon | ESP32-C6     |\n| RISC-V   |          |          |          | STM32        |\n| WASM     |          |          |          | Cortex-M     |\n+----------+----------+----------+----------+--------------+\n```\n\n### Hardware Compatibility\n\n| Platform | Architecture | Backend | Status |\n|----------|-------------|---------|--------|\n| **Desktop CPU** | x86_64 (Intel / AMD) | CPU | [OK] Stable |\n| **Desktop CPU** | ARM64 (Apple Silicon, Ampere) | CPU | [OK] Stable |\n| **Desktop CPU** | RISC-V RV64GC | CPU | [OK] Stable |\n| **Raspberry Pi** | ARM64 (BCM2710, Zero 2 W) | CPU | [..] Testing |\n| **NVIDIA GPU** | RTX / GTX / Tesla (sm_50+) | CUDA 12+ | [OK] Stable (8/8 GPU C ABI ops) |\n| **AMD GPU** | RDNA / CDNA | OpenCL | [OK] Broad (7/8 GPU C ABI ops; `ecrecover_batch` pending) |\n| **AMD GPU** | RDNA / CDNA | ROCm/HIP | [!] Beta |\n| **Apple GPU** | Apple Silicon (M1/M2/M3/M4) | Metal | [..] Experimental (7/8 GPU C ABI ops; `ecrecover_batch` pending) |\n| **Any GPU** | OpenCL 1.2+ compatible | OpenCL | [OK] Broad (7/8 GPU C ABI ops; `ecrecover_batch` pending) |\n| **ESP32-S3** | Xtensa LX7 @ 240 MHz | CPU | [OK] Tested |\n| **ESP32-P4** | RISC-V @ 400 MHz | CPU | [OK] Supported |\n| **ESP32-C6** | RISC-V (single-core) | CPU | [OK] Supported |\n| **STM32** | ARM Cortex-M3/M4 | CPU | [..] Experimental |\n| **WebAssembly** | WASM (Emscripten) | CPU | [OK] Stable |\n| **Android** | ARM64 (NDK r27c) | CPU | [OK] Stable |\n| **iOS** | ARM64 (Xcode) | CPU | [OK] Stable |\n\n\u003e **GPU C ABI ops**: generator_mul_batch, ecdsa_verify_batch, schnorr_verify_batch, ecdh_batch, hash160_pubkey_batch, msm, frost_verify_partial_batch, ecrecover_batch. 
See [GPU Validation Matrix](docs/GPU_VALIDATION_MATRIX.md) for per-backend details.\n\n### Embedded Targets\n\n| Target | MCU | Clock | Scalar x G | Flash | RAM |\n|--------|-----|-------|-----------|-------|-----|\n| ESP32-S3 | Xtensa LX7 (dual) | 240 MHz | 5.2 ms | ~120 KB | ~8 KB |\n| ESP32-PICO-D4 | Xtensa LX6 (dual) | 240 MHz | 6.2 ms | ~120 KB | ~8 KB |\n| ESP32-P4 | RISC-V | 400 MHz | ~3 ms | ~120 KB | ~8 KB |\n| ESP32-C6 | RISC-V (single) | 160 MHz | ~12 ms | ~120 KB | ~8 KB |\n| STM32F103 | Cortex-M3 | 72 MHz | 38 ms | ~100 KB | ~6 KB |\n\n### Source Directory\n\n```\nUltrafastSecp256k1/\n+-- cpu/                 # CPU-optimized implementation\n|   +-- include/         # Public headers (field.hpp, scalar.hpp, point.hpp, ecdsa.hpp, schnorr.hpp)\n|   +-- src/             # Implementation (field_asm_x64.asm, field_asm_riscv64.S, ...)\n|   +-- fuzz/            # libFuzzer harnesses\n|   +-- tests/           # Unit tests\n+-- cuda/                # CUDA GPU acceleration\n+-- opencl/              # OpenCL GPU acceleration\n+-- metal/               # Apple Metal GPU acceleration\n+-- wasm/                # WebAssembly (Emscripten)\n+-- android/             # Android NDK (ARM64)\n+-- include/ufsecp/      # Stable C ABI\n+-- bindings/            # Language bindings (Rust, Python, Node.js, Go, C#, Java, ...)\n+-- examples/\n|   +-- c_example/       # C API usage\n|   +-- rust_example/    # Rust FFI example\n|   +-- python_example/  # Python ctypes example\n|   +-- nodejs_example/  # Node.js koffi example\n|   +-- go_example/      # Go cgo example\n|   +-- java_example/    # Java JNA example\n|   +-- esp32_test/      # ESP32-S3 Xtensa LX7 port\n|   +-- stm32_test/      # STM32F103 ARM Cortex-M3 port\n+-- docs/                # Documentation\n```\n\n---\n\n## secp256k1 Testing \u0026 Verification\n\n### Built-in Selftest\n\nEvery executable runs a deterministic **Known Answer Test (KAT)** on startup, covering all arithmetic operations:\n\n| Mode | Time | When | What 
|\n|------|------|------|------|\n| **smoke** | ~1-2s | App startup, embedded | Core KAT (10 scalar mul, field/scalar identities, boundary vectors) |\n| **ci** | ~30-90s | Every push (CI) | Smoke + cross-checks, bilinearity, NAF/wNAF, batch sweeps, algebraic stress |\n| **stress** | ~10-60min | Manual / release | CI + 1000 random scalar muls, 500 field triples, batch inverse up to 8192 |\n\n```cpp\n#include \"secp256k1/selftest.hpp\"\nusing namespace secp256k1::fast;\n\nSelftest(true, SelftestMode::smoke);              // Fast startup check\nSelftest(true, SelftestMode::ci);                  // Full CI suite\nSelftest(true, SelftestMode::stress, 0xDEADBEEF); // Deep-assurance / release with custom seed\n```\n\n### Sanitizer Builds\n\n```bash\ncmake --preset cpu-asan \u0026\u0026 cmake --build out/release/cpu-asan -j    # ASan + UBSan\ncmake --preset cpu-tsan \u0026\u0026 cmake --build out/release/cpu-tsan -j    # TSan (data races)\nctest --test-dir out/release/cpu-asan --output-on-failure\n```\n\n### Fuzz Testing\n\nlibFuzzer harnesses cover core arithmetic (`cpu/fuzz/`):\n\n| Target | What it tests |\n|--------|---------------|\n| `fuzz_field` | add/sub round-trip, mul identity, square, inverse |\n| `fuzz_scalar` | add/sub, mul identity, distributive law |\n| `fuzz_point` | on-curve check, negate, compress round-trip, dbl vs add |\n\n### Platform CI Coverage\n\n| Platform | Backend | Compiler | Status |\n|----------|---------|----------|--------|\n| Linux x64 | CPU | GCC 13 / Clang 17 | [OK] CI |\n| Linux x64 | CPU | Clang 17 (ASan+UBSan) | [OK] CI |\n| Linux x64 | CPU | Clang 17 (TSan) | [OK] CI |\n| Windows x64 | CPU | MSVC 2022 | [OK] CI |\n| macOS ARM64 | CPU + Metal | AppleClang | [OK] CI |\n| iOS ARM64 | CPU | Xcode | [OK] CI |\n| Android ARM64 | CPU | NDK r27c | [OK] CI |\n| WebAssembly | CPU | Emscripten | [OK] CI |\n| ROCm/HIP | CPU + GPU | ROCm 6.3 | [OK] CI |\n\n### Cross-Platform Audit Results\n\nThe `unified_audit_runner` executes exploit PoCs, 
constant-time analysis, differential\ntesting, standard vectors, fuzzing, protocol security, ABI safety, and performance validation.\n\n\u003e Current module counts and per-platform run results are generated automatically by\n\u003e `ci/sync_module_count.py` and are authoritative in\n\u003e [`audit/platform-reports/PLATFORM_AUDIT.md`](audit/platform-reports/PLATFORM_AUDIT.md).\n\u003e The table previously shown here was a snapshot from an earlier audit cycle and has been\n\u003e removed to avoid stale module-count contradictions -- always refer to the live report.\n\n---\n\n## secp256k1 Benchmark Targets\n\n| Target | Description |\n|--------|-------------|\n| `bench_unified` | The standard benchmark: full apples-to-apples comparison against libsecp256k1 and OpenSSL |\n| `bench_ct` | Fast-vs-CT overhead comparison |\n| `bench_field_52` | 5x52 field arithmetic micro-benchmarks |\n| `bench_field_26` | 10x26 field arithmetic micro-benchmarks |\n| `bench_kP` | Scalar multiplication (k*P) benchmarks |\n\n---\n\n## Research Statement\n\nThis library explores the **performance ceiling of secp256k1** across CPU architectures (x64, ARM64, RISC-V, Cortex-M, Xtensa) and GPUs (CUDA, OpenCL, Metal, ROCm). Zero external dependencies. Pure C++20.\n\n---\n\n## API Stability\n\n**C++ API**: Not yet stable. Breaking changes may occur before **v4.0**. Core layers (field, scalar, point, ECDSA, Schnorr) are production-ready with full audit coverage. Extended layers (MuSig2, FROST, Adaptor, Pedersen, ZK, Taproot, HD, Coins) are **Experimental**: they are implemented and covered by PoC exploit tests and CT verification, but their APIs may change.\n\n**C ABI (`ufsecp`)**: Stable from v3.4.0. The ABI version is tracked separately. 
See [SUPPORTED_GUARANTEES.md](include/ufsecp/SUPPORTED_GUARANTEES.md).\n\n---\n\n## Release Signing \u0026 Verification\n\nAll releases starting from **v3.15.0** are cryptographically signed using\n[Sigstore cosign](https://docs.sigstore.dev/) (keyless, GitHub OIDC identity).\nOlder historical releases remain unsigned but are preserved unchanged.\n\nEvery release includes:\n\n| Artifact | Purpose |\n|----------|---------|\n| `SHA256SUMS` | Checksums for all release archives |\n| `SHA256SUMS.sig` | Cosign signature of the manifest |\n| `SHA256SUMS.pem` | Signing certificate (Sigstore OIDC) |\n| `sbom.cdx.json` | CycloneDX Software Bill of Materials |\n| Per-archive `.sig` + `.pem` | Individual artifact signatures |\n\n### Verify checksums\n\n**Linux:**\n\n```bash\ncurl -LO https://github.com/shrec/UltrafastSecp256k1/releases/latest/download/SHA256SUMS\nsha256sum -c SHA256SUMS\n```\n\n**macOS:**\n\n```bash\nshasum -a 256 -c SHA256SUMS\n```\n\n**Windows (PowerShell):**\n\n```powershell\nGet-Content SHA256SUMS | ForEach-Object {\n  $parts = $_ -split '  '\n  $expected = $parts[0]; $file = $parts[1]\n  $actual = (Get-FileHash $file -Algorithm SHA256).Hash.ToLower()\n  if ($actual -eq $expected) { \"[OK] $file\" } else { \"[FAIL] $file\" }\n}\n```\n\n### Verify signature (cosign)\n\n```bash\ncosign verify-blob SHA256SUMS \\\n  --signature SHA256SUMS.sig \\\n  --certificate SHA256SUMS.pem \\\n  --certificate-identity-regexp \"github.com/shrec/UltrafastSecp256k1\" \\\n  --certificate-oidc-issuer https://token.actions.githubusercontent.com\n```\n\n| Supply Chain | Status |\n|-------------|--------|\n| SHA256SUMS for all artifacts | [OK] Every release |\n| Cosign / Sigstore manifest signing | [OK] v3.15.0+ |\n| Per-artifact Cosign signatures | [OK] v3.15.0+ |\n| SLSA Build Provenance (GitHub Attestation) | [OK] Every release |\n| CycloneDX SBOM | [OK] Every release |\n| Reproducible builds documentation | [OK] Dockerfile.reproducible |\n\n---\n\n## FAQ\n\n**Is 
UltrafastSecp256k1 a drop-in replacement for libsecp256k1?**\n\u003e No. It is an independent implementation with a different API. The C ABI (`ufsecp`) provides a stable FFI surface, but function signatures differ from libsecp256k1. Migration requires code changes.\n\n**Is the API stable?**\n\u003e As of v4.0, the C ABI (`ufsecp_*`) and `ct::` signing namespace are stable with SemVer guarantees. The broader C++ API (namespaces `fast::`, experimental modules) is mature for Tier 1 features; breaking changes follow a deprecation cycle. See [docs/ABI_VERSIONING.md](docs/ABI_VERSIONING.md).\n\n**What is the constant-time scope?**\n\u003e All functions in `ct::` namespace are constant-time: field arithmetic, scalar arithmetic, point multiplication, complete addition, signing, and ECDH. The C ABI uses CT internally for all secret-key operations. See [CT Evidence](#ct-evidence--methodology) above.\n\n**Which parts are production-safe today?**\n\u003e Tier 1 features (core ECC, ECDSA, Schnorr, ECDH, stable C ABI) are extensively tested, fuzzed, regression-gated, and run through sanitizer-backed CI with a strong self-audit trail and reproducible evidence.\n\n**How do I reproduce the benchmarks?**\n\u003e See [`docs/BENCHMARKS.md`](docs/BENCHMARKS.md) for exact commands, pinned compiler/driver versions, and raw logs. 
The [live dashboard](https://shrec.github.io/UltrafastSecp256k1/dev/bench/) tracks performance across commits.\n\n---\n\n## Documentation\n\n| Document | Description |\n|----------|-------------|\n| [API Reference](docs/API_REFERENCE.md) | Full C++ and C ABI reference |\n| [Build Guide](docs/BUILDING.md) | Detailed build instructions for all platforms |\n| [Benchmarks](docs/BENCHMARKS.md) | Complete benchmark results and methodology |\n| [GPU API](include/ufsecp/ufsecp_gpu.h) | GPU C ABI header (28 functions, 8 ops, 3 backends) |\n| [GPU Validation Matrix](docs/GPU_VALIDATION_MATRIX.md) | Per-backend op coverage and validation status |\n| [Feature Maturity](docs/FEATURE_MATURITY.md) | Per-feature GPU/CT/fuzz/tier status table |\n| [Supported Guarantees](include/ufsecp/SUPPORTED_GUARANTEES.md) | ABI stability tiers and commitment levels |\n| [Audit Coverage](docs/AUDIT_COVERAGE.md) | Full audit report with 149 non-exploit modules + 270 exploit PoCs and platform verdicts |\n| [Audit Guide](docs/AUDIT_GUIDE.md) | How to run and interpret audit suite |\n| [Test Matrix](docs/TEST_MATRIX.md) | Comprehensive test coverage map for auditors |\n| [ARM64 Audit \u0026 Benchmark](docs/ARM64_AUDIT_BENCHMARK.md) | ARM64 platform certification and performance analysis |\n| [Threat Model](docs/THREAT_MODEL.md) | Layer-by-layer security risk assessment |\n| [Security Policy](SECURITY.md) | Vulnerability reporting and audit status |\n| [Porting Guide](docs/PORTING.md) | Add new platforms, architectures, GPU backends |\n| [RISC-V Optimizations](docs/RISCV64_BITCOIN_CORE_BENCHMARK.md) | RISC-V assembly details |\n| [ESP32 Setup](docs/ESP32_SETUP.md) | ESP32 embedded development guide |\n| [Examples](examples/README.md) | Multi-language binding examples (C, Python, Rust, Node.js, Go, Java) |\n| [Contributing](CONTRIBUTING.md) | Development guidelines |\n| [Changelog](CHANGELOG.md) | Version history |\n\n---\n\n## Contributing\n\nContributions are welcome! 
Please read [CONTRIBUTING.md](CONTRIBUTING.md).\n\n```bash\ngit clone https://github.com/shrec/UltrafastSecp256k1.git\ncd UltrafastSecp256k1\n# Configure a Debug build into build/dev, then build and test from that same directory\ncmake -S . -B build/dev -G Ninja -DCMAKE_BUILD_TYPE=Debug\ncmake --build build/dev -j\nctest --test-dir build/dev --output-on-failure\n```\n\n---\n\n## License\n\n**MIT License**\n\nThis project is licensed under the MIT License.\nPreviously released versions (up to v3.14.x) were under AGPL-3.0.\nAs of v3.15.0 the license is MIT -- to align with the broader Bitcoin ecosystem\nand remove adoption friction.\n\nSee [LICENSE](LICENSE) for full details.\n\n---\n\n## Contact \u0026 Community\n\n| Channel | Link |\n|---------|------|\n| Issues | [GitHub Issues](https://github.com/shrec/UltrafastSecp256k1/issues) |\n| Discussions | [GitHub Discussions](https://github.com/shrec/UltrafastSecp256k1/discussions) |\n| Wiki | [Documentation Wiki](https://github.com/shrec/UltrafastSecp256k1/wiki) |\n| Benchmarks | [Live Dashboard](https://shrec.github.io/UltrafastSecp256k1/dev/bench/) |\n| Security | [Report Vulnerability](https://github.com/shrec/UltrafastSecp256k1/security/advisories/new) |\n| Commercial | [payysoon@gmail.com](mailto:payysoon@gmail.com) |\n\n---\n\n## Acknowledgements\n\nUltrafastSecp256k1 is an independent implementation -- written from scratch with our own architecture, hybrid GPU execution model, embedded ports, and optimization techniques. The library's core structure and most performance gains came from direct experimentation, profiling, and iteration. At the same time, no project exists in a vacuum. Studying public research and implementation notes from the wider cryptographic community later helped us validate decisions, avoid weaker paths, and uncover additional optimization opportunities.\n\nWe want to acknowledge the teams whose public work informed parts of our journey:\n\n- **[bitcoin-core/secp256k1](https://github.com/bitcoin-core/secp256k1)** -- A major reference point for the ecosystem. 
UltrafastSecp256k1 was built independently from scratch, but studying their published research later helped us benchmark our own implementations, validate design choices, and extract additional optimization ideas for CPU, GPU, and embedded targets.\n- **[Bitcoin Core](https://github.com/bitcoin/bitcoin)** contributors -- For open specifications (BIP-340 Schnorr, BIP-341 Taproot, RFC 6979) and a correctness-first engineering culture that benefits everyone building in this space.\n- **Pieter Wuille, Jonas Nick, Tim Ruffing** and the libsecp256k1 maintainers -- For publicly sharing research and implementation insights on side-channel resistance, exhaustive testing, field representation trade-offs, and practical optimization techniques. Their published work was valuable to study in the later optimization phase and helped us push our independently built engine further.\n- **[@craigraw](https://github.com/craigraw)** ([Sparrow Wallet](https://sparrowwallet.com)) -- For creating the [bench_bip352](https://github.com/craigraw/bench_bip352) standalone BIP-352 Silent Payments scanning benchmark, which provided an independent, reproducible pipeline comparison between secp256k1 implementations.\n- **Community / GigaChad** -- For running the full CUDA test suite on RTX 5070 Ti (Blackwell), confirming 45/45 tests pass, and identifying the `CMAKE_CUDA_SEPARABLE_COMPILATION` flag required for Blackwell devices. 
Results in [docs/COMMUNITY_BENCHMARKS.md](docs/COMMUNITY_BENCHMARKS.md).\n\nWe share our optimizations, GPU kernels, embedded ports, and cross-platform techniques freely -- because open-source cryptography grows stronger when knowledge flows in every direction.\n\nSpecial thanks to the [Stacker News](https://stacker.news) and [Delving Bitcoin](https://delvingbitcoin.org) communities for their early support and technical feedback.\n\nExtra gratitude to [@0xbitcoiner](https://stacker.news/0xbitcoiner) for the initial outreach and for helping bridge the project with the wider Bitcoin developer ecosystem.\n\n---\n\n## Support the Project\n\nIf you find **UltrafastSecp256k1** useful, consider supporting its development!\n\n\u003e **We are actively seeking sponsors for a funded bug bounty program, stronger open audit infrastructure, and ongoing development.**\n\u003e See the [Seeking Sponsors](#seeking-sponsors----bug-bounty--development) section above for details.\n\n[![Sponsor](https://img.shields.io/badge/Sponsor_This_Project-GitHub_Spons","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrec%2Fultrafastsecp256k1","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshrec%2Fultrafastsecp256k1","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshrec%2Fultrafastsecp256k1/lists"}