{"id":46647459,"url":"https://github.com/compprov/compprov-core","last_synced_at":"2026-04-02T11:53:40.106Z","repository":{"id":341314098,"uuid":"1169653936","full_name":"compprov/compprov-core","owner":"compprov","description":"Core module of the compprov framework - wraps operations into a tracked Calculation Provenance Graph (CPG) context","archived":false,"fork":false,"pushed_at":"2026-03-24T07:11:02.000Z","size":109,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-25T07:04:20.195Z","etag":null,"topics":["audit","computational-provenance","dag","framework","graph","java","open-source","parallel-execution","provenance"],"latest_commit_sha":null,"homepage":"https://github.com/compprov","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/compprov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-01T02:06:52.000Z","updated_at":"2026-03-24T07:11:04.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/compprov/compprov-core","commit_stats":null,"previous_names":["compprov/compprov-core"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/compprov/compprov-core","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compprov%2Fcompprov-core","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compprov%2Fcompprov-core/tags","releas
es_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compprov%2Fcompprov-core/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compprov%2Fcompprov-core/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/compprov","download_url":"https://codeload.github.com/compprov/compprov-core/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/compprov%2Fcompprov-core/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31305894,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-02T09:48:21.550Z","status":"ssl_error","status_checked_at":"2026-04-02T09:48:19.196Z","response_time":89,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audit","computational-provenance","dag","framework","graph","java","open-source","parallel-execution","provenance"],"created_at":"2026-03-08T05:04:32.654Z","updated_at":"2026-04-02T11:53:40.097Z","avatar_url":"https://github.com/compprov.png","language":"Java","readme":"# compprov-core\n\n**compprov** (Computational Provenance) is a Java framework that automatically builds a\n**Calculation Provenance Graph (CPG)** — a DAG that records every variable and every operation\nin a computation as it runs. 
The result is a complete, machine-readable audit trail of how each\noutput was derived from its inputs.\n\n---\n\n## Contents\n\n- [Core concepts](#core-concepts)\n- [Getting started](#getting-started)\n- [Usage example](#usage-example)\n- [Snapshot: export, replay, and diff](#snapshot-export-replay-and-diff)\n- [Extending with custom type wrappers](#extending-with-custom-type-wrappers)\n- [Built-in types](#built-in-types)\n- [Thread safety](#thread-safety)\n- [Visualization](#visualization)\n- [Examples](#examples)\n- [License](#license)\n\n---\n\n## Core concepts\n\n| Concept | Description |\n|---|---|\n| **CPG** | Directed acyclic graph whose nodes are variables and operations and whose edges are data-flow dependencies. Produced automatically during execution. |\n| **Snapshot** | Immutable, point-in-time capture of all variables and operations recorded in a context. Can be serialized to JSON, replayed, or compared. |\n| **Descriptor** | Name + optional metadata (`Meta`) attached to a variable or operation. Used in logs, diff reports, and audit trails. |\n| **VariableWrapper** | Factory that converts a plain value into a provenance-tracked `WrappedVariable` and registers it in the active context. |\n| **ComputationEnvironment** | Shared, thread-safe configuration: registered wrappers, clock, Jackson mapper, descriptor enforcement rules. |\n| **ComputationContext** | Per-computation scope that accumulates the CPG. Do not call `snapshot()` while other threads are still recording into it. 
|\n\n---\n\n## Getting started\n\nAdd the dependency to your `pom.xml` (latest version: [![Maven Central](https://img.shields.io/maven-central/v/io.compprov/compprov-core?color=brightgreen)](https://central.sonatype.com/artifact/io.compprov/compprov-core)):\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.compprov\u003c/groupId\u003e\n    \u003cartifactId\u003ecompprov-core\u003c/artifactId\u003e\n    \u003cversion\u003eVERSION\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nRequires Java 17+.\n\n---\n\n## Usage example\n\nThe entry point is `DefaultComputationEnvironment` (preconfigured with all built-in wrappers\nand Jackson serializers) and `DefaultComputationContext` (typed convenience wrappers on top\nof the base context).\n\n```java\nimport io.compprov.core.*;\nimport io.compprov.core.meta.Descriptor;\nimport java.math.BigDecimal;\nimport java.math.MathContext;\nimport java.math.RoundingMode;\n\n// --- 1. Create the environment (thread-safe; reuse across computations) ---\nvar env = new DefaultComputationEnvironment();\n\n// --- 2. Create a context for this computation run ---\nvar ctx = new DefaultComputationContext(\n        env,\n        new DataContext(Descriptor.descriptor(\"invoice-calculation\")));\n\n// --- 3. Wrap all inputs ---\n// Every wrapped value gets a unique ID and is recorded in the CPG as an INPUT node.\nvar mc      = ctx.wrapMathContext(new MathContext(10, RoundingMode.HALF_UP),\n                                  Descriptor.descriptor(\"mc\"));\nvar price   = ctx.wrapBigDecimal(new BigDecimal(\"100.00\"),\n                                 Descriptor.descriptor(\"price\"));\nvar taxRate = ctx.wrapBigDecimal(new BigDecimal(\"0.08\"),\n                                 Descriptor.descriptor(\"tax-rate\"));\n\n// --- 4. 
Perform operations ---\n// Each call records an operation node in the CPG and returns a wrapped result.\n// Pass null as the last argument to let the framework auto-name the result.\nvar tax   = price.multiply(taxRate, mc, Descriptor.descriptor(\"tax\"));\nvar total = price.add(tax, mc, Descriptor.descriptor(\"total\"));\n\n// --- 5. Read the result like any other value ---\nSystem.out.println(total.getValue()); // 108.0000000\n\n// --- 6. Export the full Calculation Provenance Graph ---\nSnapshot snapshot = ctx.snapshot();\nSystem.out.println(env.toJson(snapshot));\nSystem.out.println(env.toHumanReadableLog(snapshot));\n```\n\n### JSON output (abbreviated)\n\n```json\n{\n  \"descriptor\" : { \"name\" : \"invoice-calculation\", \"meta\" : { } },\n  \"variables\" : [\n    { \"track\" : { \"id\" : \"i_1\", \"descriptor\" : { \"name\" : \"mc\" }, ... }, \"value\" : ... },\n    { \"track\" : { \"id\" : \"i_2\", \"descriptor\" : { \"name\" : \"price\" }, ... }, \"value\" : \"100.00\" },\n    { \"track\" : { \"id\" : \"i_3\", \"descriptor\" : { \"name\" : \"tax-rate\" }, ... }, \"value\" : \"0.08\" },\n    { \"track\" : { \"id\" : \"o_4\", \"descriptor\" : { \"name\" : \"tax\" }, ... }, \"value\" : \"8.000000000\" },\n    { \"track\" : { \"id\" : \"o_5\", \"descriptor\" : { \"name\" : \"total\" }, ... }, \"value\" : \"108.0000000\" }\n  ],\n  \"operations\" : [\n    { \"track\" : { \"id\" : \"o_1\", \"descriptor\" : { \"name\" : \"multiply\" }, ... },\n      \"arguments\" : { \"a\" : \"i_2\", \"b\" : \"i_3\", \"mc\" : \"i_1\" },\n      \"resultId\" : \"o_4\" },\n    { \"track\" : { \"id\" : \"o_2\", \"descriptor\" : { \"name\" : \"add\" }, ... 
},\n      \"arguments\" : { \"a\" : \"i_2\", \"b\" : \"o_4\", \"mc\" : \"i_1\" },\n      \"resultId\" : \"o_5\" }\n  ]\n}\n```\n\nVariable IDs use the prefix `i_` for inputs and `o_` for outputs, followed by a sequential\nnumeric counter that is stable within a single context run.\n\n---\n\n## Snapshot: export, replay, and diff\n\n### Serialize and deserialize\n\n```java\nString json = env.toJson(ctx.snapshot());\n\n// Deserialize back to a Snapshot\nSnapshot restored = env.fromJson(json);\n```\n\n### Replay a computation\n\n`env.compute()` replays all recorded operations against the given snapshot,\nproducing a new context with freshly computed outputs:\n\n```java\nvar replayed = env.compute(restored);\nBigDecimal replayedTotal = (BigDecimal) replayed.getVariable(\"o_5\").getValue();\n```\n\n### Change an input and propagate\n\nUse `copyWith` to substitute one or more input values, then replay:\n\n```java\nSnapshot modified = env.copyWith(\n        restored,\n        Descriptor.descriptor(\"invoice-calculation-v2\"),\n        Map.of(\"i_3\", new ValueWithDescriptor(\n                Descriptor.descriptor(\"tax-rate\"),\n                new BigDecimal(\"0.10\"))));  // 10% tax instead of 8%\n\nvar updated = env.compute(modified);\n// updated.getVariable(\"o_5\") now reflects the new total\n```\n\n---\n\n## Extending with custom type wrappers\n\nAdding support for a type not built into the framework requires three things:\n\n1. A `Wrapped\u003cType\u003e` class that defines the tracked operations for your type.\n2. A `VariableWrapper\u003cType\u003e` factory that instantiates it.\n3. 
Registering the factory with the environment.\n\n### Step 1 — Create the wrapped class\n\n```java\nimport io.compprov.core.ComputationContext;\nimport io.compprov.core.meta.Descriptor;\nimport io.compprov.core.variable.AbstractWrappedVariable;\nimport io.compprov.core.variable.VariableTrack;\n\nimport java.util.*;\nimport java.util.function.Function;\n\npublic final class WrappedLong extends AbstractWrappedVariable\u003cLong\u003e {\n\n    // Define one Descriptor constant per operation.\n    private static final Descriptor OP_ADD      = Descriptor.descriptor(\"add\");\n    private static final Descriptor OP_MULTIPLY = Descriptor.descriptor(\"multiply\");\n\n    // Map each Descriptor to a lambda that performs the actual computation.\n    private static final Map\u003cDescriptor, Function\u003cList\u003cObject\u003e, Object\u003e\u003e FUNCTIONS;\n\n    static {\n        Map\u003cDescriptor, Function\u003cList\u003cObject\u003e, Object\u003e\u003e m = new HashMap\u003c\u003e();\n        m.put(OP_ADD,      args -\u003e (Long) args.get(0) + (Long) args.get(1));\n        m.put(OP_MULTIPLY, args -\u003e (Long) args.get(0) * (Long) args.get(1));\n        FUNCTIONS = Collections.unmodifiableMap(m);\n    }\n\n    public WrappedLong(ComputationContext context, VariableTrack track, Long value) {\n        super(context, track, value);\n    }\n\n    @Override\n    public Function\u003cList\u003cObject\u003e, Object\u003e getFunction(Descriptor operationDescriptor) {\n        return FUNCTIONS.get(operationDescriptor);\n    }\n\n    // --- Public API ---\n    // Each operation comes in two overloads: with and without a result Descriptor.\n\n    public WrappedLong add(WrappedLong augend, Descriptor resultDescriptor) {\n        return (WrappedLong) execute(OP_ADD, \"a\", this, \"b\", augend, resultDescriptor);\n    }\n\n    public WrappedLong add(WrappedLong augend) {\n        return add(augend, null);\n    }\n\n    public WrappedLong multiply(WrappedLong multiplicand, Descriptor 
resultDescriptor) {\n        return (WrappedLong) execute(OP_MULTIPLY, \"a\", this, \"b\", multiplicand, resultDescriptor);\n    }\n\n    public WrappedLong multiply(WrappedLong multiplicand) {\n        return multiply(multiplicand, null);\n    }\n}\n```\n\n### Step 2 — Create the factory\n\n```java\nimport io.compprov.core.ComputationContext;\nimport io.compprov.core.variable.VariableTrack;\nimport io.compprov.core.variable.VariableWrapper;\nimport io.compprov.core.variable.WrappedVariable;\n\npublic final class LongWrapperFactory implements VariableWrapper\u003cLong\u003e {\n    @Override\n    public WrappedVariable wrap(ComputationContext context, VariableTrack track, Long value) {\n        return new WrappedLong(context, track, value);\n    }\n}\n```\n\n### Step 3 — Register and use\n\n```java\nvar env = new DefaultComputationEnvironment();\nenv.registerWrapper(Long.class, new LongWrapperFactory());\n\nvar ctx = new DefaultComputationContext(env,\n        new DataContext(Descriptor.descriptor(\"my-computation\")));\n\n// Use the base wrap() method — DefaultComputationContext does not have a wrapLong() helper.\n// Cast to your concrete type after wrapping.\nvar a = (WrappedLong) ctx.wrap(100L, Descriptor.descriptor(\"a\"));\nvar b = (WrappedLong) ctx.wrap(42L,  Descriptor.descriptor(\"b\"));\nvar sum = a.add(b, Descriptor.descriptor(\"sum\"));\n```\n\nIf you use a custom type frequently, extend `DefaultComputationContext` to add a typed\n`wrapLong()` convenience method, the same way `DefaultComputationContext` does for\n`wrapBigDecimal`, `wrapBigInteger`, etc.\n\n### Type deserialization\n\nIf you need to serialize and deserialize snapshots containing your custom type, register a\nJackson deserializer with the `ObjectMapper` inside your custom `ComputationEnvironment`.\nSee `DefaultComputationEnvironment` for examples using `ZonedDateTimeSerializer`,\n`MathContextDeserializer`, and `VariableDeserializer`.\n\n---\n\n## Built-in 
types\n\n`DefaultComputationEnvironment` registers the following wrappers out of the box:\n\n| Java type | Wrapped class | Notes |\n|---|---|---|\n| `BigDecimal` | `WrappedBigDecimal` | Full arithmetic: add, subtract, multiply, divide, pow, sqrt, abs, negate, remainder, max, min, and more |\n| `BigInteger` | `WrappedBigInteger` | Full arithmetic including modPow (ternary) |\n| `Integer` | `WrappedInteger` | Parameter-only type; used as an argument to `pow`, `scaleByPowerOfTen`, etc. |\n| `Long` | `WrappedLong` | Parameter-only type |\n| `MathContext` | `WrappedMathContext` | Carries precision and rounding mode; passed to most `BigDecimal` / `BigInteger` ops |\n\n---\n\n## Thread safety\n\n`ComputationEnvironment` and its wrappers map are fully thread-safe — a single instance can\nbe shared across threads and computations.\n\n`ComputationContext` is thread-safe for all wrap and executeOperation calls. The `snapshot()`\nmethod is **not** safe to call while other threads are still recording operations into the same\ncontext.\n\n---\n\n## Visualization\n\nTo visualize your CPG data use [compprov-render](https://github.com/compprov/compprov-render) —\na set of HTML pages that run locally in your web browser, no server required.\n\nSimply export a snapshot to JSON and open the page:\n\n```java\nString json = env.toJson(ctx.snapshot());\n// save to a file, then open graph.html or plot.html in your browser\n```\n\n### Graph view — provenance graph\n\nRenders the full CPG as an interactive node-edge graph.\nVariables are shown as typed nodes (input / output), operations as diamond nodes with labeled argument edges.\n\n![Provenance Graph](https://raw.githubusercontent.com/compprov/compprov-render/master/screenshots/graph.png)\n\n### Plot view — multi-dataset comparison\n\nPlots numeric variable values across one or more datasets side-by-side.\nSupports points, line, and table views with configurable X-axis labels.\n\n![Plot 
View](https://raw.githubusercontent.com/compprov/compprov-render/master/screenshots/plot.png)\n\n---\n\n## Examples\n\nThe `io.compprov.examples` package contains three self-contained examples that each demonstrate\na different aspect of the framework.\n\n| Example | Package | Domain | Key technique |\n|---|---|---|---|\n| [Net Asset Value (NAV)](#net-asset-value-nav) | `io.compprov.examples.nav` | Crypto-portfolio accounting | Custom domain type wrappers |\n| [Gauge Block Calibration](#gauge-block-calibration) | `io.compprov.examples.gaugeblock` | Precision length metrology | Pure `BigDecimal` scalar formula chain |\n| [Hydrological Model Evaluation](#hydrological-model-evaluation) | `io.compprov.examples.hydrology` | River discharge modelling | List-based tracked operations |\n\n---\n\n### Net Asset Value (NAV)\n\n**`io.compprov.examples.nav`** · `NetAssetValueCalculator.calculate()`\n\nComputes the total USD value of a multi-asset crypto portfolio (BTC, ETH, USDC positions held\nacross Binance, staking, and Morpho DeFi) by converting each position to USD at a spot rate\nand summing the results.\n\nThe primary focus is showing **how to wrap custom domain types**. The domain model uses `Amount`\nand `Rate` objects rather than raw `BigDecimal`, and the example integrates them with the\nframework without modifying them — using the three-step pattern:\n\n1. **`WrappedAmount` / `WrappedRate`** extend `AbstractWrappedVariable\u003cT\u003e` and declare their\n   operations (`add`, `convert`, `addBulk`) as `Descriptor` constants mapped to computation lambdas.\n2. **`AmountWrapper` / `RateWrapper`** implement `VariableWrapper\u003cT\u003e` — the one-method factory\n   the framework calls to instantiate tracked variables.\n3. 
**`NavComputationContext`** extends `DefaultComputationContext`, registers both wrappers with\n   the shared `ComputationEnvironment`, and exposes typed `wrap(Amount, ...)` / `wrap(Rate, ...)`\n   convenience overloads.\n\nAfter the calculation the snapshot is serialized to JSON, then deserialized and replayed via\n`NavComputationContext.environment.compute()` — verifying that the CPG is round-trip stable and the\nreplayed output matches the original result.\n\n---\n\n### Gauge Block Calibration\n\n**`io.compprov.examples.gaugeblock`** · `GaugeBlockCalibration.calibrate()`\n\nReproduces the interferometric calibration of a 7 mm tungsten carbide gauge block (NRC 91A)\nfrom the following paper, which uses this measurement as a demonstration of metrological\nprovenance management:\n\n\u003e Ryan M. White, *Provenance in the Context of Metrological Traceability*, Metrology 2025, 5(3), 52.\n\u003e DOI: [10.3390/metrology5030052](https://doi.org/10.3390/metrology5030052)\n\nThe computation chain has three stages, all tracked in the CPG:\n\n1. **Refractive index** — the Birch–Downs modified Ciddor equation (8 tracked steps) converts\n   air temperature, pressure, relative humidity, CO₂ concentration, and saturation vapor\n   pressure into the refractive index *n* of the measurement medium.\n2. **Interferometric length** — the HeNe laser vacuum wavelength (632.99 nm) divided by *n*\n   gives the air wavelength; the observed fringe order `m + f` gives the raw length\n   `L_raw = (m + f) × λ_air / 2`.\n3. 
**Thermal correction** — the raw length is corrected to the ISO 1 reference temperature\n   (20 °C) using the tungsten carbide expansion coefficient α = 4.23 × 10⁻⁶ K⁻¹ from\n   the paper: `L_cal = L_raw / (1 + α × ΔT)`.\n\nThe deviation from the 7 mm nominal length is asserted to round to **+2 nm**, matching the\npaper's reported result (expanded uncertainty U = 31 nm, k = 2).\n\nThis example uses only built-in `WrappedBigDecimal` arithmetic — no custom wrappers needed —\nshowing that the framework handles complex pure-scalar formula chains out of the box.\n\n---\n\n### Hydrological Model Evaluation\n\n**`io.compprov.examples.hydrology`** · `MhmDischargeEvaluation.evaluateParameterSetP1()`\n\nEvaluates the mesoscale Hydrologic Model (mHM) output against observed river discharge at the\nMoselle River basin upstream of Perl (~11 500 km², Luxembourg/Germany), as described in:\n\n\u003e Villamar et al., *Archivist: a metadata management tool for facilitating FAIR research*,\n\u003e Scientific Data, 2025.\n\u003e DOI: [10.1038/s41597-025-04521-6](https://doi.org/10.1038/s41597-025-04521-6)\n\nThe metric is the **Kling-Gupta Efficiency (KGE)** (Gupta et al. 2009):\n\n```\nKGE = 1 − √[ (r−1)² + (α−1)² + (β−1)² ]\n\n  r = Pearson correlation = Σ(devObs · devSim) / √(Σ devObs² · Σ devSim²)\n  α = variability ratio  = σ_sim / σ_obs\n  β = bias ratio         = μ_sim / μ_obs\n```\n\nKGE = 1 is perfect. A constant prediction equal to the observed mean has α = 0 and β = 1 and,\ntaking r = 0, scores KGE = 1 − √2 ≈ −0.41, so values below roughly −0.41 indicate a model that\nis worse than the observed mean as a predictor. 
The paper reports that parameter set P₁ outperforms P₂ with scores mostly above 0.5.\n\nThe computation uses `ArrayList\u003cWrappedBigDecimal\u003e` with loops and `addBulk`, demonstrating the\npattern for **list-based tracked operations** where the number of time steps is dynamic.\nThe 8-step chain (means → deviations → squared deviations and cross products → sums → r → α\n→ β → KGE) is fully recorded in the CPG, with every intermediate quantity named and traceable.\nThe synthetic dataset is engineered so that r = 1, β = 1, α = 0.9, giving **KGE = 0.9 exactly**,\nverified by exact `BigDecimal` equality.\n\n---\n\n## License\n\nApache License 2.0 — see [LICENSE](LICENSE).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompprov%2Fcompprov-core","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcompprov%2Fcompprov-core","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcompprov%2Fcompprov-core/lists"}