{"id":50718744,"url":"https://github.com/ianchute/shapley-attribution","last_synced_at":"2026-06-09T21:31:00.506Z","repository":{"id":69598234,"uuid":"225305843","full_name":"ianchute/shapley-attribution","owner":"ianchute","description":"Shapley Attribution: A scikit-learn compatible library that uses Shapley values to measure each marketing channel's true marginal contribution to conversions","archived":false,"fork":false,"pushed_at":"2026-04-20T23:58:37.000Z","size":1694,"stargazers_count":41,"open_issues_count":0,"forks_count":13,"subscribers_count":2,"default_branch":"master","last_synced_at":"2026-05-24T09:06:39.745Z","etag":null,"topics":["attribution","marketing-attribution","shapley-values"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ianchute.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2019-12-02T06:53:28.000Z","updated_at":"2026-04-20T23:58:41.000Z","dependencies_parsed_at":null,"dependency_job_id":"5e5b8d52-edc0-4168-a447-7e1c1ac8ef82","html_url":"https://github.com/ianchute/shapley-attribution","commit_stats":null,"previous_names":["ianchute/shapley-attribution-model-zhao-naive"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/ianchute/shapley-attribution","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ianchute%2Fshapley-attribution","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/ianchute%2Fshapley-attribution/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ianchute%2Fshapley-attribution/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ianchute%2Fshapley-attribution/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ianchute","download_url":"https://codeload.github.com/ianchute/shapley-attribution/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ianchute%2Fshapley-attribution/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34127342,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attribution","marketing-attribution","shapley-values"],"created_at":"2026-06-09T21:31:00.441Z","updated_at":"2026-06-09T21:31:00.497Z","avatar_url":"https://github.com/ianchute.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Shapley Attribution\n\n```bash\npip install shapley-attribution\n```\n\nA scikit-learn compatible Python library for multi-touch attribution modeling using Shapley values from game theory. Computes marginal contribution of each marketing channel to conversion, inspired by Zhao et al. 
(2018) but using an interventional Shapley approach with a learned conversion model.\n\n## Installation\n\n```bash\npip install shapley-attribution\n\n# With ONNX export/import support\npip install \"shapley-attribution[onnx]\"\n\n# With benchmark dependencies\npip install \"shapley-attribution[benchmarks]\"\n\n# Development install from source\npip install -e \".[dev]\"\n```\n\n## Quick Start\n\n```python\nfrom shapley_attribution import (\n    MonteCarloShapleyAttribution,\n    PathShapleyAttribution,\n    LinearAttribution,\n    make_attribution_problem,\n    attribution_summary,\n)\n\n# Standard dataset (set-based ground truth)\njourneys, conversions, ground_truth, channels = make_attribution_problem(\n    n_channels=8, n_journeys=5000, random_state=42\n)\n\n# MC Shapley — order-agnostic, strict Shapley axioms\nmc = MonteCarloShapleyAttribution(n_iter=2000, random_state=42)\nmc.fit(journeys, y=conversions)\n\n# Path Shapley — ordering-aware, uses actual journey sequences\npath = PathShapleyAttribution(random_state=42)\npath.fit(journeys, y=conversions)\n\n# Get aggregate attribution scores\nscores = mc.get_attribution()        # dict: channel -\u003e score\narray  = mc.get_attribution_array()  # numpy array aligned with model.channels_\n\n# Per-journey attribution matrix\nmatrix = mc.transform(journeys)      # shape (n_journeys, n_channels)\n\n# Compare multiple models against ground truth\nlinear = LinearAttribution().fit(journeys, y=conversions)\nresults = attribution_summary(\n    {\n        \"MC Shapley\"  : mc.get_attribution_array(),\n        \"Path Shapley\": path.get_attribution_array(),\n        \"Linear\"      : linear.get_attribution_array(),\n    },\n    ground_truth\n)\n\n# Dataset with directed (ordered) interactions — use Path Shapley here\njourneys_d, conv_d, gt_d, ch_d, ordered_gt = make_attribution_problem(\n    n_channels=8, n_journeys=5000,\n    directed_interaction_strength=0.5,\n    return_ordered_ground_truth=True,\n    
random_state=42,\n)\npath_d = PathShapleyAttribution(random_state=42).fit(journeys_d, y=conv_d)\n\n# Visualize position-level credit (upper-funnel vs lower-funnel)\npath_d.plot_position_attribution()\n```\n\n## Models\n\n### Shapley-based\n\n| Model | Class | Complexity | Best for |\n|---|---|---|---|\n| **Simplified Shapley** | `SimplifiedShapleyAttribution` | O(n_journeys x n_channels) | Exact values, \u003c= 20 channels |\n| **Ordered Shapley** | `OrderedShapleyAttribution` | O(n_journeys x n_positions x 2^n_channels) | Position-aware, \u003c= 15 channels |\n| **Monte Carlo Shapley** | `MonteCarloShapleyAttribution` | O(n_iter x n_channels) | Scalable to 100+ channels, order-agnostic |\n| **Path Shapley** | `PathShapleyAttribution` | O(n_journeys x max_journey_length) | Ordering-aware, typically faster than MC |\n\n**MC Shapley** trains a GradientBoostingClassifier to learn conversion probability, then estimates Shapley values by averaging marginal contributions across random permutations (interventional formulation — no background averaging). It is order-agnostic: `[A, B]` and `[B, A]` receive identical attribution.\n\n**Path Shapley** uses the same GBM value function but replaces random permutation sampling with the *actual journey sequence* as the coalition-formation order. Channel B in journey `[A, B]` is credited for the lift *given A was already seen*; in `[B, A]`, A receives that conditional credit instead. This makes the model sensitive to channel ordering, which is meaningful when upstream channels prime the customer for downstream ones. 
Use `directed_interaction_strength \u003e 0` in `make_attribution_problem` to generate data that rewards this sensitivity.\n\n### PathShapleyAttribution — Deep Dive\n\n**How it works.** For each converting journey `[c₁, c₂, ..., cₙ]`, Path Shapley builds the coalition incrementally along the actual touchpoint sequence:\n\n```\ncontribution(cᵢ) = v({c₁,...,cᵢ}) − v({c₁,...,cᵢ₋₁})\n```\n\nwhere `v(S)` is the GBM's predicted conversion probability for the channel set S. Duplicate channels within a journey are collapsed to first occurrence before scoring, so the ordering is the order of *first* exposure.\n\nThe key insight: because the GBM value function is non-additive (it can learn channel interactions), a channel's marginal contribution depends on which channels precede it. The marginal contribution of B *given A was already seen* (`v({A,B}) − v({A})`) therefore generally differs from its contribution when it comes first, as in `[B, A]` (`v({B}) − v(∅)`). This is where the ordering sensitivity comes from — it does not require any modification to the GBM itself.\n\n**When to use Path Shapley vs MC Shapley.**\n\nPath Shapley is the right choice when channel ordering genuinely matters — for example, when awareness channels (display, social) consistently appear before conversion channels (search, email) and the sequence itself creates synergy. It is also faster than MC Shapley at inference time because it requires only one value-function call per position (O(n_journeys × max_journey_length)) vs. O(n_iter × n_channels) for MC.\n\nMC Shapley is better when you want pure set-based Shapley values that are agnostic to sequencing, or when journeys are so short that ordering effects are negligible. 
MC Shapley also satisfies the strict Shapley axioms (efficiency, symmetry, null player, additivity); Path Shapley does not — it is a path-dependent approximation, not a true Shapley value.\n\n**Limitations.**\n- Not a strict Shapley value: Path Shapley violates the symmetry axiom — two channels with identical marginal contributions can receive different scores if they typically appear in different positions.\n- Rare channels are noisy: channels that appear infrequently have few paths to average over, making their scores higher-variance than MC's permutation average.\n- Journey deduplication: repeated channel exposures within a single journey are collapsed to first occurrence, discarding frequency information.\n- The GBM value function is still set-based: it cannot directly represent interaction effects that depend on order, only on channel co-presence. Ordering effects enter through the path accumulation, not the value function.\n\n**Performance.** Evaluated on synthetic data (8 channels, 5000 journeys). See the [Benchmark](#benchmark) section for full results. Summary:\n\n| Setting | Path NMAE | MC NMAE | Path wins? |\n|---|---|---|---|\n| Undirected (set-based GT) | **0.0195** | 0.0221 | ✓ lower NMAE |\n| Directed (ordered GT, strength=0.6) | **0.0399** | 0.0503 | ✓ lower NMAE |\n\nOn undirected data, Path Shapley's lower NMAE reflects that real journeys carry ordering signal even in the absence of explicit directed coefficients — the path accumulation produces a useful inductive bias. 
On directed data with asymmetric channel interactions, the gap widens: MC Shapley is blind to the ordering and its NMAE actually exceeds the heuristic baselines, while Path Shapley remains the best model.\n\n### Heuristic Baselines\n\n| Model | Class | Rule |\n|---|---|---|\n| **First Touch** | `FirstTouchAttribution` | 100% to first channel |\n| **Last Touch** | `LastTouchAttribution` | 100% to last channel |\n| **Linear** | `LinearAttribution` | Equal credit across all touchpoints |\n| **Time Decay** | `TimeDecayAttribution` | Exponential decay favoring recent touchpoints |\n| **Position Based** | `PositionBasedAttribution` | 40/20/40 split (first/middle/last) |\n\n## scikit-learn Compatibility\n\nAll models inherit from `sklearn.base.BaseEstimator` and `TransformerMixin`:\n\n```python\nfrom sklearn.base import clone\n\nmodel = MonteCarloShapleyAttribution(n_iter=2000)\nmodel.get_params()          # {'n_iter': 2000, 'random_state': None, ...}\nmodel.set_params(n_iter=500)\ncloned = clone(model)       # Deep copy with same params\n```\n\n## ONNX Serialization\n\nAll fitted models can be saved to and loaded from ONNX format, enabling deployment via any ONNX-compatible runtime.\n\n### Installation\n\n```bash\npip install -e \".[onnx]\"\n# or manually:\npip install onnx skl2onnx onnxruntime\n```\n\n### Save and load\n\n```python\nfrom shapley_attribution import MonteCarloShapleyAttribution, save_onnx, load_onnx\n\n# Fit a model\nmc = MonteCarloShapleyAttribution(n_iter=2000, random_state=42)\nmc.fit(journeys, y=conversions)\n\n# Save — available as standalone function or instance method\nsave_onnx(mc, \"mc_shapley.onnx\")\nmc.save_onnx(\"mc_shapley.onnx\")   # equivalent\n\n# Load — returns a fully usable fitted model\nfrom shapley_attribution.base import BaseAttributionModel\nloaded = load_onnx(\"mc_shapley.onnx\")\nloaded = BaseAttributionModel.load_onnx(\"mc_shapley.onnx\")  # equivalent\n\n# Full API is available immediately after loading\nscores  = 
loaded.get_attribution()        # dict: channel → score\narray   = loaded.get_attribution_array()  # numpy array\nmatrix  = loaded.transform(new_journeys)  # per-journey attribution\n```\n\n### What is serialized\n\n| Model type | ONNX graph | Metadata |\n|---|---|---|\n| `MonteCarloShapleyAttribution` | Learned GBM (GradientBoostingClassifier → ONNX via skl2onnx) | channels, attribution scores, hyperparams |\n| `PathShapleyAttribution` | Learned GBM | channels, attribution scores, position_attribution_, hyperparams |\n| `SimplifiedShapleyAttribution` | Identity op (trivial) | channels, attribution scores, hyperparams |\n| `OrderedShapleyAttribution` | Identity op (trivial) | channels, attribution scores, position_attribution_, hyperparams |\n| Heuristic baselines | Identity op (trivial) | channels, attribution scores, hyperparams |\n\nFor GBM-backed models the ONNX file embeds the full value function graph, so you can query the conversion probability model directly via `onnxruntime`:\n\n```python\nimport onnxruntime as rt\nimport numpy as np\n\nsess = rt.InferenceSession(\"mc_shapley.onnx\")\n# binary presence mask — shape (1, n_channels)\nmask = np.array([[1, 0, 1, 0, 0, 0, 1, 0]], dtype=np.float32)\nlabel, proba = sess.run(None, {\"input\": mask})\np_conversion = proba[0, 1]  # P(conversion | channels 0, 2, 6 present)\n```\n\nThis makes it straightforward to deploy the value function to any ONNX-compatible serving stack (ONNX Runtime Server, Triton, MLIR, etc.) without a Python sklearn dependency.\n\n## Conversion Labels\n\nAll models accept an optional `y` parameter with binary conversion labels:\n\n```python\nmodel.fit(journeys, y=conversions)  # 1=converted, 0=not\nmodel.fit(journeys)                 # Legacy mode: all journeys assumed converting\n```\n\nMC Shapley uses these labels to train a conversion model, giving it a significant accuracy advantage over heuristic baselines. 
The heuristic baselines use the labels to attribute credit only to converting journeys.\n\n## Evaluation Metrics\n\n```python\nfrom shapley_attribution.metrics import (\n    normalized_mean_absolute_error,  # NMAE in [0, 2], lower is better\n    rank_correlation,                # Spearman rho in [-1, 1], higher is better\n    top_k_overlap,                   # Fraction of true top-k recovered\n    attribution_summary,             # Compare multiple models at once\n)\n```\n\n## Synthetic Data\n\n```python\nfrom shapley_attribution.datasets import make_attribution_problem\n\njourneys, conversions, ground_truth, channels = make_attribution_problem(\n    n_channels=10,\n    n_journeys=5000,\n    max_journey_length=8,\n    interaction_effects=0.5,   # Pairwise synergy between channels\n    base_conversion_rate=0.3,\n    random_state=42,\n)\n```\n\nThe synthetic generator creates journeys with roughly uniform channel sampling but conversion probability driven by channel presence and pairwise interactions via a logistic model with known coefficients. This rewards models that capture marginal contribution (Shapley) over frequency counting (heuristics).\n\nEnable `directed_interaction_strength \u003e 0` to additionally bake in **ordering effects**: an asymmetric bonus/penalty is applied when channel i appears *before* channel j (`directed_matrix[i, j] ≠ directed_matrix[j, i]`). 
This creates genuine sequential synergies that PathShapleyAttribution is designed to exploit, while set-based models such as MC Shapley remain blind to them.\n\n```python\njourneys, conversions, ground_truth, channels, ordered_ground_truth = make_attribution_problem(\n    n_channels=8,\n    n_journeys=5000,\n    directed_interaction_strength=0.5,   # ordering effects active\n    return_ordered_ground_truth=True,    # oracle path GT via true model\n)\n```\n\n`ordered_ground_truth` is computed by walking each converting journey through the true logistic model and accumulating marginal contributions along the path — the oracle that PathShapleyAttribution aims to recover.\n\n### How ground truth is computed\n\nGround truth is determined entirely before any journeys are generated:\n\n1. **Channel importances** are sampled from a Dirichlet distribution (α=2) and raised to the power 1.3, producing moderate skew — one or two channels are notably more important than others, but not overwhelmingly so.\n2. **Conversion coefficients** are scaled from those importances: `coef[ch] = importance[ch] × n_channels × 3`.\n3. **Pairwise interaction terms** are sampled for every channel pair, adding synergy effects that no heuristic can capture.\n4. **An intercept is calibrated** via root-finding (`scipy.optimize.brentq`) so the overall conversion rate matches `base_conversion_rate`.\n5. **Journeys are generated** with near-uniform channel sampling, so channel frequency carries almost no signal — only which combinations appear drives conversion.\n6. **Conversion labels** are drawn as Bernoulli samples from the logistic model: `P(convert) = σ(intercept + Σcoef·presence + Σinteraction·pair_presence)`.\n\nThe ground truth returned is the **normalized channel importance vector** from step 1 — the \"true\" individual channel weights, independent of interactions. 
This is intentionally slightly different from perfect Shapley values (which would distribute interaction credit across channels), giving a realistic evaluation target.\n\n## Visualization\n\nAll fitted models expose three plot methods directly. Standalone functions are also available for multi-model comparison.\n\n```python\nfrom shapley_attribution import (\n    MonteCarloShapleyAttribution, LinearAttribution,\n    make_attribution_problem,\n    compare_models, plot_performance, plot_journey, plot_journeys_heatmap,\n)\n\njourneys, conversions, ground_truth, channels = make_attribution_problem(\n    n_channels=8, n_journeys=5000, random_state=42\n)\n\nmc = MonteCarloShapleyAttribution(n_iter=2000, random_state=42).fit(journeys, y=conversions)\nlin = LinearAttribution().fit(journeys, y=conversions)\n```\n\n### Per-model attribution bar chart\n\n```python\n# On the model directly — overlays ground truth markers\nmc.plot_attribution(ground_truth=ground_truth, top_k=8)\n```\n\nShows a horizontal bar chart sorted by attribution score. Pass `ground_truth` to overlay ◆ markers for easy comparison.\n\n### Compare multiple models\n\n```python\n# Standalone function — accepts 2+ models or pre-computed arrays\ncompare_models(\n    {\"MC Shapley\": mc, \"Linear\": lin},\n    ground_truth=ground_truth,\n)\n```\n\nRenders a grouped bar chart with one cluster per channel. Accepts either fitted model objects or raw numpy arrays.\n\n### Performance metrics panel\n\n```python\nfrom shapley_attribution.metrics import attribution_summary\n\nresults = attribution_summary(\n    {\"MC Shapley\": mc.get_attribution_array(),\n     \"Linear\": lin.get_attribution_array()},\n    ground_truth,\n)\nplot_performance(results)\n```\n\nThree-panel figure (NMAE / Spearman rank correlation / top-3 overlap). 
The best-performing model is highlighted in each panel.\n\n### Journey sequence diagram\n\n```python\n# On the model directly — boxes are coloured by attribution weight\nmc.plot_journey(journeys[0], converted=bool(conversions[0]))\n\n# Standalone — without a model (plain sequence)\nplot_journey(journeys[0], converted=True)\n```\n\nRenders touchpoints as rounded boxes connected by arrows, with a conversion outcome node at the end. When called on a fitted model, box colour encodes that model's attribution score for each channel.\n\n### Position attribution breakdown (PathShapley only)\n\n```python\npath = PathShapleyAttribution(random_state=42).fit(journeys, y=conversions)\n\n# On the model directly\npath.plot_position_attribution()\n\n# Standalone\nfrom shapley_attribution import plot_position_attribution\nplot_position_attribution(path, top_k=6)\n```\n\nStacked bar chart where each bar is a channel and each stack segment is a journey position (position 1 = first touchpoint, position 2 = second, …). Channels with tall early-position stacks are **upper-funnel** (awareness); channels with tall late-position stacks are **lower-funnel** (conversion drivers). Only available after fitting a `PathShapleyAttribution` model.\n\n### Per-journey attribution heatmap\n\n```python\n# On the model directly\nmc.plot_journeys_heatmap(journeys, conversions=conversions, max_journeys=50)\n\n# Standalone\nplot_journeys_heatmap(mc, journeys, conversions=conversions)\n```\n\nHeatmap of the `transform()` output matrix. Each row is a journey, each column is a channel. Pass `conversions` to show only converting journeys.\n\n---\n\n## Benchmark\n\n```bash\npython benchmarks/benchmark.py\n```\n\nResults use 8 channels, 5000 journeys, 2000 MC iterations, `random_state=42`. Lower NMAE is better; higher rank correlation and top-3 overlap are better.\n\n### Benchmark A — Undirected (standard set-based ground truth)\n\nNo directed interaction effects. 
Ground truth is the normalized Dirichlet channel importance vector.\n\n```\nModel                    NMAE    Rank Corr    Top-3    Time(s)\n--------------------------------------------------------------\nFirst Touch            0.0481       0.7619     0.67     0.004\nLast Touch             0.0446       0.9701     1.00     0.003\nLinear                 0.0457       0.9524     1.00     0.005\nTime Decay             0.0445       0.9762     1.00     0.006\nPosition Based         0.0461       0.9762     0.67     0.005\nSimplified Shapley     0.0455       1.0000     1.00     0.006\nMC Shapley             0.0221       0.9762     1.00     0.496\nPath Shapley           0.0195       0.9762     0.67     0.471  ← best NMAE\n```\n\nPath Shapley achieves the lowest NMAE even on undirected data — the path accumulation provides a useful inductive bias that captures ordering effects present in real journeys even without explicit directed coefficients. MC Shapley has the edge on top-3 recovery (1.00 vs 0.67), which reflects that Path Shapley's asymmetric treatment of channels can slightly misrank channels that happen to be consistently upstream.\n\n### Benchmark B — Directed (ordered ground truth, `directed_interaction_strength=0.6`)\n\nAsymmetric channel interactions baked into the data generator. Ground truth is the oracle path-based marginal contribution computed by walking converting journeys through the true logistic model. 
This is the evaluation target that PathShapley is designed to recover.\n\n```\nModel                    NMAE    Rank Corr    Top-3    Time(s)\n--------------------------------------------------------------\nFirst Touch            0.0474       0.2619     0.33     0.004\nLast Touch             0.0404       0.7306     0.67     0.003\nLinear                 0.0427       0.7143     0.67     0.005\nTime Decay             0.0407       0.6905     0.67     0.006\nPosition Based         0.0429       0.6667     0.67     0.005\nSimplified Shapley     0.0423       0.7619     0.67     0.006\nMC Shapley             0.0503       0.5952     0.67     0.522  ← worst NMAE\nPath Shapley           0.0399       0.6667     0.67     0.459  ← best NMAE\n```\n\nWith directed interactions active, MC Shapley is blind to channel ordering and performs worse than all heuristic baselines on NMAE. Path Shapley has the lowest NMAE, although its margin over the next-best baseline (Last Touch, 0.0404) is small and it is roughly two orders of magnitude slower (0.459 s vs 0.003 s). Its rank correlation (0.6667) does not improve on the best baselines (Simplified Shapley 0.7619, Last Touch 0.7306), which suggests that heuristics partially capture ordering effects through their position biases.\n\n**Rule of thumb:** use Path Shapley when you suspect that upper-funnel channels (display, social, video) systematically prime customers for lower-funnel conversions (search, email, retargeting), and you want that sequential credit reflected in the attribution. 
Use MC Shapley when channel ordering is noise and you want strict Shapley-axiom compliance.\n\n## Tests\n\n```bash\npytest tests/ -v\n```\n\nTests covering sklearn API compliance, attribution correctness, MC convergence, PathShapley ordering sensitivity, directed data generation, input validation, the synthetic dataset generator, and ONNX round-trip serialization for all model types.\n\n## Project Structure\n\n```\nshapley_attribution/\n├── __init__.py                   # Public API\n├── base.py                       # BaseAttributionModel (sklearn mixin + plot + ONNX methods)\n├── onnx.py                       # save_onnx() / load_onnx() — ONNX serialization\n├── models/\n│   ├── simplified.py             # Exact set-based Shapley\n│   ├── ordered.py                # Exact position-aware Shapley\n│   ├── monte_carlo.py            # Approximate Shapley (GBM + MC permutation sampling)\n│   └── path_shapley.py           # Path Shapley (GBM + actual journey sequence)\n├── baselines/\n│   └── heuristic.py              # First/Last Touch, Linear, Time Decay, Position\n├── datasets/\n│   └── synthetic.py              # make_attribution_problem() (+ directed interactions)\n├── metrics/\n│   └── evaluation.py             # NMAE, rank correlation, top-k overlap\n└── visualization/\n    └── plots.py                  # plot_attribution, compare_models, plot_performance,\n                                  # plot_journey, plot_journeys_heatmap,\n                                  # plot_position_attribution\n```\n\n## Roadmap\n\n- [x] ONNX export/import (`save_onnx` / `load_onnx`) with onnxruntime inference support\n- [ ] GPU acceleration (PyTorch/CuPy) for 100k+ journeys\n- [ ] Distributed computing support (Ray/Dask) for multi-machine scaling\n- [ ] Comprehensive tests on larger datasets (\u003e100k journeys)\n- [ ] Additional baseline models (Markov, first-order attribution chains)\n- [ ] Custom coalition value functions (user-provided models)\n- [ ] Interactive visualization 
dashboard\n\n## Related Libraries\n\n- **[SHAP](https://github.com/shap/shap)** — General-purpose Shapley value library for model interpretation (KernelSHAP, TreeSHAP, DeepSHAP)\n- **[Alibi](https://github.com/SeldonIO/alibi)** — Model-agnostic explainability and fairness (includes Shapley approximations)\n- **[ELI5](https://github.com/eli5-org/eli5)** — Model interpretation library with permutation importance\n- **[Captum](https://github.com/pytorch/captum)** — PyTorch model interpretability library (supports Shapley values)\n- **[MultiTouch](https://github.com/AnalyticsEnthusiasts/MultiTouchAttribution)** — Another multi-touch attribution library (heuristics only)\n\n## References\n\nThe following papers directly inform the algorithms in this library:\n\n**Foundation**\n- Shapley, L. S. (1953). A Value for n-Person Games. In *Contributions to the Theory of Games II*, Annals of Mathematics Studies, vol. 28. Princeton University Press.\n  _The original cooperative game theory paper that defines the Shapley value axiomatically. All models in this library compute or approximate this quantity._\n\n**Attribution-specific Shapley**\n- Zhao, K., Mahboobi, S. H., \u0026 Bagheri, S. R. (2018). [Shapley Value Methods for Attribution Modeling in Online Advertising](https://arxiv.org/abs/1804.05327). arXiv:1804.05327.\n  _Direct inspiration for this library. Introduces set-based and ordered Shapley variants for the multi-touch attribution problem. Our `SimplifiedShapleyAttribution` and `OrderedShapleyAttribution` implement these directly._\n\n**Monte Carlo sampling**\n- Castro, J., Gómez, D., \u0026 Tejada, J. (2009). Polynomial calculation of the Shapley value based on sampling. *Computers \u0026 Operations Research*, 36(5), 1726–1730.\n  _Introduces the ApproShapley algorithm: estimate Shapley values by averaging marginal contributions over random permutations. 
This is the sampling backbone of `MonteCarloShapleyAttribution`._\n\n**Interventional Shapley \u0026 learned value functions**\n- Lundberg, S. M., \u0026 Lee, S.-I. (2017). [A Unified Approach to Interpreting Model Predictions](https://arxiv.org/abs/1705.07874). In *Advances in Neural Information Processing Systems* (NeurIPS).\n  _Introduces KernelSHAP: using a learned model as the coalition value function and sampling over coalitions. Our MC Shapley model adopts this approach (GBM as the value function)._\n- Janzing, D., Minorics, L., \u0026 Blöbaum, P. (2020). [Feature relevance quantification in explainable AI: A causal problem](https://arxiv.org/abs/1910.13413). In *Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics* (AISTATS).\n  _Distinguishes interventional Shapley (v(S) = f(binary mask)) from observational Shapley (v(S) = E[f | X_S]). We use the interventional formulation — deterministic, no background averaging — which eliminates variance and improves stability._\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fianchute%2Fshapley-attribution","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fianchute%2Fshapley-attribution","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fianchute%2Fshapley-attribution/lists"}