{"id":16536602,"url":"https://github.com/tillahoffmann/summaries2","last_synced_at":"2026-06-23T13:33:06.770Z","repository":{"id":176529976,"uuid":"658134264","full_name":"tillahoffmann/summaries2","owner":"tillahoffmann","description":null,"archived":false,"fork":false,"pushed_at":"2025-10-20T19:15:09.000Z","size":539,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-05T08:39:18.820Z","etag":null,"topics":["approximate-bayesian-computation","likelihood-free-inference","simulation-based-inference","summary-statistics"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tillahoffmann.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-06-24T22:00:37.000Z","updated_at":"2025-10-20T19:15:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"c616df70-2212-431d-9d24-19a745dd2a93","html_url":"https://github.com/tillahoffmann/summaries2","commit_stats":null,"previous_names":["tillahoffmann/summaries2"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tillahoffmann/summaries2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tillahoffmann%2Fsummaries2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tillahoffmann%2Fsummaries2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/tillahoffmann%2Fsummaries2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tillahoffmann%2Fsummaries2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tillahoffmann","download_url":"https://codeload.github.com/tillahoffmann/summaries2/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tillahoffmann%2Fsummaries2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34691890,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-23T02:00:07.161Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["approximate-bayesian-computation","likelihood-free-inference","simulation-based-inference","summary-statistics"],"created_at":"2024-10-11T18:32:19.554Z","updated_at":"2026-06-23T13:33:06.765Z","avatar_url":"https://github.com/tillahoffmann.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Minimizing the Expected Posterior Entropy Yields Optimal Summaries [![](https://github.com/onnela-lab/summaries/actions/workflows/main.yaml/badge.svg)](https://github.com/onnela-lab/summaries/actions/workflows/main.yaml)\n\nThis repository contains code and data to reproduce the results presented in the manuscript [*Unifying Summary Statistic Selection for Approximate Bayesian 
Computation*](https://doi.org/10.48550/arXiv.2206.02340).\n\nFigures and tables can be regenerated by executing the following steps:\n\n- Ensure a recent Python version is installed; this code has been tested with Python 3.12 on Ubuntu and macOS.\n- Install [uv](https://docs.astral.sh/uv/) if not already available on your system.\n- Install the Python dependencies by executing `uv sync` from the root directory of the repository.\n- Install [CmdStan](https://mc-stan.org/users/interfaces/cmdstan) by executing `uv run python -m cmdstanpy.install_cmdstan --version 2.36.0`. Other recent versions of CmdStan may also work but have not been tested.\n- Optionally, verify the installation by executing `uv run pytest -v`.\n- Execute `uv run cook exec \"*:evaluation\"` which will run all experiments and generate evaluation metrics which are saved at `workspace/[experiment name]/evaluation.csv`.\n- Execute each of the Jupyter notebooks (saved as markdown files) in the `notebooks` folder to generate the figures.\n\nResults Structure\n-----------------\n\nAfter running the experiments (see above), the `workspace` folder contains all results. It is structured as follows, and the folder structure is repeated for each experiment.\n\n```python\nbenchmark-large  # One folder for each experiment.\n    data  # Train, validation, and test split as pickle files; other temp files may also be present.\n        test.pkl\n        train.pkl\n        validation.pkl\n        ...\n    samples  # (Approximate) posterior samples as pickle files.\n        [sampler configuration name].pkl\n        ...\n    transformers  # Trained transformers, e.g., posterior mean estimators, as pickle files.\n        [transformer configuration name]-[digits].pkl  # One of three replications with diff. 
seeds.\n        [transformer configuration name].pkl  # Best transformer amongst the three replications.\n    evaluation.csv  # Evaluation of different summary statistic extraction methods.\nbenchmark-small\n    ...\ncoalescent\n    ...\ntree-large\n    ...\ntree-small\n    ...\nfigures  # Contains PDF figures after executing notebooks.\n```\n\nEach `evaluation.csv` file has seven columns:\n- `path`, which identifies the method used to extract summaries.\n- three columns `{nlp,rmise,mise}`, which are estimates of the negative log probability loss, root mean integrated squared error, and mean integrated squared error, respectively. Each estimate is the average over all samples in the corresponding test set.\n- three columns `{nlp,rmise,mise}_err`, which are the corresponding standard errors, computed as `sqrt(var / (n - 1))`, where `var` is the variance of the metric in the test set and `n` is the size of the test set.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftillahoffmann%2Fsummaries2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftillahoffmann%2Fsummaries2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftillahoffmann%2Fsummaries2/lists"}