{"id":29704223,"url":"https://github.com/adaptinfer/deathbyroundnumbers","last_synced_at":"2025-07-23T14:07:16.113Z","repository":{"id":114003257,"uuid":"502735917","full_name":"AdaptInfer/DeathByRoundNumbers","owner":"AdaptInfer","description":"Glass-box ML reveals biases in medical practice at round number thresholds","archived":false,"fork":false,"pushed_at":"2023-02-23T18:15:09.000Z","size":8380,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-06-04T10:53:29.891Z","etag":null,"topics":["ehr","interpretable-machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AdaptInfer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-06-12T21:51:15.000Z","updated_at":"2024-11-18T16:10:11.000Z","dependencies_parsed_at":null,"dependency_job_id":"49570798-b1fe-4f08-8219-b2ca557373e3","html_url":"https://github.com/AdaptInfer/DeathByRoundNumbers","commit_stats":null,"previous_names":["lengerichlab/deathbyroundnumbers","blengerich/deathbyroundnumbers","adaptinfer/deathbyroundnumbers"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AdaptInfer/DeathByRoundNumbers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdaptInfer%2FDeathByRoundNumbers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdaptInfer%2FDeathByRoundNumbers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Adap
tInfer%2FDeathByRoundNumbers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdaptInfer%2FDeathByRoundNumbers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AdaptInfer","download_url":"https://codeload.github.com/AdaptInfer/DeathByRoundNumbers/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdaptInfer%2FDeathByRoundNumbers/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266691580,"owners_count":23969182,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ehr","interpretable-machine-learning"],"created_at":"2025-07-23T14:07:15.402Z","updated_at":"2025-07-23T14:07:16.103Z","avatar_url":"https://github.com/AdaptInfer.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [Death by Round Numbers: Glass-Box Machine Learning Uncovers Biases in Medical Practice](https://www.medrxiv.org/content/10.1101/2022.04.30.22274520v2)\n\nReal-world evidence is confounded by treatments, so data-driven systems can learn to recapitulate biases that influenced treatment decisions. 
This confounding presents a challenge: uninterpretable black-box systems can put patients at risk by confusing treatment benefits with intrinsic risk, but also an opportunity: interpretable “glass-box” models can improve medical practice by highlighting unexpected patterns which suggest biases in medical practice.\n\nThis repo contains examples of how to find these statistical artifacts and biases in a pneumonia dataset and in MIMIC-II, MIMIC-III, and MIMIC-IV.\n\n## Automated Search for Statistical Artifacts\n\nThe examples make use of two automated tools:\n- `find_and_plot_discontinuities`, which automatically finds discontinuous effects in your data.\n- `find_and_plot_non_monotonicities`, which automatically finds non-monotone effects in your data.\n\nThe first tool is summarized in (B) below, and the second tool is summarized in (C) below:\n![Preview](images/model_and_tests.png)\n\nBoth of these tools are available in the [ebm_utils](https://github.com/blengerich/ebm_utils) package. The package can be installed via `pip install git+https://github.com/blengerich/ebm_utils`, and the tools are located in `ebm_utils.analysis.changepoints`.\n\n\n\n## Citing\n\nIf you use these ideas, code, or results, please cite:\n```\n@article{lengerich2022death,\n  title={Death by Round Numbers: Glass-Box Machine Learning Uncovers Biases in Medical Practice},\n  author={Lengerich, Benjamin J and Caruana, Rich and Nunnally, Mark E and Kellis, Manolis},\n  journal={medRxiv},\n  year={2022},\n  publisher={Cold Spring Harbor Laboratory Press}\n}\n```\nThe manuscript is currently available on [medRxiv](https://www.medrxiv.org/content/10.1101/2022.04.30.22274520v2).\n\n![Preview](images/Figure1.png)\nFigure 1: Confounding effects are treacherous to data-driven risk models, but confounding effects that are revealed by\nglass-box models can be useful by suggesting potential improvements in medicine. (A) Underlying “treatment effects”\nconfound risk models. 
Causal arrows are filled, observed variables are shown in gray ovals and unobserved variables\nin white boxes. Data-driven analyses often estimate P(Outcome|Biomarker), but this is only a faithful surrogate\nfor P(Outcome|Underlying Risk) if treatments have negligible impacts. In reality, treatments (broadly interpreted, including monitoring, therapeutics, diagnostics, and patient behavior) have large impacts on outcomes. When\nexplicitly analyzed for treatment effectiveness, effects of randomly assigned treatments are desirable as evidence of a\nproposed treatment being effective, but strong treatment effects confound estimation of risk. To build models which\ncan effectively guide treatment decisions, we require intelligible models and medical domain knowledge to understand\nif all relevant confounders have been sufficiently corrected. (B) An example of strong, but useful, confounding: the\nmortality risk of pneumonia patients falls with extremely high levels of serum creatinine (which indicates kidney failure), even after correcting for other risk factors in a multivariable predictive model. This counter-causal relationship\nsuggests confounding, and the sharp inflections at the round numbers of 3 mg/dL and 5 mg/dL (denoted by black vertical lines) suggest that this association is guided by discrete treatment thresholds rather than smooth biomedical risk\nfactors. While this confounding between risk factor and clinical decisions is a challenge for data-driven analysis, it is\nalso an opportunity because the unexpected inflections alert us to possibilities of optimizing treatment.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadaptinfer%2Fdeathbyroundnumbers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadaptinfer%2Fdeathbyroundnumbers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadaptinfer%2Fdeathbyroundnumbers/lists"}