{"id":19448459,"url":"https://github.com/neurodata/causal_batch","last_synced_at":"2025-04-25T02:31:12.381Z","repository":{"id":218552518,"uuid":"746164028","full_name":"neurodata/causal_batch","owner":"neurodata","description":null,"archived":false,"fork":false,"pushed_at":"2024-11-02T08:20:58.000Z","size":10098,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-11-02T09:19:40.061Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/neurodata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"citation.bib","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-21T08:56:07.000Z","updated_at":"2024-11-02T08:21:02.000Z","dependencies_parsed_at":"2024-02-21T22:29:45.375Z","dependency_job_id":"dbada8c1-b1cb-4efe-a729-165009027fe4","html_url":"https://github.com/neurodata/causal_batch","commit_stats":null,"previous_names":["neurodata/causal_batch"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurodata%2Fcausal_batch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurodata%2Fcausal_batch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurodata%2Fcausal_batch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/neurodata%2Fcausal_batch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/neurodata","download_url":"https://codeload.gith
ub.com/neurodata/causal_batch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223978481,"owners_count":17235182,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T16:26:53.326Z","updated_at":"2025-04-25T02:31:12.373Z","avatar_url":"https://github.com/neurodata.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Causal Effect Detection and Correction\n\n[![arXiv shield](https://img.shields.io/badge/arXiv-2307.13868-red.svg?style=flat)](https://arxiv.org/abs/2307.13868)\n[![imaging neuro shield](https://img.shields.io/badge/ImagingNeuro-imag_a_00458-green.svg?style=flat)](https://direct.mit.edu/imag/article/doi/10.1162/imag_a_00458/127407)\n[![](https://cranlogs.r-pkg.org/badges/causalBatch)](https://cran.rstudio.com/web/packages/causalBatch/index.html)\n\n## Contents\n\n- [Overview](#overview)\n- [Repo Contents](#repo-contents)\n- [System Requirements](#system-requirements)\n- [Installation Guide](#installation-guide)\n- [Demo](#demo)\n- [Results and figure reproduction](#results-and-figure-reproduction)\n- [License](./LICENSE)\n- [Issues](https://github.com/ebridge2/causal_batch/issues)\n- [Citation](#citation)\n\n\n# Overview\n\nBatch effects, undesirable sources of variance across multiple experiments, present significant challenges for scientific and clinical discoveries. Specifically, batch effects can (i) produce spurious signals and/or (ii) obscure genuine signals, contributing to the ongoing reproducibility crisis. 
Typically, batch effects are modeled as classical, rather than causal, statistical effects. This model choice renders the methods unable to differentiate between biological or experimental sources of variability, leading to unnecessary false positive and negative effect detections and over-confidence. We formalize batch effects as causal effects to address these concerns, and augment existing batch effect detection and correction approaches with causal machinery. Simulations illustrate that our causal approaches mitigate spurious findings and reveal otherwise obscured signals as compared to non-causal approaches. Applying our causal methods to a large neuroimaging mega-study reveals instances where prior art confidently asserts that the data do not support the presence of batch effects when we expect to detect them. On the other hand, our causal methods correctly discern that there exists irreducible confounding in the data, so it is unclear whether differences are due to batches or not. This work therefore provides a framework for understanding the potential capabilities and limitations of analysis of multi-site data using causal machinery.\n\n# Repo Contents\n\n- [R](./R): `R` package code.\n- [docs](./docs): usage of the `causalBatch` package on many real and simulated data examples for scientific articles.\n- [man](./man): package manual for help in R session.\n- [tests](./tests): `R` unit tests written using the `testthat` package.\n- [vignettes](./vignettes): `R` vignettes for R session html help pages.\n\n\n# System Requirements\n\n## Hardware Requirements\n\nThe `causalBatch` package requires only a standard computer with enough RAM to support the operations defined by a user. For minimal performance, this will be a computer with about 2 GB of RAM. 
For optimal performance, we recommend a computer with the following specs:\n\nRAM: 16+ GB  \nCPU: 4+ cores, 3+ GHz/core\n\nThe runtimes below are generated using a computer with the recommended specs (16 GB RAM, 4 cores@3 GHz) and an internet connection speed of 100 Mbps.\n\n## Software Requirements\n\n### OS Requirements\n\nThe development version of the package is tested on *Mac* operating systems. It has been tested on the following systems:\n\nLinux: \nMac OSX:  Ventura 13.1\nWindows:  \n\nBefore setting up the `causalBatch` package, users should have `R` version 4.2.0 or higher, and several packages set up from CRAN.\n\n# Installation Guide\n\n## Stable Release\n\nThe stable release of the package is available on CRAN, and can be installed from `R` as:\n\n```\ninstall.packages('causalBatch')\n```\n\n## Development Version\n\n### Package dependencies\n\nUsers should install the following packages prior to installing `causalBatch`, from an `R` terminal:\n\n```\ninstall.packages(c('cdcsis', 'MatchIt', 'nnet', 'dplyr', 'tidyverse', 'magrittr'))\n\nif (!require(\"BiocManager\", quietly = TRUE))\n    install.packages(\"BiocManager\")\nBiocManager::install(version = \"3.18\")\n\nBiocManager::install(\"sva\")\n```\n\nwhich will install in about 1 minute on a machine with the recommended specs.\n\nThe `causalBatch` package functions with all packages in their latest versions as they appear on `CRAN` on January 22, 2024. The versions of software are, specifically:\n\n```\nsva=3.50.0\ncdcsis=2.0.3\ntidyverse=2.0.0\ndplyr=1.1.4\nMatchIt=4.5.5\nnnet=7.3.19\nmagrittr=2.0.3\n```\n\nIf you encounter a problem that you believe is tied to software versioning, please open an [Issue](https://github.com/neurodata/causal_batch/issues). 
\n\n### Package Installation\n\nFrom an `R` session, type:\n\n```\nrequire(devtools)\n\n# install causalBatch with the vignettes\ninstall_github('neurodata/causal_batch', build_vignettes=TRUE, force=TRUE)\n\nrequire(causalBatch)\n# view one of the basic vignettes\nvignette(\"causal_simulations\", package=\"causalBatch\") \n```\n\nThe package should take approximately 40 seconds to install with vignettes on a recommended computer. \n\n# Demo\n\nFor interactive demos of the functions, please check out the vignettes built into the package. They can be accessed as follows:\n\n```\nrequire(causalBatch)\nvignette(\"causal_simulations\", package=\"causalBatch\")\nvignette(\"causal_balancing\", package=\"causalBatch\")\nvignette(\"causal_cdcorr\", package=\"causalBatch\")\nvignette(\"causal_ccombat\", package=\"causalBatch\")\n```\n\n# Results and figure reproduction\n\nSee [Batch Effects Paper](https://github.com/neurodata/causal_batch/tree/main/docs/batch_effects_paper) for instructions to reproduce figures from Bridgeford et al. (2025). \n\n# Citation\n\nFor usage of the package and associated manuscript, please cite according to the enclosed [citation.bib](./citation.bib).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurodata%2Fcausal_batch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fneurodata%2Fcausal_batch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fneurodata%2Fcausal_batch/lists"}