https://github.com/jennalandy/gridsemble_paper

Host: GitHub
URL: https://github.com/jennalandy/gridsemble_paper
Owner: jennalandy
Created: 2024-01-22T17:06:18.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-01-24T01:50:17.000Z (over 1 year ago)
Last Synced: 2025-01-17T22:25:41.900Z (5 months ago)
Language: R
Size: 283 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Gridsemble: Selective Ensembling for False Discovery Rates

For the `gridsemblefdr` R software package, see the [jennalandy/gridsemblefdr](https://github.com/jennalandy/gridsemblefdr) repository.

This repository contains the code to replicate all results reported in the paper [*Gridsemble: Selective Ensembling for False Discovery Rates*](https://arxiv.org/abs/2401.12865). See details in the Simulation Study and Experimental Application sections of our paper.

### Simulation Studies

Our simulation studies are implemented as R scripts. When run, each script logs its progress and results to a new sub-directory. The scripts assume that your working directory is `simulation_studies`. We use [`sink`](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/sink) to log progress; if you terminate a script early, you will need to run `sink()` before output shows up in the console again. Note that these scripts take many hours to run.
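
For readers unfamiliar with `sink`, the following is a minimal sketch of the logging pattern and of how to restore console output after an interrupted run. The helper `close_sinks` and the temporary log file are illustrative assumptions, not code from this repository.

```r
# Hypothetical helper: close output diversions until `target` remain open.
close_sinks <- function(target = 0) {
  while (sink.number() > target) sink()
  invisible(sink.number())
}

depth <- sink.number()                   # diversions open before we start
log_file <- tempfile(fileext = ".log")   # illustrative log location

sink(log_file)                # divert console output to the log file
cat("iteration 1 done\n")     # written to the file, not the console
# ... imagine the script is interrupted here, before it calls sink() ...

close_sinks(depth)            # restore output to where it was before
logged <- readLines(log_file)
print(logged)
```

In an interactive session where a script was stopped early, calling `close_sinks()` with its default `target = 0` (or simply `sink()` once per open diversion) returns all output to the console.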

**Simulation studies presented in Figure 1**

These tests compare `gridsemble`, partial implementations, and benchmarks on Symmetric, Asymmetric, and Curated Ovarian Data-Based (COD-Based) simulation studies.

- `symmetric_test.R`
- `asymmetric_test.R`
- `cod_based_test.R`

**Simulation studies presented in Figure 3**

These tests compare `gridsemble` and `ensemble` with varying numbers of synthetic datasets and varying model sizes in the Symmetric and Asymmetric simulation studies.

- `inc_n_symmetric.R`
- `inc_n_asymmetric.R`

**Simulation studies presented in Supplementary Figure 2**

These tests compare `gridsemble`, partial implementations, and benchmarks on Symmetric, Asymmetric, and COD-Based simulation studies using random search in place of grid search.

These scripts can only be run after their grid search counterparts.

- `symmetric_test_random.R`
- `asymmetric_test_random.R`
- `cod_based_test_random.R`
- `get_random_grid.R`: functions to construct grids for a random search.
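
To illustrate the difference between the two search strategies, here is a minimal sketch that builds a full grid and a random-search counterpart of the same size. The hyperparameter names `pct0` and `df`, their ranges, and the sampling scheme are assumptions chosen for illustration; `get_random_grid.R` may construct its grids differently.

```r
set.seed(1)  # make the random draw reproducible

# Hypothetical hyperparameter values; names and ranges are illustrative only.
full_grid <- expand.grid(
  pct0 = c(0.25, 0.50, 0.75),
  df   = c(5, 7, 10),
  stringsAsFactors = FALSE
)

# Grid search evaluates every row of full_grid. Random search instead draws
# each hyperparameter at random from the same range, here using as many
# candidate models as the grid has rows.
n_models <- nrow(full_grid)
random_grid <- data.frame(
  pct0 = runif(n_models, min = 0.25, max = 0.75),
  df   = sample(5:10, n_models, replace = TRUE)
)

print(full_grid)
print(random_grid)
```

Matching the number of candidate models keeps the two strategies comparable in computational cost.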

**Scripts sourced by the above**

- `test.R`: defines wrapper functions to run a simulation study given a data-generating function.
- `evaluate.R`: functions to compute metrics given fdr estimates and ground truth.
- `simulate.R`: functions to simulate each type of data.
- `utils.R`: other utility functions.
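
As a rough idea of what computing metrics from fdr estimates and ground truth can look like, consider the sketch below. The function `sketch_metrics`, its arguments, the 0.2 cutoff, and the choice of metrics are all assumptions for illustration; they are not the functions defined in `evaluate.R`.

```r
# Hypothetical helper, not the repository's evaluate.R.
# fdr:        estimated local false discovery rates, one per test
# is_nonnull: logical ground truth, TRUE where the test is truly non-null
# cutoff:     tests with fdr <= cutoff are called discoveries (assumed value)
sketch_metrics <- function(fdr, is_nonnull, cutoff = 0.2) {
  stopifnot(length(fdr) == length(is_nonnull))
  called <- fdr <= cutoff
  n_called <- sum(called)
  list(
    n_called = n_called,
    # false discovery proportion among called tests (0 if nothing is called)
    fdp = if (n_called > 0) sum(called & !is_nonnull) / n_called else 0,
    # share of truly non-null tests that were called (NaN if there are none)
    sensitivity = sum(called & is_nonnull) / sum(is_nonnull),
    # Brier score, treating 1 - fdr as the estimated non-null probability
    brier = mean((1 - fdr - is_nonnull)^2)
  )
}

fdr_hat <- c(0.05, 0.10, 0.90, 0.15, 0.80)
truth   <- c(TRUE, TRUE, FALSE, FALSE, TRUE)
m <- sketch_metrics(fdr_hat, truth)
str(m)
```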

### Experimental Application

Our experimental application relies on the Platinum Spike dataset [1]. We use [Quarto](https://quarto.org/) documents, which can be edited and run with RStudio, Jupyter Lab, or Visual Studio Code.

**Notebooks**

- [`PAPER_platinum_data`](https://github.com/jennalandy/gridsemble_PAPER/blob/main/experimental/PAPER_platinum_data.pdf): download and pre-process Platinum Spike data. This needs to be run before either of the analysis documents.
- [`PAPER_platinum_run_subsets`](https://github.com/jennalandy/gridsemble_PAPER/blob/main/experimental/PAPER_platinum_run_subsets.pdf): analyses on subsets of Platinum Spike data with $\pi_0 \in [0.6, 0.95]$, used to create Figure 2.
- [`PAPER_platinum_run_all_data`](https://github.com/jennalandy/gridsemble_PAPER/blob/main/experimental/PAPER_platinum_run_all_data.pdf): analyses on the full Platinum Spike dataset, used to create Supplementary Figure 3 and Supplementary Table 2.

**Other**

- `PAPER_metrics_helpers.R`: functions to calculate metrics and helper functions.

### References

[1] Q. Zhu, J.C. Miecznikowski, and M.S. Halfon. Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset. *BMC Bioinformatics*, 11:285, 2010. https://doi.org/10.1186/1471-2105-11-285
