* Multinomial Propensity Score Trimming

[[./src/title_figure.png]]

This is the code repository for our upcoming paper on propensity score (PS) trimming in settings with multiple (3 or more) treatment groups. In this paper, we propose definitions that extend three existing PS trimming methods ([[https://www.jstor.org/stable/27798811][Crump et al. Biometrika 2009;96:187]], [[https://www.ncbi.nlm.nih.gov/pubmed/20716704][Stürmer et al. Am J Epidemiol 2010;172:843]], and [[https://www.dovepress.com/a-tool-for-assessing-the-feasibility-of-comparative-effectiveness-rese-peer-reviewed-article-CER][Walker et al. Comp Eff Res 2013;3:11]]) to multinomial exposures.
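
As a rough illustration only, the snippet below sketches what trimming on a multinomial PS can look like, using a symmetric Crump-style rule (retain subjects whose estimated PS for every group exceeds a common cutoff). The function, the =delta= cutoff, and the use of =nnet::multinom= are illustrative assumptions, not the paper's implementation; the actual methods are provided by the =trim3= package (see Installation below).
#+BEGIN_SRC R
## Illustrative sketch only: a symmetric Crump-style trimming rule for
## K = 3 treatment groups. Not the paper's implementation.
library(nnet)  # multinom() for multinomial PS estimation

trim_multinomial <- function(data, ps_formula, delta = 0.1) {
    fit <- nnet::multinom(ps_formula, data = data, trace = FALSE)
    ps  <- fitted(fit)                  # n x K matrix of estimated PS
    keep <- apply(ps, 1, min) >= delta  # keep the common support region
    data[keep, , drop = FALSE]
}
#+END_SRC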

* Related resources
- Paper: in press
- ICPE 2018 poster: [[https://github.com/kaz-yos/icpe-2018-org-mode-poster]]
- Web app: https://kaz-yos.shinyapps.io/shiny_trim_ternary/
- Web app source code: https://github.com/kaz-yos/shiny-trim-ternary

* Files and folders in this repo
- =*.R=: Main R script files for generating simulation data, analyzing data, and reporting results. Executing each file generates a plain-text report file named =*.R.txt= under =log/=.
- =*_o2.sh=: Example shell scripts for the SLURM batch job system on Linux. These are designed specifically for Harvard Medical School's O2 cluster and are not expected to work elsewhere without modification.
- =data/=: Folder for simulation data. Due to file size limits, only the summary file required by =06_assess_results.R= is kept.
- =log/=: Folder for log files.
- =out/=: Folder for figure PDFs.

* Simulation
** Installation
Two custom R packages must be installed before running the R scripts provided in this repository.
#+BEGIN_SRC R
## Install devtools (if you do not have it already)
install.packages("devtools")
## Install the data generation package
devtools::install_github(repo = "kaz-yos/datagen3")
## Install the simulation package
devtools::install_github(repo = "kaz-yos/trim3")
#+END_SRC
Additionally, the packages =doParallel=, =doRNG=, =grid=, =gtable=, and =tidyverse= are required by the scripts.
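
Since =grid= ships with base R, the remaining CRAN dependencies can be installed in one call:
#+BEGIN_SRC R
## Install the CRAN dependencies (grid is part of base R)
install.packages(c("doParallel", "doRNG", "gtable", "tidyverse"))
#+END_SRC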

** Data generation
Running the following will generate raw data files under the =data/= folder using 8 cores.
#+BEGIN_SRC sh
Rscript ./01_generate_data.R 8
#+END_SRC

If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
#+BEGIN_SRC sh
sh ./01_generate_data_o2.sh
#+END_SRC
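
The core count passed on the command line suggests a doParallel/doRNG setup; the sketch below shows that general pattern (the seed and loop body are hypothetical, not the repository's code).
#+BEGIN_SRC R
## General pattern for reproducible parallel simulation (sketch only).
library(doParallel)
library(doRNG)

n_cores <- as.integer(commandArgs(trailingOnly = TRUE)[1])  # e.g., 8
registerDoParallel(cores = n_cores)
set.seed(20180000)  # hypothetical seed; %dorng% makes the loop reproducible

datasets <- foreach(i = 1:100) %dorng% {
    rnorm(10)  # placeholder for generating one simulated dataset
}
#+END_SRC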

** Data preparation
Running the following will process the specified data file (PS estimation and trimming) and generate a new file with the same name, except that =raw= changes to =prepared=.
#+BEGIN_SRC sh
Rscript ./02_prepare_data.R ./data/scenario_raw001_part001_r50.RData 8
#+END_SRC

If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script. This will dispatch a SLURM job for each file.
#+BEGIN_SRC sh
sh ./02_prepare_data_o2.sh ./data/scenario_raw*
#+END_SRC
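
The output filename convention described above amounts to a simple substitution (a sketch; the script's actual logic may differ):
#+BEGIN_SRC R
## Derive the output name by substituting "raw" with "prepared" (sketch).
in_file  <- "./data/scenario_raw001_part001_r50.RData"
out_file <- sub("raw", "prepared", in_file, fixed = TRUE)
out_file  # "./data/scenario_prepared001_part001_r50.RData"
#+END_SRC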

** Data analysis
Running the following will analyze the specified data file (outcome model estimation) and generate a new file with the same name, except that =prepared= changes to =analyzed=.
#+BEGIN_SRC sh
Rscript ./03_analyze_data.R ./data/scenario_prepared001_part001_r50.RData 8
#+END_SRC

If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script. This will dispatch a SLURM job for each file.
#+BEGIN_SRC sh
sh ./03_analyze_data_o2.sh ./data/scenario_prepared*
#+END_SRC
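
For intuition, the sketch below shows a generic IPTW-style estimate of group means after trimming; the estimators actually used are implemented in the =trim3= package, and the names here (=Y=, =A=, =ps_mat=) are hypothetical.
#+BEGIN_SRC R
## Generic IPTW sketch for K = 3 groups (not the repository's estimator).
## Y: outcome; A: treatment group in {1, 2, 3}; ps_mat: n x 3 PS matrix.
iptw_means <- function(Y, A, ps_mat) {
    w <- 1 / ps_mat[cbind(seq_along(A), A)]  # 1 / PS of the received group
    sapply(1:3, function(k) weighted.mean(Y[A == k], w[A == k]))
}
#+END_SRC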

** Result aggregation
Running the following will aggregate all the analyzed data files under =data/= and generate a single new file named =all_analysis_results.RData=.
#+BEGIN_SRC sh
Rscript ./04_aggregate_results.R 1
#+END_SRC

If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
#+BEGIN_SRC sh
sh ./04_aggregate_results_o2.sh
#+END_SRC
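
The aggregation step presumably loads every analyzed file and stacks the results; a base-R sketch (the file pattern and object handling are assumptions):
#+BEGIN_SRC R
## Sketch: load each analyzed file and stack the results (assumption:
## one results object per file; the actual logic is in 04_aggregate_results.R).
files <- list.files("./data", pattern = "^scenario_analyzed.*\\.RData$",
                    full.names = TRUE)
results <- lapply(files, function(f) {
    e <- new.env()
    load(f, envir = e)        # load into a clean environment
    get(ls(e)[1], envir = e)  # extract the single stored object
})
all_results <- do.call(rbind, results)
save(all_results, file = "./data/all_analysis_results.RData")
#+END_SRC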

** Result summarization
Running the following will summarize the results (scenario-level summaries) and generate a new file named =all_analysis_summary.RData=.
#+BEGIN_SRC sh
Rscript ./05_summarize_results.R ./data/all_analysis_results.RData 1
#+END_SRC

If you have access to a SLURM-based computing cluster, the following can be used with appropriate modifications to the shell script.
#+BEGIN_SRC sh
sh ./05_summarize_results_o2.sh ./data/all_analysis_results.RData
#+END_SRC
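
Scenario-level summarization is the kind of grouped reduction sketched below (the column names =scenario=, =method=, and =estimate= are hypothetical):
#+BEGIN_SRC R
## Sketch of a scenario-level summary with dplyr (hypothetical columns).
library(dplyr)
load("./data/all_analysis_results.RData")  # provides all_results
all_summary <- all_results %>%
    group_by(scenario, method) %>%
    summarize(mean_estimate = mean(estimate),
              sd_estimate   = sd(estimate),
              .groups = "drop")
save(all_summary, file = "./data/all_analysis_summary.RData")
#+END_SRC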

** Assessment
Running the following will create figures under =out/= describing the summary statistics in =all_analysis_summary.RData=.
#+BEGIN_SRC sh
Rscript ./06_assess_results.R ./data/all_analysis_summary.RData 1
#+END_SRC
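
Figures under =out/= can be produced with the =ggplot2=/=gtable= machinery listed among the dependencies; a minimal sketch (the plot specification is hypothetical):
#+BEGIN_SRC R
## Sketch: write one summary figure as a PDF under out/ (hypothetical plot).
library(ggplot2)
load("./data/all_analysis_summary.RData")  # provides all_summary
p <- ggplot(all_summary, aes(x = scenario, y = mean_estimate, color = method)) +
    geom_point()
ggsave("./out/summary_figure.pdf", plot = p, width = 7, height = 5)
#+END_SRC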

* Author
[[https://twitter.com/kaz_yos][Kazuki Yoshida]]