https://github.com/bhklab/analyze_readii_outputs

Code for analyzing outputs from the READII package or the readii-orcestra pipeline.

Host: GitHub
URL: https://github.com/bhklab/analyze_readii_outputs
Owner: bhklab
License: mit
Created: 2024-10-01T13:46:22.000Z (8 months ago)
Default Branch: main
Last Pushed: 2024-12-03T02:50:22.000Z (6 months ago)
Last Synced: 2024-12-30T03:20:50.499Z (5 months ago)
Language: Jupyter Notebook
Size: 5.98 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE

# analyze_readii_outputs

Code for analyzing outputs from the READII package or the readii-orcestra pipeline.

# Workflow Contents

* shell script for setting up the directory structure

* Python Jupyter Notebook for pre-processing the clinical and image feature data and setting up the pre-existing radiomic signatures

* R notebook for performing feature selection and CPH modeling

# Data Organization

Raw data (currently what _would_ be the deconstructed output from an ORCESTRA object):

```raw
└──  rawdata
    └──  {DATASETNAME}
        ├──  clinical
        ├──  fmcib_outputs
        └──  readii_outputs
        {DATASETNAME}_READII-RADIOMICS_MAE.RDS
```
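The raw data layout above could be created with a few lines of Python. This is a minimal sketch, not the repository's shell script: the function name `make_rawdata_dirs` and its `root` argument are hypothetical, and only the three subdirectory names come from the tree above.

```python
from pathlib import Path

# Subdirectories expected under rawdata/{DATASETNAME}, taken from the tree above.
RAW_SUBDIRS = ("clinical", "fmcib_outputs", "readii_outputs")


def make_rawdata_dirs(root, dataset_name):
    """Create rawdata/{dataset_name}/<subdir> under root and return the paths.

    Hypothetical helper for illustration; not part of this repository.
    """
    dataset_dir = Path(root) / "rawdata" / dataset_name
    created = []
    for subdir in RAW_SUBDIRS:
        path = dataset_dir / subdir
        # parents=True builds rawdata/ and the dataset folder as needed;
        # exist_ok=True makes the call safe to repeat.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling `make_rawdata_dirs(".", "MYDATASET")` (with a placeholder dataset name) would create the three folders relative to the current directory and leave existing folders untouched.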

Processed data: filtered clinical, radiomic, and deep learning features, optionally split into training and test sets

```raw
└──  procdata
    └──  {DATASETNAME}
        ├──  clinical
             ├──  cleaned_filtered_clinical_{DATASETNAME}.csv
             └──  [OPTIONAL] train_test_labelled_clinical_{DATASETNAME}.csv
        ├──  radiomics
             ├──  clinical
                  └──  merged_clinical_{DATASETNAME}.csv
             ├──  features
                  ├──  merged_radiomicfeatures_{image_type}_{DATASETNAME}.csv
                  └──  labelled_radiomicfeatures_only_{image_type}_{DATASETNAME}.csv
             └──  [OPTIONAL] train_test_split
                  ├──  clinical
                       ├──  train_merged_clinical_{DATASETNAME}.csv
                       └──  test_merged_clinical_{DATASETNAME}.csv
                  ├──  train_features
                       └──  train_labelled_radiomicfeatures_only_{image_type}_{DATASETNAME}.csv
                  └──  test_features
                       └──  test_labelled_radiomicfeatures_only_{image_type}_{DATASETNAME}.csv
        └──  deep_learning
             ├──  clinical
                  └──  merged_clinical_{DATASETNAME}.csv
             ├──  features
                  ├──  merged_fmcibfeatures_{image_type}_{DATASETNAME}.csv
                  └──  labelled_fmcibfeatures_only_{image_type}_{DATASETNAME}.csv
             └──  [OPTIONAL] train_test_split
                  ├──  clinical
                       ├──  train_merged_clinical_{DATASETNAME}.csv
                       └──  test_merged_clinical_{DATASETNAME}.csv
                  ├──  train_features
                       └──  train_labelled_fmcibfeatures_only_{image_type}_{DATASETNAME}.csv
                  └──  test_features
                       └──  test_labelled_fmcibfeatures_only_{image_type}_{DATASETNAME}.csv
```
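The file naming convention in the processed data tree can be expressed as a small path builder, which may help when loading features from a notebook. This is a sketch under assumptions: `labelled_features_path` and its arguments are hypothetical names, and it assumes the optional split folder is literally named `train_test_split` (treating `[OPTIONAL]` as an annotation rather than part of the folder name).

```python
from pathlib import Path

# Maps each feature family folder to the token used in its file names,
# as shown in the tree above.
FEATURE_TOKENS = {"radiomics": "radiomicfeatures", "deep_learning": "fmcibfeatures"}


def labelled_features_path(root, dataset_name, family, image_type, split=None):
    """Return the expected path of a labelled feature CSV.

    Hypothetical helper for illustration; not part of this repository.
    split is None for the unsplit file, or "train" / "test" for the
    optional train/test split.
    """
    if family not in FEATURE_TOKENS:
        raise ValueError(f"unknown feature family: {family!r}")
    if split not in (None, "train", "test"):
        raise ValueError(f"split must be None, 'train', or 'test', got {split!r}")

    token = FEATURE_TOKENS[family]
    filename = f"labelled_{token}_only_{image_type}_{dataset_name}.csv"
    base = Path(root) / "procdata" / dataset_name / family

    if split is None:
        return base / "features" / filename
    # Assumption: the split folder is named train_test_split on disk.
    return base / "train_test_split" / f"{split}_features" / f"{split}_{filename}"
```

For example, with a placeholder dataset name and image type, `labelled_features_path("data", "MYDATASET", "radiomics", "original", split="train")` points at the training feature file under `train_test_split/train_features`.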

# TODO:

- [ ] Implement logger

- [ ] Implement ORCESTRA download

- [ ] Implement MAE deconstructor

    - [ ] Unpack clinical data --> save to csv

    - [ ] Unpack radiomic features --> save each experiment to csv

    - [ ] Unpack deep learning features --> save each experiment to csv

    - [ ] Get list of experiments, specifically the negative controls

- [ ] Make this into a snakemake pipeline

- [ ] Implement config file creation if one is not present

- [ ] Move data_setup_for_modelling from scripts into notebooks

- [ ] Finish implementing survival time and event setup as functions

- [x] Support MRMR training over k folds

- [ ] Support MRMR training over 1 fold

- [ ] Support regular training over k folds

- [ ] Support loading model weights across k folds
