https://github.com/hammerlab/discohorts

Generate Cohorts based on Epidisco and/or Biokepi results
https://github.com/hammerlab/discohorts

Last synced: 12 months ago
JSON representation

Generate Cohorts based on Epidisco and/or Biokepi results

Host: GitHub
URL: https://github.com/hammerlab/discohorts
Owner: hammerlab
Created: 2017-02-24T21:49:38.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2017-09-25T18:27:17.000Z (over 8 years ago)
Last Synced: 2025-01-08T12:56:35.548Z (over 1 year ago)
Language: Python
Size: 44.9 KB
Stars: 1
Watchers: 7
Forks: 0
Open Issues: 9
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # discohorts

Generate Cohorts based on Epidisco and/or Biokepi results

## Example Usage

```

from utils.data import load_cohort

from discohorts import Discohort

cohort = load_cohort(min_tumor_mtc=0, min_normal_mtc=0)

work_dirs = ["/nfs-pool-{}/biokepi/".format(i) for i in range(2, 17)]

dest_results_dir = "/nfs-pool/biokepi/results"

cohort = Discohort(

    cohort,

    biokepi_work_dirs=work_dirs,

    dest_results_dir=dest_results_dir)

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_1")

# Customize the normal BAM input.

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_2",

    config=EpidiscoConfig(cohort,

                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Customize the normal BAM input and the run name.

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_3",

    run_name=lambda patient: "epidisco_{}".format(patient.id),

    config=EpidiscoConfig(cohort,

                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Customize the normal BAM input, the run name, and which patients to run on.

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_4",

    run_name=lambda patient: "epidisco_{}".format(patient.id),

    config=EpidiscoConfig(cohort,

                          keep=lambda patient: patient.id == "468",

                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Same as #4, but written differently.

class EpidiscoConfigModified(EpidiscoConfig):

    def keep(self, patient):

        return patient.id == "468"

    def arg_normal_input(self, patient):

        return patient.tumor_sample.bam_path_dna

modified_config = EpidiscoConfigModified(cohort)

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_5",

    config=modified_config)

    

# Customize any CLI argument; for example, picard_java_max_heap_size.

cohort.add_epidisco_pipeline(

    pipeline_name="epidisco_6",

    config=EpidiscoConfig(cohort,

                          arg_picard_java_max_heap_size="20g"))

# This is Discohort's own dry run functionality, FYI. Should have a better name.

cohort.run_pipeline("epidisco_1", dry_run=True)

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/hammerlab/discohorts

Awesome Lists containing this project

README