{"id":19519423,"url":"https://github.com/hammerlab/cohorts","last_synced_at":"2025-04-26T07:31:14.837Z","repository":{"id":62563448,"uuid":"53869966","full_name":"hammerlab/cohorts","owner":"hammerlab","description":"Utilities for analyzing mutations and neoepitopes in patient cohorts","archived":false,"fork":false,"pushed_at":"2018-06-07T20:38:26.000Z","size":568,"stargazers_count":20,"open_issues_count":67,"forks_count":4,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-11T21:49:28.238Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hammerlab.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-03-14T15:52:59.000Z","updated_at":"2022-09-21T13:29:40.000Z","dependencies_parsed_at":"2022-11-03T15:45:15.502Z","dependency_job_id":null,"html_url":"https://github.com/hammerlab/cohorts","commit_stats":null,"previous_names":["tavinathanson/cohorts"],"tags_count":23,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hammerlab%2Fcohorts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hammerlab%2Fcohorts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hammerlab%2Fcohorts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hammerlab%2Fcohorts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hammerlab","download_url":"https://codeload.github.com/hammerlab/cohorts/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"htt
ps://github.com","kind":"github","repositories_count":250953407,"owners_count":21513332,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T00:18:10.255Z","updated_at":"2025-04-26T07:31:14.511Z","avatar_url":"https://github.com/hammerlab.png","language":"Python","readme":"[![PyPI](https://img.shields.io/pypi/v/cohorts.svg?maxAge=21600)]() [![Build Status](https://travis-ci.org/hammerlab/cohorts.svg?branch=master)](https://travis-ci.org/hammerlab/cohorts) [![Coverage Status](https://coveralls.io/repos/hammerlab/cohorts/badge.svg?branch=master\u0026service=github)](https://coveralls.io/github/hammerlab/cohorts?branch=master)\n\nCohorts\n=======\n\nCohorts is a library for analyzing and plotting clinical data, mutations and neoepitopes in patient cohorts.\n\nIt calls out to external libraries like [topiary](https://github.com/hammerlab/topiary) and caches the results for easy manipulation.\n\nCohorts requires Python 3 (3.3+). We are no longer maintaining compatibility with Python 2. 
For context, see this [Python 3 statement](http://www.python3statement.org).\n\nInstallation\n------------\n\nYou can install Cohorts using [pip](https://pip.pypa.io/en/latest/quickstart.html):\n\n```bash\npip install cohorts\n```\n\nFeatures\n--------\n\n* Data management: construct a `Cohort` consisting of `Patient`s with `Sample`s.\n* Use `varcode` and `topiary` to generate and cache variant effects and predicted neoantigens.\n* Provenance: track the state of the world (package and data versions) for a given analysis.\n* Aggregation functions: built-in functions such as `missense_snv_count`, `neoantigen_count`, `expressed_neoantigen_count`; or create your own functions.\n* Plotting: survival curves via `lifelines`, response/no response plots (with Mann-Whitney and Fisher's Exact results), ROC curves. Example: `cohort.plot_survival(on=missense_snv_count, how=\"pfs\")`.\n* Filtering: filter collections of variants/effects/neoantigens by, for example, variant statistics.\n* Pre-define data sets to work with. Example: `cohort.as_dataframe(join_with=[\"tcr\", \"pdl1\"])`.\n\nIn addition, several other libraries make use of `cohorts`:\n* [pygdc](http://github.com/hammerlab/pygdc)\n* [query_tcga](http://github.com/jburos/query_tcga)\n\nQuick Start\n---------------\n\nOne way to get started using Cohorts is to use it to analyze TCGA data.\n\nAs an example, we can create a cohort using [query_tcga](http://github.com/jburos/query_tcga):\n\n```python\nfrom query_tcga import cohort, config\n\n# provide authentication token\nconfig.load_config('config.ini')\n\n# load patient data\nblca_patients = cohort.prep_patients(project_name='TCGA-BLCA',\n                                     project_data_dir='data')\n\n# create cohort\nblca_cohort = cohort.prep_cohort(patients=blca_patients,\n                                 cache_dir='data-cache')\n```\n\nThen, use `plot_survival()` to summarize a potential biomarker (e.g. 
`snv_count`) by survival:\n\n```python\nfrom cohorts.functions import snv_count\nblca_cohort.plot_survival(snv_count, how='os', threshold='median')\n```\n\nThis should produce a summary of results including this plot:\n\n![Survival plot example](/docs/survival_plot_example.png)\n\nWe could alternatively use `plot_benefit()` to summarize OS\u003e12mo instead of survival:\n\n```python\nblca_cohort.plot_benefit(snv_count)\n```\n\n![Benefit plot example](/docs/benefit_plot_example.png)\n\n\nSee the full example in the [quick-start notebook](http://nbviewer.jupyter.org/github/hammerlab/tcga-blca/blob/master/Quick-start%20-%20using%20Cohorts%20with%20TCGA%20data.ipynb).\n\nBuilding from Scratch\n--------------\n\n```python\npatient_1 = Patient(\n    id=\"patient_1\",\n    os=70,\n    pfs=24,\n    deceased=True,\n    progressed=True,\n    benefit=False\n)\n\npatient_2 = Patient(\n    id=\"patient_2\",\n    os=100,\n    pfs=50,\n    deceased=False,\n    progressed=True,\n    benefit=False\n)\n\ncohort = Cohort(\n    patients=[patient_1, patient_2],\n    cache_dir=\"/where/cohorts/results/get/saved\"\n)\n\ncohort.plot_survival(on=\"os\")\n```\n\n```python\nsample_1_tumor = Sample(\n    is_tumor=True,\n    bam_path_dna=\"/path/to/dna/bam\",\n    bam_path_rna=\"/path/to/rna/bam\"\n)\n\npatient_1 = Patient(\n    id=\"patient_1\",\n    ...\n    snv_vcf_paths=[\"/where/my/mutect/vcfs/live\",\n                   \"/where/my/strelka/vcfs/live\"],\n    indel_vcfs_paths=[...],\n    tumor_sample=sample_1_tumor,\n    ...\n)\n\ncohort = Cohort(\n    ...\n    patients=[patient_1]\n)\n```\n","funding_links":[],"categories":["Biology and Biotechnology"],"sub_categories":["Working with Time and Calendars"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhammerlab%2Fcohorts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhammerlab%2Fcohorts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhammerlab%2Fcohorts/lists"}