{"id":19749529,"url":"https://github.com/mskcc/facets-suite","last_synced_at":"2025-04-30T09:31:27.031Z","repository":{"id":44753907,"uuid":"43065402","full_name":"mskcc/facets-suite","owner":"mskcc","description":"Utility functions for FACETS","archived":false,"fork":false,"pushed_at":"2022-09-06T16:53:25.000Z","size":100168,"stargazers_count":30,"open_issues_count":21,"forks_count":24,"subscribers_count":73,"default_branch":"master","last_synced_at":"2023-10-19T18:18:17.648Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mskcc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-09-24T12:28:43.000Z","updated_at":"2023-10-19T18:18:17.649Z","dependencies_parsed_at":"2022-08-22T10:00:46.942Z","dependency_job_id":null,"html_url":"https://github.com/mskcc/facets-suite","commit_stats":null,"previous_names":[],"tags_count":24,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mskcc%2Ffacets-suite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mskcc%2Ffacets-suite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mskcc%2Ffacets-suite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mskcc%2Ffacets-suite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mskcc","download_url":"https://codeload.github.com/mskcc/facets-suite/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224204922,"owners_count":1727321
5,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-12T02:27:06.064Z","updated_at":"2024-11-12T02:27:06.748Z","avatar_url":"https://github.com/mskcc.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# facetsSuite\n[![lifecycle](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#experimental)\n[![Travis build status](https://travis-ci.org/taylor-lab/facets-suite.svg?branch=master)](https://travis-ci.org/taylor-lab/facets-suite)\n[![Coverage status](https://codecov.io/gh/taylor-lab/facets-suite/branch/master/graph/badge.svg)](https://codecov.io/github/taylor-lab/facets-suite?branch=master)\n\n**See the [release notes](https://github.com/taylor-lab/facets-suite/releases/tag/2.0.0-beta) for information on the new facets-suite version. Backwards compatibility is currently limited, as documented [here](https://github.com/taylor-lab/facets-suite/wiki/5.-Backwards-compatibility).**\n\nfacetsSuite is an R package with functions to run [FACETS](https://github.com/mskcc/facets), an allele-specific copy-number caller for paired tumor-normal DNA-sequencing data from genome-wide and targeted assays. facetsSuite both wraps the code that executes the FACETS algorithm and performs _post-hoc_ analyses on the resulting data. 
This package was developed by members of the [Taylor lab](https://www.mskcc.org/research-areas/labs/barry-taylor) and the Computational Sciences group within the [Center for Molecular Oncology at Memorial Sloan Kettering Cancer Center](https://www.mskcc.org/research-programs/molecular-oncology).\n\n## Installation\n\nYou can install facetsSuite in R from this repository with:\n\n``` r\ndevtools::install_github(\"mskcc/facets-suite\")\n```\n\nAlso follow the [instructions for installing FACETS](https://github.com/mskcc/facets).\n\n_Note: For the wrapper script `snp-pileup-wrapper.R` you need to specify the variable `snp_pileup_path` in the script to point to the installation path of snp-pileup _**or**_ set the environment variable SNP_PILEUP. Alternatively, the [docker](README.md#run-wrappers-from-container) image contains the executable._\n\n# Usage\n\n## R functions\n\nThe R functions in this package are documented; their descriptions and usage are available in R with:\n```r\n?facetsSuite::function_name\n```\n\nCentral to most functionality in the package is the output from `run_facets`, which runs the FACETS algorithm on a provided tumor-normal SNP pileup (i.e. genotyping). The output is a list object with the following named objects:\n- `snps`: SNPs used for copy-number segmentation, where `het==1` indicates heterozygous loci.\n- `segs`: Inferred copy-number segmentation.\n- `purity`: Inferred sample purity, i.e. 
the fraction of tumor cells in the total cellular population.\n- `ploidy`: Inferred sample ploidy.\n- `diplogr`: Inferred dipLogR, the sample-specific baseline corresponding to the diploid state.\n- `alballogr`: Alternative dipLogR value(s) at which a balanced solution was found.\n- `flags`: Warning flags from the naïve segmentation algorithm.\n- `em_flags`: Warning flags from the expectation-maximization segmentation algorithm.\n- `loglik`: Log-likelihood value of the fitted model.\n\nNote that FACETS performs segmentation with two algorithms, the \"naïve\" base method and an expectation-maximization algorithm. The latter (columns suffixed `.em`) is used by default in most of the functions in this package.\n\n## Wrapper scripts\n\nMost use of this package can be done from the command line using three wrapper scripts:\n- `snp-pileup-wrapper.R`:\\\n    This wraps the `snp-pileup` C++ program that genotypes sites across the genome in both normal and tumor samples. The output from this is the input to FACETS. Most default input arguments are appropriate regardless of usage, but `--max-depth` may need adjustment depending on the overall depth of the samples used.\\\n    Example command:\n    ```shell\n    snp-pileup-wrapper.R \\\n        --snp-pileup-path \u003cpath to snp-pileup executable\u003e \\\n        --vcf-file \u003cpath to SNP VCF\u003e \\\n        --normal-bam normal.bam \\\n        --tumor-bam tumor.bam \\\n        --output-prefix \u003cprefix for output file, preferably tumorSample__normalSample\u003e\n    ```\n    The input VCF file should contain polymorphic SNPs, so that FACETS can infer changes in allelic configuration at genomic loci from changes in allele ratios. [dbSNP](https://www.ncbi.nlm.nih.gov/snp/) is a good source for this. By default, `snp-pileup` also estimates the read depth in the input BAM files every 50th base.\n\n- `run-facets-wrapper.R`:\\\n    This wrapper takes the above SNP \"pileup\" as input and executes the FACETS algorithm. 
The outputs are in the form of Rdata objects, TXT files, and PNGs of the sample's overall copy-number profile. The wrapper allows for running FACETS in a two-pass mode, where first a \"purity\" run estimates the overall segmentation profile, sample purity and ploidy, and subsequently the dipLogR value from this run seeds a \"high-sensitivity\" run which may detect more focal events. To run in the two-pass mode, specify the input arguments prefixed by `purity`. The cval (`--purity-cval` and `--cval`) parameters tune the segmentation coarseness.\\\n    Example command:\n    ```shell\n    run-facets-wrapper.R \\\n        --counts-file tumor_normal.snp_pileup.gz \\\n        --sample-id tumorID__normalID \\\n        --purity-cval 1000 --cval 500 \\\n        --everything\n    ```\n    The above command runs FACETS in the two-pass mode, first at cval 1000, then at cval 500 based on the sample-specific baseline found at the higher cval. The full suite of analysis and QC is run with the `--everything` flag. If no output directory is specified, a directory named `sample-id` is created.\n\n- `annotate-maf-wrapper.R`:\\\n    This script estimates the cancer-cell fractions (CCFs) of somatic mutations using purity and ploidy estimates from FACETS. It requires an input MAF file and a mapping of sample names in the MAF file (column `Tumor_Sample_Barcode`) to FACETS output RDS files (i.e. file paths). Alternatively, it can be run in a single-sample mode by pointing directly to the RDS file and providing a MAF file with only mutation calls for the given sample.\\\n    Example command:\n    ```shell\n    annotate-maf-wrapper.R \\\n        --maf-file somatic_mutations.maf \\\n        --facets-output \u003cpath to facets_output.rds\u003e\n    ```\n    Or run with a mapping file as input (`--sample-mapping`), in the following format:\n    ```shell\n    \u003e cat sample_map.txt\n    sample      file\n    SampleA     SampleA_facets.rds\n    SampleB     SampleB_facets.rds\n    ...         
...\n    ```\n\nAll three wrappers use [argparse](https://github.com/trevorld/r-argparse) for argument handling and can thus be run with `--help` to see all input arguments.\n\n## Run wrappers from container\n\nIn order to run the containerized versions of the wrapper scripts, first pull the [docker image](https://cloud.docker.com/u/philipjonsson/repository/docker/philipjonsson/facets-suite):\n```shell\n## Docker\ndocker pull philipjonsson/facets-suite:dev\n\n## Singularity\nsingularity pull --name facets-suite-dev.img docker://philipjonsson/facets-suite:dev\n```\n\nThen run any of the scripts as follows:\n```shell\n## Docker\ndocker run -it -v $PWD:/work philipjonsson/facets-suite:dev run-facets-wrapper.R \\\n    --counts-file work/SampleA.snp_pileup.gz \\\n    --sample-id SampleA \\\n    --directory work\n\n## Singularity\nsingularity run facets-suite-dev.img run-facets-wrapper.R \\\n    --counts-file SampleA.snp_pileup.gz \\\n    --sample-id SampleA\n```\nFor **Docker**, note the binding (`-v`) of the current directory on the host to the directory named `work` inside the container. This is required for the input file, in the current directory, to be accessible inside the container. This in turn requires the output to be written to `work` inside the container so that it is available on the host once the script has executed. Singularity always mounts the directory from which it is being executed.\n\nThe image contains the `snp-pileup` executable used by `snp-pileup-wrapper.R`, so it can be run without specifying its path. 
Example for **Singularity**:\n```shell\nsingularity run -B \u003cpath to BAMs\u003e -B \u003cpath to VCF\u003e facets-suite-dev.img snp-pileup-wrapper.R \\\n    --vcf-file \u003cpath to VCF\u003e/dbsnp.vcf \\\n    --normal-bam \u003cpath to BAMs\u003e/NormalA.bam \\\n    --tumor-bam \u003cpath to BAMs\u003e/TumorA.bam \\\n    --output-prefix TumorA__NormalA\n```\n_Note: The binding of full paths to any files outside of the run directory is necessary._\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmskcc%2Ffacets-suite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmskcc%2Ffacets-suite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmskcc%2Ffacets-suite/lists"}