https://github.com/edawson/sigprofilerhelper
Helper scripts for the SigProfiler suite of mutational signature analysis tools
- Host: GitHub
- URL: https://github.com/edawson/sigprofilerhelper
- Owner: edawson
- Created: 2020-02-26T18:11:26.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2024-02-09T22:46:47.000Z (11 months ago)
- Last Synced: 2024-02-10T04:37:38.704Z (11 months ago)
- Language: Python
- Size: 15.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
SigProfilerHelper
-----------------
Eric T. Dawson
February 2020

## Introduction
SigProfilerHelper provides a set of Python command-line interfaces
to the SigProfiler suite of tools (e.g., SigProfilerMatrixGenerator and SigProfilerExtractor).

## Installation
```bash
pip install -r requirements.txt
```

## Basic usage
If you have not already installed a reference, you'll need to install one of the available references from SigProfiler
using the `install_reference.py` script. Here's how to install GRCh38 (GRCh37 is available in the same manner):

### Installing a reference genome:
```bash
python install_reference.py --ref GRCh38
```
### Generating SigProfiler Matrices
To generate the SBS / ID / DNP matrices using SigProfilerMatrixGenerator and place them in a directory called `sigprof_input`:

```bash
python generate_matrix.py -m -d sigprof_input
```
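
The extraction commands in this README read the SBS96 matrix from a path such as `sigprof_input/output/SBS/PROJECT.SBS96.all`. As a minimal sketch, assuming the layout `<dir>/output/SBS/<project>.<context>.all` inferred from those example paths (and assuming `PROJECT` is the default project name), a small helper can build that path and check that the matrix exists before starting a long extraction. The function names here are hypothetical and are not part of SigProfilerHelper.

```python
from pathlib import Path


def expected_sbs_matrix(input_dir, project="PROJECT", context="SBS96"):
    """Build the assumed matrix path: <input_dir>/output/SBS/<project>.<context>.all.

    The layout is inferred from the example paths in this README and is an
    assumption, not a documented guarantee.
    """
    return Path(input_dir) / "output" / "SBS" / f"{project}.{context}.all"


def matrix_is_ready(input_dir, project="PROJECT", context="SBS96"):
    """Return True if the assumed matrix file exists and is not empty."""
    path = expected_sbs_matrix(input_dir, project, context)
    return path.is_file() and path.stat().st_size > 0
```

Calling `expected_sbs_matrix("sigprof_input")` yields the path that can be passed to the `-t` option of `run_sigprofiler.py`.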
### Creating seeds for the extraction (optional)
```bash
python create_seeds.py -o seeds.txt -n 1000
```
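
The format that `create_seeds.py` writes is not described in this README, so the following is only a sketch of the general idea: fixing a list of random seeds up front so that repeated extraction runs can be reproduced. It assumes one integer seed per line, which may differ from the real file format; `write_seeds` and `master_seed` are hypothetical names.

```python
import random
from pathlib import Path


def write_seeds(path, n, master_seed=0):
    """Write n pseudo-random integer seeds to `path`, one per line (assumed format).

    A dedicated Random instance seeded with `master_seed` makes the output
    repeatable without touching the global random state.
    """
    rng = random.Random(master_seed)
    seeds = [rng.randrange(2**32) for _ in range(n)]
    Path(path).write_text("\n".join(str(s) for s in seeds) + "\n")
    return seeds
```

Running it twice with the same `master_seed` produces the same file, which is the property that makes a saved seed list useful.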
### Running SigProfilerExtractor
To run SBS96 analysis for 1 to 7 signatures (1000 iterations each) on 16 threads:

```bash
time python scripts/run_sigprofiler.py --gpu --min-iters 1000 --max-iters 10000 -s 1 -e 7 -t sigprof_input/output/SBS/PROJECT.SBS96.all -d
```
### Full working example with public data:
```bash
## Download and unzip the test data
wget https://public.betulalabs.com/test-data/data_mutations_mod.maf.gz
gunzip data_mutations_mod.maf.gz

## Install your reference
python install_reference.py --ref GRCh37

## Generate the matrix
python generate_matrix.py --maf data_mutations_mod.maf --ref GRCh37 -d data_mutations_output --project Example

## Run sigprofiler
python run_sigprofiler.py -t data_mutations_output/output/SBS/Example.SBS96.all -d signatures -s 1 -e 3 --max-iters 10 --min-iters 3 --nmf-replicates 4
```
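
The three steps of the worked example (install the reference, generate the matrix, run the extraction) can also be chained from Python. The sketch below is not part of SigProfilerHelper; `build_pipeline` and `run_pipeline` are hypothetical names, and the only flags used are the ones shown in the commands above. It assumes the scripts are in the current directory and that `python` is on the `PATH`.

```python
import shlex
import subprocess


def build_pipeline(maf, ref, matrix_dir, project, matrix_path, out_dir):
    """Return the three example commands as argument lists, in run order."""
    return [
        ["python", "install_reference.py", "--ref", ref],
        ["python", "generate_matrix.py", "--maf", maf, "--ref", ref,
         "-d", matrix_dir, "--project", project],
        ["python", "run_sigprofiler.py", "-t", matrix_path, "-d", out_dir,
         "-s", "1", "-e", "3", "--max-iters", "10", "--min-iters", "3",
         "--nmf-replicates", "4"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it too unless dry_run is True.

    check=True stops the chain at the first failing step, so the extraction
    never starts against a missing reference or matrix.
    """
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Passing argument lists rather than a single shell string avoids quoting problems when file names contain spaces. Use `dry_run=True` first to confirm the commands match the ones you would type by hand.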