https://github.com/nci-gdc/htseq-tool
Utility scripts for processing HTSeq counts data
bioinformatics docker workflow-tool
- Host: GitHub
- URL: https://github.com/nci-gdc/htseq-tool
- Owner: NCI-GDC
- License: apache-2.0
- Created: 2016-01-07T18:59:27.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2023-03-29T19:17:02.000Z (about 3 years ago)
- Last Synced: 2025-09-05T13:51:20.268Z (10 months ago)
- Topics: bioinformatics, docker, workflow-tool
- Language: Python
- Homepage:
- Size: 374 KB
- Stars: 3
- Watchers: 9
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# GDC Base Template



Contains the Dockerfile and HTSeq utility tools for running the GDC HTSeq
expression quantification workflows. See https://github.com/NCI-GDC/htseq-cwl
for more information about the associated GDC workflow.
## Usage
### `gene_lengths`
Generate the gene exon lengths from GTF for use in normalization.
```
usage: htseq-tools gene_lengths [-h] --gtf_file GTF_FILE --out_file OUT_FILE

Extract gene exon lengths from GTF

optional arguments:
  -h, --help           show this help message and exit
  --gtf_file GTF_FILE  GTF file used for htseq count
  --out_file OUT_FILE  Output TSV file to write to
```
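As a rough illustration of what "gene exon lengths" can mean, the sketch below sums the bases covered by each gene's exons, counting overlapping exons once. This is an assumption about the aggregation rule, not a copy of the tool's code; `exon_union_lengths` is a hypothetical helper name, and the attribute parsing only looks for a `gene_id "..."` entry in the ninth GTF column.

```python
import re
from collections import defaultdict


def exon_union_lengths(gtf_lines):
    """Return {gene_id: bases covered by its exons}, counting overlaps once.

    Assumption: overlapping exons of the same gene are merged before summing.
    GTF coordinates are 1-based and inclusive.
    """
    intervals = defaultdict(list)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "exon":
            continue  # only exon records contribute to the length
        match = re.search(r'gene_id "([^"]+)"', cols[8])
        if match is None:
            continue
        intervals[match.group(1)].append((int(cols[3]), int(cols[4])))

    lengths = {}
    for gene_id, spans in intervals.items():
        spans.sort()
        total = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end + 1:
                # Overlapping or directly adjacent exon: extend the current run.
                cur_end = max(cur_end, end)
            else:
                total += cur_end - cur_start + 1
                cur_start, cur_end = start, end
        total += cur_end - cur_start + 1
        lengths[gene_id] = total
    return lengths
```

Called with an open GTF file handle (or any iterable of lines), it returns a plain dictionary that could be written out as a two-column TSV.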
### `merge_counts`
When an aliquot has a mixture of paired-end and single-end data, this
merges the counts into a single file.
```
usage: htseq-tools merge_counts [-h] --htseq_counts HTSEQ_COUNTS --out_file
                                OUT_FILE

Merge PE and SE counts from HTseq

optional arguments:
  -h, --help            show this help message and exit
  --htseq_counts HTSEQ_COUNTS
                        HTSeq count file. Use multiple times
  --out_file OUT_FILE   Output TSV file to write to
```
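One plausible reading of "merge" is a per-feature sum across the input files. The sketch below assumes that, and assumes each input is the usual two-column, tab-separated HTSeq count output (feature ID, then an integer count, with the `__`-prefixed summary rows at the end). `merge_htseq_counts` is a hypothetical name used only for illustration.

```python
def merge_htseq_counts(count_tables):
    """Sum counts per feature across several HTSeq count tables.

    `count_tables` is an iterable of line iterables (for example, open files).
    Assumption: merging means adding the counts of identical feature IDs,
    including the `__`-prefixed summary rows.
    """
    merged = {}  # insertion-ordered, so the first file's row order is kept
    for lines in count_tables:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            feature, count = line.split("\t")
            merged[feature] = merged.get(feature, 0) + int(count)
    return merged
```

Features that appear in only one input are kept with that input's count, so the function does not require the files to list identical features.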
### `fpkm`
Normalize the raw counts using the FPKM and FPKM-UQ methods.
```
usage: htseq-tools fpkm [-h] --aggregate_length_file AGGREGATE_LENGTH_FILE
                        --htseq_counts HTSEQ_COUNTS --output_prefix
                        OUTPUT_PREFIX

Get FPKM and FPKM-UQ

optional arguments:
  -h, --help            show this help message and exit
  --aggregate_length_file AGGREGATE_LENGTH_FILE
                        Aggregate length TSV
  --htseq_counts HTSEQ_COUNTS
                        HTSeq counts txt file
  --output_prefix OUTPUT_PREFIX
                        The output prefix to use.
```
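For orientation, the two normalizations can be sketched as follows. FPKM divides each count by the gene length and by a library-size total; FPKM-UQ replaces the total with the upper-quartile (75th percentile) count. The details here are assumptions rather than the tool's exact behavior: the sketch uses every gene that has a length as the denominator set (the GDC pipeline may restrict this, for example to protein-coding genes), uses a nearest-rank percentile, and `fpkm_values` is a hypothetical helper name.

```python
import math


def fpkm_values(counts, lengths):
    """Return ({gene: FPKM}, {gene: FPKM-UQ}) for genes present in `lengths`.

    Assumptions: `__`-prefixed HTSeq summary rows are ignored, the FPKM
    denominator is the total count over all retained genes, and the upper
    quartile is the nearest-rank 75th percentile of those counts.
    """
    genes = [g for g in counts if not g.startswith("__") and g in lengths]
    if not genes:
        raise ValueError("no genes with both a count and a length")

    total = sum(counts[g] for g in genes)
    ordered = sorted(counts[g] for g in genes)
    upper_q = ordered[math.ceil(0.75 * len(ordered)) - 1]
    if total == 0 or upper_q == 0:
        raise ValueError("cannot normalize: zero total or zero upper quartile")

    # 1e9 = 1e3 (per kilobase of gene) * 1e6 (per million fragments)
    fpkm = {g: counts[g] * 1e9 / (lengths[g] * total) for g in genes}
    fpkm_uq = {g: counts[g] * 1e9 / (lengths[g] * upper_q) for g in genes}
    return fpkm, fpkm_uq
```

Because both outputs share the same per-gene numerator, they differ only by the constant factor `total / upper_q` within a sample; the upper-quartile variant is intended to be less sensitive to a few very highly expressed genes.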