Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cumbof/minimizers
A Python package for extracting minimizers from sequence data
https://github.com/cumbof/minimizers
fasta genome minimizer sketch
Last synced: 26 days ago
JSON representation
A Python package for extracting minimizers from sequence data
- Host: GitHub
- URL: https://github.com/cumbof/minimizers
- Owner: cumbof
- License: mit
- Created: 2023-04-21T17:30:42.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-04-26T03:48:30.000Z (over 1 year ago)
- Last Synced: 2024-11-16T09:48:37.179Z (about 1 month ago)
- Topics: fasta, genome, minimizer, sketch
- Language: Python
- Homepage:
- Size: 9.77 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# minimizers
A Python package for extracting minimizers from sequence data.
## Requirements
The package requires Python 3 and there are no constraints on the type of operating system.
It also requires the [biopython](https://pypi.org/project/biopython/) package.
## Install
It can be installed with `pip` by typing the following command in your terminal:
```
pip install minimizers
```## How to use it
Run `minimizers --help` for a list of available arguments:
```
usage: minimizers [-h] [-a] -i INPUT [-o OUTPUT] [-t {list,fasta}] -s SIZE -w
WINDOW [--report-counts] [--top-perc TOP_PERC]
[--top-num TOP_NUM] [-n NPROC] [--verbose] [-v]Extract the set of minimizers from a sequence file
optional arguments:
-h, --help show this help message and exit
-a, --aggregate Aggregate record results (default: False)
-i INPUT, --input INPUT
Path to the input sequence file in fasta format. It
can be Gzip compressed (default: None)
-o OUTPUT, --output OUTPUT
Path to the output file with minimizers. Results are
printed on the stdout if no output is provided
(default: None)
-t {list,fasta}, --output-type {list,fasta}
The output can be formatted as a list of kmers or as a
fasta file (default: list)
-s SIZE, --size SIZE Length of the minimizers (default: None)
-w WINDOW, --window WINDOW
Size of the sliding window. It must be greater than
the minimizer size (default: None)
--report-counts Report the frequencies of the minimizers. This is
compatible with "--output-type list" only (default:
False)
--top-perc TOP_PERC Report the top percentage of minimizers based on their
frequency (default: None)
--top-num TOP_NUM Report the top number of minimizers based on their
frequency (default: None)
-n NPROC, --nproc NPROC
Make it parallel (default: 1)
--verbose Print messages on the stdout (default: False)
-v, --version Print the "minimizers" version and exit
```Copyright © 2022 [Fabio Cumbo](https://github.com/cumbof). See [LICENSE](https://github.com/cumbof/minimizers/blob/main/LICENSE) for additional details.