https://github.com/natir/pcon
Prompt COuNter
- Host: GitHub
- URL: https://github.com/natir/pcon
- Owner: natir
- License: mit
- Created: 2019-01-30T08:51:55.000Z (almost 6 years ago)
- Default Branch: main
- Last Pushed: 2024-05-22T14:10:53.000Z (7 months ago)
- Last Synced: 2024-05-22T15:33:49.018Z (7 months ago)
- Language: Rust
- Homepage:
- Size: 27.4 MB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- Changelog: CHANGELOG.md
- License: LICENSE
pcon
![Test](https://github.com/natir/pcon/workflows/Test/badge.svg)
![Lints](https://github.com/natir/pcon/workflows/Lints/badge.svg)
![MSRV](https://github.com/natir/pcon/workflows/MSRV/badge.svg)
[![CodeCov](https://codecov.io/gh/natir/pcon/branch/main/graph/badge.svg)](https://codecov.io/gh/natir/pcon)
[![Documentation](https://github.com/natir/pcon/workflows/Documentation/badge.svg)](https://natir.github.io/pcon/pcon)
[![License](https://img.shields.io/badge/license-MIT-green)](https://github.com/natir/pcon/blob/main/LICENSE)

Prompt COuNter, a short kmer counter.
- Reads only FASTA files (or FASTQ files if the fastq feature is activated).
- If k is even, it is reduced to the nearest lower odd number.
- pcon allocates 2^(k * 2 - 1) times the number of bytes used by a counter value (for k = 19 and one-byte counts, pcon requires 69 GB).

If the data contains anything other than A, C, T or G, it is treated as A, C, T or G (based on the 2nd and 3rd bits of the letter; N is treated as G, for example).
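The rules above can be made concrete with a small Rust sketch. It is not taken from pcon's source: the names `nuc2bit`, `effective_k` and `table_bytes` are hypothetical, and the code only assumes the rules as stated (2nd and 3rd bits of the ASCII code, even k lowered to the odd number below it, a table of 2^(2k - 1) counters).

```rust
/// Hypothetical helper: 2-bit code taken from the 2nd and 3rd bits of the
/// ASCII byte. A -> 0, C -> 1, T -> 2, G -> 3; any other byte lands on one
/// of these four codes (N lands on 3, the same code as G).
fn nuc2bit(nuc: u8) -> u8 {
    (nuc >> 1) & 0b11
}

/// Hypothetical helper: an even k is lowered to the nearest smaller odd number.
fn effective_k(k: u8) -> u8 {
    if k % 2 == 0 {
        k.saturating_sub(1)
    } else {
        k
    }
}

/// Hypothetical helper: bytes needed for a table of 2^(2k - 1) counters.
/// Returns None when k is 0 or when the result does not fit in a u64.
fn table_bytes(k: u8, counter_bytes: u64) -> Option<u64> {
    let exponent = (2 * u32::from(effective_k(k))).checked_sub(1)?;
    1u64.checked_shl(exponent)?.checked_mul(counter_bytes)
}
```

For example, `table_bytes(7, 1)` evaluates to 8192 bytes. Evaluated literally, `table_bytes(19, 1)` gives 2^37 bytes (about 137 GB), while the 69 GB quoted in the list corresponds to 2^36 bytes, so it is worth checking the real allocation against your build.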
## Installation
### Features
pcon uses Rust features to control some behavior at build time.
#### Maximum of count
The size of the variable used to store a kmer count is controlled by the *count_\** features.
1. *count\_u16*: count on 2 bytes, max value 65535
2. *count\_u32*: count on 4 bytes, max value 4294967295
3. *count\_u64*: count on 8 bytes, max value 18446744073709551615
4. *count\_u8*: count on 1 byte, max value 255

If you set multiple count\_\* features, their priority follows the order of the list above.
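One way such a priority could be expressed is with `cfg` attributes that pick a single counter type at build time. The sketch below is an assumption about the layout, not pcon's actual code: only the feature names come from the list above, and the alias `Count` and function `max_count` are hypothetical.

```rust
// Sketch only: feature names come from the list above,
// the cfg layout itself is an assumption.
#[cfg(feature = "count_u16")]
type Count = u16;

#[cfg(all(feature = "count_u32", not(feature = "count_u16")))]
type Count = u32;

#[cfg(all(
    feature = "count_u64",
    not(any(feature = "count_u16", feature = "count_u32"))
))]
type Count = u64;

// count_u8 has the lowest priority; this branch is also taken
// when no count_* feature is set at all.
#[cfg(not(any(
    feature = "count_u16",
    feature = "count_u32",
    feature = "count_u64"
)))]
type Count = u8;

/// Largest value a counter can hold with the selected width.
fn max_count() -> Count {
    Count::MAX
}
```

Built without any feature, `Count` is `u8` and `max_count()` returns 255. With both *count\_u16* and *count\_u64* set, only the first branch is compiled, which mirrors the priority order of the list.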
#### Parallel
Setting the *parallel* feature activates parallel counting in pcon, based on [rayon](https://docs.rs/rayon/latest/rayon/) and Rust [atomic](https://doc.rust-lang.org/core/sync/atomic/index.html) types.
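To give an idea of why atomic types matter here, the sketch below lets several threads update one shared count table without a lock. It makes several assumptions: it uses `std::thread::scope` from the standard library in place of rayon so that it has no external dependency, it counts single nucleotides rather than kmers, and `saturating_inc` and `parallel_count` are hypothetical names, not pcon functions.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::thread;

/// Hypothetical helper: bump one counter, stopping at u8::MAX instead of wrapping.
fn saturating_inc(counter: &AtomicU8) {
    // fetch_update retries the closure until the compare-and-swap succeeds.
    let _ = counter.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |value| {
        Some(value.saturating_add(1))
    });
}

/// Count single nucleotides of several sequences in parallel,
/// one thread per sequence, all threads sharing the same table.
fn parallel_count(seqs: &[&[u8]]) -> Vec<u8> {
    // One atomic counter per 2-bit nucleotide code (A, C, T, G).
    let table: Vec<AtomicU8> = (0..4).map(|_| AtomicU8::new(0)).collect();

    thread::scope(|scope| {
        for seq in seqs {
            let table = &table;
            scope.spawn(move || {
                for &nuc in seq.iter() {
                    // 2-bit code from the 2nd and 3rd bits of the ASCII byte.
                    let index = ((nuc >> 1) & 0b11) as usize;
                    saturating_inc(&table[index]);
                }
            });
        }
    }); // every spawned thread is joined when the scope ends

    table.into_iter().map(|counter| counter.into_inner()).collect()
}
```

Calling `parallel_count(&[&b"ACGT"[..], &b"AANN"[..]])` returns the counts in the order A, C, T, G, with the two N characters counted as G.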
#### Kff
Activates Kmer File Format (KFF) output.
#### Fastq
Allows pcon to read the FASTQ file format.
#### Default
*count\_u8* is the only default feature.
### From source
```bash
git clone https://github.com/natir/pcon.git
cd pcon
cargo install --path . --features {features1},{features2}
```

### With cargo
```bash
cargo install --git https://github.com/natir/pcon --features {features1},{features2}
```

## Usage
### Count
By default, `pcon count` reads an input FASTA file from stdin and writes counts to stdout in pcon's internal format.
```
-k, --kmer-size       Size of kmer
-i, --inputs          Path to inputs, default read stdin
-p, --pcon            Path where counts are stored in pcon format, default write to stdout
-c, --csv             Path where counts are stored in CSV format
-s, --solid           Path where counts are stored in solid format
-a, --abundance       Minimal abundance, default value 0
-b, --record_buffer   Number of sequence records loaded in buffer, default 8192
```

Count 7-mers in the `example.fasta` file and write the result in pcon format to the `example.pcon` file:
```bash
pcon count -k 7 -i example.fasta -p example.pcon
```

### MiniCount
By default, `pcon mini-count` reads an input FASTA file from stdin and writes counts to stdout in pcon's internal format. Regarding memory, `pcon mini-count` has a minimum RAM usage of 2^(m * 2 - 1) times the number of bytes used by a counter value, where m is the size of the minimizer.
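The text above does not say how pcon orders or selects minimizers. As a rough illustration only, the sketch below assumes the minimizer of a kmer is its smallest m-mer under the 2-bit encoding taken from the 2nd and 3rd bits of each letter. The names `nuc2bit` and `minimizer` are hypothetical and pcon may well use a different ordering.

```rust
/// Hypothetical helper: 2-bit code from the 2nd and 3rd bits of the ASCII byte
/// (A -> 0, C -> 1, T -> 2, G -> 3).
fn nuc2bit(nuc: u8) -> u64 {
    u64::from((nuc >> 1) & 0b11)
}

/// Hypothetical helper: smallest 2-bit encoded m-mer found in `kmer`.
/// Returns None when m is 0, larger than 31, or longer than the kmer.
fn minimizer(kmer: &[u8], m: usize) -> Option<u64> {
    if m == 0 || m > 31 || kmer.len() < m {
        return None;
    }
    // Keep only the 2 * m low bits, i.e. the last m nucleotides.
    let mask = (1u64 << (2 * m)) - 1;
    let mut current = 0u64;
    let mut best: Option<u64> = None;

    for (i, &nuc) in kmer.iter().enumerate() {
        // Slide the window by one nucleotide.
        current = ((current << 2) | nuc2bit(nuc)) & mask;
        if i + 1 >= m {
            // A full m-mer is in the window: keep it if it is the smallest so far.
            best = Some(best.map_or(current, |b| b.min(current)));
        }
    }
    best
}
```

Under this assumed ordering, the 3-mer minimizer of `GATTACA` is `ACA`. For the memory bound itself, m = 15 with one-byte counters gives 2^29 bytes, i.e. 512 MiB.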
```
-k, --kmer-size        Size of kmer
-m, --minimizer-size   Size of minimizer
-i, --inputs           Path to inputs, default read stdin
-f, --formats          Format of input, default fasta [possible values: fasta]
-c, --csv              Path where counts are stored in CSV format
-a, --abundance        Minimal abundance, default value 0
-A, --mini-abundance   Minimal minimizer abundance, default value 2
-b, --record_buffer    Number of sequence records loaded in buffer, default 8192
```

Count 31-mers in the `example.fasta` file whose 15-mer minimizer is present more than 2 times, and write the result in CSV format to the `example.csv` file:
```bash
pcon mini-count -k 31 -m 15 -A 2 -i example.fasta -c example.csv
```

### Dump
By default, `pcon dump` reads an input pcon file from stdin and writes counts in CSV format to stdout.
```
-i, --inputs      Path to inputs, default read stdin
-c, --csv         Path where counts are stored in CSV format, default write to stdout
-p, --pcon        Path where counts are stored in pcon format
-s, --solid       Path where counts are stored in solid format
-a, --abundance   Minimal abundance, default value 0
```

Convert the 7-mer counts in `example.pcon` to the CSV file `example.csv`:
```bash
pcon dump -i example.pcon -c example.csv
```

### Non-subcommand parameters
```
-q, --quiet Silence all output
-v, --verbosity... Verbose mode (-v, -vv, -vvv, etc)
-T, --timestamp Timestamp (sec, ms, ns, none)
-h, --help Print help
-V, --version Print version
```

## Minimum supported Rust version
Currently the minimum supported Rust version is 1.74.