https://github.com/haradama/phash

Software to identify known plasmid sequence from metagenomic assembly using Minhash
# pHash

pHash is software that identifies known plasmid sequences in metagenomic assemblies using MinHash and a very lightweight database.

## Installation
pHash is available from the [releases page](https://github.com/haradama/pHash/releases).

## Usage

Download the plasmid database file from Zenodo: http://doi.org/10.5281/zenodo.1991549

```
Identifier of plasmid using database

Usage:
  pHash identify [flags]

Flags:
  -d, --db string       Database
  -h, --help            help for identify
  -i, --in string       Input FASTA file
  -k, --kmer int        Length of k-mer (default 17)
  -o, --out string      Output FASTA file
  -p, --paralell int    Number of parallel processing (default 4)
  -s, --sketch int      Sketch size (default 1024)
  -t, --threshold int   Threshold of probability (default 10)
```

For example:
```
pHash identify -d PLASMID_DATABASE -i YOUR_METAGENOMIC_DATA
```
To build your own database, run the following command:
```
pHash makedb -i YOUR_PLASMID_DATA -o YOUR_DATABASE_NAME
```
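
To give a sense of what the `--kmer` and `--sketch` options control, here is a minimal MinHash sketch in Go. It is an illustration only, not pHash's actual implementation: the function names `sketch` and `jaccard`, the FNV-1a hash, and the bottom-s scheme are assumptions made for this example, and it skips details such as canonical (reverse-complement) k-mers that real tools commonly handle.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// sketch returns the s smallest distinct 64-bit hashes of all k-mers in seq,
// sorted ascending (a "bottom-s" MinHash sketch). Hypothetical helper.
func sketch(seq string, k, s int) []uint64 {
	seen := make(map[uint64]struct{})
	for i := 0; i+k <= len(seq); i++ {
		h := fnv.New64a()
		h.Write([]byte(seq[i : i+k]))
		seen[h.Sum64()] = struct{}{}
	}
	hashes := make([]uint64, 0, len(seen))
	for v := range seen {
		hashes = append(hashes, v)
	}
	sort.Slice(hashes, func(a, b int) bool { return hashes[a] < hashes[b] })
	if len(hashes) > s {
		hashes = hashes[:s]
	}
	return hashes
}

// jaccard estimates the Jaccard similarity of two k-mer sets from their
// sorted sketches: walk the s smallest hashes of the merged sketches and
// count how many occur in both. Hypothetical helper.
func jaccard(a, b []uint64, s int) float64 {
	i, j, shared, total := 0, 0, 0, 0
	for total < s && (i < len(a) || j < len(b)) {
		switch {
		case i < len(a) && j < len(b) && a[i] == b[j]:
			shared++
			i++
			j++
		case j >= len(b) || (i < len(a) && a[i] < b[j]):
			i++
		default:
			j++
		}
		total++
	}
	if total == 0 {
		return 0
	}
	return float64(shared) / float64(total)
}

func main() {
	// Toy sequences and small parameters; pHash defaults are k=17, s=1024.
	k, s := 5, 8
	contig := "ACGTACGTTGCAACGTAGCTAGCTAGGATCC"
	plasmid := "ACGTACGTTGCAACGTAGCTAGCTAGGTTCC"
	est := jaccard(sketch(contig, k, s), sketch(plasmid, k, s), s)
	fmt.Printf("estimated Jaccard similarity: %.2f\n", est)
}
```

Because only `s` hashes are kept per sequence regardless of its length, a database of sketches stays small. A larger sketch size gives a more precise similarity estimate at the cost of database size, and a longer k-mer makes matches more specific.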

## Test
```
sh ./tests/install_test_data.sh
pHash identify -d plasmidDB11062018.phash -i testData.fna
```

## License

[GNU General Public License v3.0](https://github.com/haradama/pHash/blob/master/LICENSE)