https://github.com/bfssi-bioinformatics-lab/seqpresenceabsence
Package for checking for the presence/absence of markers against a set of samples
- Host: GitHub
- URL: https://github.com/bfssi-bioinformatics-lab/seqpresenceabsence
- Owner: BFSSI-Bioinformatics-Lab
- License: MIT
- Created: 2018-11-20T14:18:39.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T01:18:21.000Z (over 3 years ago)
- Last Synced: 2025-01-26T19:14:05.667Z (over 1 year ago)
- Language: Python
- Size: 69.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# seqPresenceAbsence
### Requirements
- Python >= 3.6
- [ncbi-blast+](https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastDocs&DOC_TYPE=Download) (makeblastdb and blastn must be in your $PATH)
- [MUSCLE](https://www.drive5.com/muscle/) (muscle must be in your $PATH)
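
Before installing, it can help to confirm that the external tools listed above are actually discoverable on `$PATH`. The following is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical and is not part of seqPresenceAbsence itself.

```python
import shutil
import sys

# Executables named in the Requirements list above.
REQUIRED_TOOLS = ("makeblastdb", "blastn", "muscle")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on $PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # The README asks for Python >= 3.6.
    if sys.version_info < (3, 6):
        print("Python 3.6 or newer is required.")
    missing = missing_tools()
    if missing:
        print("Missing from $PATH: " + ", ".join(missing))
    else:
        print("All required external tools were found.")
```

If any name is reported as missing, install the corresponding tool or adjust `$PATH` before running the package.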
### Installation
```
pip install seqPresenceAbsence
```
### Usage
```
Usage: seqPresenceAbsence [OPTIONS]

  seqPresenceAbsence is a simple script for querying an input nucleotide
  FASTA file against a database of sequences. Will return an .xlsx and .csv
  report of presence/absence of the sequences. Version: 0.2.0.

Options:
  -i, --indir PATH           Path to directory containing FASTA files you want
                             to query [required]
  -t, --targets PATH         Path to multi-FASTA containing targets of
                             interest [required]
  -o, --outdir PATH          Root directory to store all output files
                             [required]
  -p, --perc_identity FLOAT  Equivalent to the -perc_identity argument in
                             blastn. Defaults to 95.00.
  -k, --keep_db_seqs         Set this flag to keep the target sequence in
                             addition to the query sequence from BLAST.
  -v, --verbose              Set this flag to enable more verbose logging.
  --version                  Specify this flag to print the version and exit.
  --help                     Show this message and exit.
```
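
As an illustration of how the options above combine, here is a small sketch that assembles a command line and, optionally, runs it from Python. The flags are taken from the help text; the helper names `build_command` and `run`, and the paths `fastas/`, `targets.fasta`, and `results/`, are placeholders chosen for this example only.

```python
import subprocess


def build_command(indir, targets, outdir, perc_identity=None,
                  keep_db_seqs=False, verbose=False):
    """Assemble an argument list using the flags from the help text."""
    cmd = ["seqPresenceAbsence",
           "-i", str(indir),      # directory of FASTA files to query
           "-t", str(targets),    # multi-FASTA of targets of interest
           "-o", str(outdir)]     # root directory for output files
    if perc_identity is not None:
        # When omitted, the tool falls back to its documented default of 95.00.
        cmd += ["-p", str(perc_identity)]
    if keep_db_seqs:
        cmd.append("-k")
    if verbose:
        cmd.append("-v")
    return cmd


def run(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths; printing the command rather than running it.
    print(" ".join(build_command("fastas/", "targets.fasta", "results/",
                                 perc_identity=98.0, verbose=True)))
```

Passing the arguments as a list avoids shell quoting issues when paths contain spaces; calling `run(...)` on the result requires the package and its external tools to be installed.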
### References
- Edgar, R.C. (2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32(5):1792-1797