https://github.com/adamtaranto/quick-ortho-fetch

Takes an XML-formatted multi-query BLAST result, extracts GenBank IDs for the best x hits to each query, and downloads a non-redundant list of matching sequences from NCBI.


Quick-Ortho-Fetch
================================

What: Provides raw material for building phylogenies from paraphyletic gene families.

How: Extracts a non-redundant list of best hits from a multi-query BLAST result and fetches the matching FASTA sequences from NCBI.
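
The fetch step can be pictured as a single request to the public NCBI E-utilities `efetch` endpoint. The sketch below only builds the request URL and does not contact NCBI. The helper name `build_efetch_url` and the sample IDs are hypothetical, and this is not necessarily how `quickOrtho.py` performs the retrieval internally.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for record retrieval.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(ids, database, email):
    """Return an efetch URL requesting FASTA text for the given record IDs.

    Hypothetical helper for illustration; duplicates in `ids` are dropped
    (first occurrence kept) so each sequence is requested only once.
    """
    unique_ids = list(dict.fromkeys(ids))
    params = {
        "db": database,              # "protein" or "nucleotide"
        "id": ",".join(unique_ids),  # comma-separated record IDs
        "rettype": "fasta",
        "retmode": "text",
        "email": email,              # tells NCBI who is making the request
    }
    return EFETCH_URL + "?" + urlencode(params)


# Made-up IDs and address, used only to show the shape of the request.
url = build_efetch_url(["P12345", "Q67890", "P12345"], "protein", "user@example.org")
print(url)
```

For long ID lists, requests would likely need to be sent in batches; that detail is left out of this sketch.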

Example
-------------------------

python quickOrtho.py myMultiBlast.xml protein [email protected] -n 40 -e 1e-5 -o outputFile.fas -t -q

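Takes the top 40 unique gids for each query in the XML file, keeping only hits with an e-value of at most 1e-5, and writes a non-redundant .fas file containing the matching sequences. The `-t` flag also writes a summary table, and `-q` suppresses running feedback.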
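
The selection described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the script's actual code: the function names `top_hits` and `non_redundant` are hypothetical, hits are assumed to be listed best-first within each query (as in standard BLAST XML output), and `Hit_accession` stands in for whatever identifier the script really extracts. The sample XML is a tiny hand-built stand-in for a real report.

```python
import xml.etree.ElementTree as ET


def _hit(accession, evalue):
    # Build one minimal <Hit> element (illustrative data only).
    return (f"<Hit><Hit_accession>{accession}</Hit_accession><Hit_hsps><Hsp>"
            f"<Hsp_evalue>{evalue}</Hsp_evalue></Hsp></Hit_hsps></Hit>")


def _iteration(query, hits):
    # Build one minimal <Iteration> element, i.e. the results for one query.
    return (f"<Iteration><Iteration_query-def>{query}</Iteration_query-def>"
            f"<Iteration_hits>{''.join(hits)}</Iteration_hits></Iteration>")


SAMPLE_XML = (
    "<BlastOutput><BlastOutput_iterations>"
    + _iteration("queryA", [_hit("ACC1", "1e-50"), _hit("ACC2", "1e-30"), _hit("ACC3", "0.5")])
    + _iteration("queryB", [_hit("ACC2", "1e-40"), _hit("ACC4", "1e-10")])
    + "</BlastOutput_iterations></BlastOutput>"
)


def top_hits(xml_text, max_hits, max_evalue):
    """Map each query to its first `max_hits` unique hits passing `max_evalue`."""
    per_query = {}
    for iteration in ET.fromstring(xml_text).iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        kept = []
        for hit in iteration.iter("Hit"):
            accession = hit.findtext("Hit_accession")
            # Judge each hit by its best (lowest) HSP e-value.
            best = min((float(e.text) for e in hit.iter("Hsp_evalue")), default=float("inf"))
            if best <= max_evalue and accession not in kept:
                kept.append(accession)
            if len(kept) >= max_hits:
                break
        per_query[query] = kept
    return per_query


def non_redundant(per_query):
    """Merge per-query hit lists into one list without duplicates, order preserved."""
    merged = {}
    for hits in per_query.values():
        merged.update(dict.fromkeys(hits))
    return list(merged)


selected = top_hits(SAMPLE_XML, max_hits=2, max_evalue=1e-5)
print(selected)
print(non_redundant(selected))
```

A hit shared by several queries, such as `ACC2` here, appears once in the merged list, which is what keeps the downloaded FASTA file non-redundant.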

Options
-------------------------

usage: quickOrtho.py file_dir {protein,nucleotide} email [-h] [-n NUMBER_UNIQUE_GIDS] [-e E_VALUE_THRESHOLD] [-o OUTPUT_DIR] [-t table] [-q quiet]


| Arg | Name | Description |
| --- | --- | --- |
| [1] | file_dir | Directory to NCBI .xml file. |
| [2] | {protein,nucleotide} | Which database to direct entrez query to. |
| [3] | email | Email for entrez record retrieval, tells NCBI who you are. |
| -e | E_VALUE_THRESHOLD | Maximum e-value allowed in screening, enter as decimal or in scientific notation (e.g. 1e-20). Default = 1e-20 |
| -n | NUMBER_UNIQUE_GIDS | Number of unique gids to extract for each query. Default = 50 |
| -o | OUTPUT_DIR | Set name of output fasta file. Default = "'input_dir'_quickOrthoResults.fas" |
| -t | table | Creates a new .txt table summarising top hits for each query. Writes file to "'output_dir'_summaryTable.txt". Recommend viewing this file in a text editor without text wrapping. |
| -q | quiet | Runs the program in quiet mode, with no running feedback |
| -h | help | Print help message and exit |
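
For readers who want to see how an interface like this maps onto `argparse`, here is a minimal sketch that mirrors the usage line and the documented defaults. The `dest` names and the positional name `database` are assumptions chosen for illustration; the script's own parser may be written differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative sketch only)."""
    parser = argparse.ArgumentParser(prog="quickOrtho.py")
    parser.add_argument("file_dir", help="Path to the NCBI .xml file.")
    parser.add_argument("database", choices=["protein", "nucleotide"],
                        help="Which database to direct the Entrez query to.")
    parser.add_argument("email", help="Email for Entrez record retrieval.")
    parser.add_argument("-n", dest="number_unique_gids", type=int, default=50,
                        help="Number of unique gids to extract for each query.")
    parser.add_argument("-e", dest="e_value_threshold", type=float, default=1e-20,
                        help="Maximum e-value allowed in screening.")
    # The README derives the default output name from the input name;
    # that derivation is left out here, so None means "not given".
    parser.add_argument("-o", dest="output_dir", default=None,
                        help="Name of the output FASTA file.")
    parser.add_argument("-t", dest="table", action="store_true",
                        help="Also write a summary table of top hits.")
    parser.add_argument("-q", dest="quiet", action="store_true",
                        help="Run without running feedback.")
    return parser


# Parse the README's example invocation (with a placeholder email address).
args = build_parser().parse_args(
    ["myMultiBlast.xml", "protein", "user@example.org",
     "-n", "40", "-e", "1e-5", "-o", "outputFile.fas", "-t", "-q"]
)
print(args)
```

Using `type=float` for `-e` lets the threshold be entered either as a decimal or in scientific notation, as the table describes.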