https://github.com/adamtaranto/quick-ortho-fetch
Takes an XML-formatted multi-query BLAST result, extracts GenBank IDs for the best x hits to each query, and downloads a non-redundant list of matching sequences from NCBI.
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/adamtaranto/quick-ortho-fetch
- Owner: Adamtaranto
- Created: 2013-11-08T01:57:14.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2014-01-23T06:39:26.000Z (almost 11 years ago)
- Last Synced: 2024-10-13T20:45:33.627Z (3 months ago)
- Language: Python
- Size: 155 KB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
Quick-Ortho-Fetch
================================

What: Provides raw material for building phylogenies from paraphyletic gene families.

How: Extracts a non-redundant list of best hits from a multi-query BLAST result and fetches the matching FASTA sequences from NCBI.
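
The extraction step can be sketched with the standard library alone. This is a minimal illustration, not the code in `quickOrtho.py`: the function names `top_hits_per_query` and `non_redundant` are hypothetical, and it assumes standard NCBI BLAST XML output, where each query is an `Iteration` element and hits are listed best-first.

```python
import xml.etree.ElementTree as ET


def top_hits_per_query(xml_text, n=50, max_evalue=1e-20):
    """Map each query to its first n unique hit IDs that pass the e-value cutoff.

    Hypothetical helper: relies on BLAST listing hits best-first within a query.
    """
    root = ET.fromstring(xml_text)
    results = {}
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def", default="unknown")
        kept = []
        for hit in iteration.iter("Hit"):
            hit_id = hit.findtext("Hit_id")
            # Judge a hit by its best (lowest) HSP e-value.
            evalues = [float(e.text) for e in hit.iter("Hsp_evalue")]
            if not hit_id or not evalues or min(evalues) > max_evalue:
                continue
            if hit_id not in kept:
                kept.append(hit_id)
            if len(kept) == n:
                break
        results[query] = kept
    return results


def non_redundant(hits_by_query):
    """Merge the per-query lists into one list, dropping repeats, keeping order."""
    seen = {}
    for ids in hits_by_query.values():
        for hit_id in ids:
            seen.setdefault(hit_id, None)
    return list(seen)
```

Collecting IDs per query first and merging afterwards keeps the per-query limit independent of how much the queries overlap, which seems to match the "top x hits to each query" wording above.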
Example
-------------------------

    python quickOrtho.py myMultiBlast.xml protein [email protected] -n 40 -e 1e-5 -o outputFile.fas -t -q
Takes the top 40 unique gids for each query from the XML file and writes a non-redundant .fas file containing the matching sequences.
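
The download half of the example could look roughly like the sketch below. The README does not say how the script talks to NCBI, so this assumes plain HTTP requests to the public E-utilities `efetch` endpoint. The helper names (`efetch_url`, `batches`, `fetch_fasta`) and the batch size of 200 are illustrative choices, not taken from the project.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(ids, db, email):
    """Build an efetch URL requesting FASTA text for a list of IDs."""
    params = {
        "db": db,                # "protein" or "nucleotide", as on the command line
        "id": ",".join(ids),     # comma-separated ID list
        "rettype": "fasta",
        "retmode": "text",
        "email": email,          # tells NCBI who is making the request
    }
    return EFETCH + "?" + urlencode(params)


def batches(ids, size=200):
    """Yield the ID list in chunks so each request stays a manageable size."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_fasta(ids, db, email, out_path):
    """Download all records and append them to one FASTA file (needs network)."""
    with open(out_path, "w") as out:
        for chunk in batches(ids):
            with urlopen(efetch_url(chunk, db, email)) as response:
                out.write(response.read().decode("utf-8"))
```

Because the ID list is already non-redundant before this step, each sequence is requested only once.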
Options
-------------------------

    usage: quickOrtho.py file_dir {protein,nucleotide} email [-h] [-n NUMBER_UNIQUE_GIDS] [-e E_VALUE_THRESHOLD] [-o OUTPUT_DIR] [-t table] [-q quiet]

| Arg | Name | Description |
|-----|------|-------------|
| [1] | file_dir | Path to the NCBI BLAST .xml file. |
| [2] | {protein,nucleotide} | Which database to direct the Entrez query to. |
| [3] | email | Email for Entrez record retrieval; tells NCBI who you are. |
| -e | E_VALUE_THRESHOLD | Maximum e-value allowed in screening. Enter as a decimal or in scientific notation (e.g. 1e-20). Default = 1e-20 |
| -n | NUMBER_UNIQUE_GIDS | Number of unique gids to extract for each query. Default = 50 |
| -o | OUTPUT_DIR | Name of the output FASTA file. Default = "'input_dir'_quickOrthoResults.fas" |
| -t | table | Creates a new .txt table summarising the top hits for each query. Writes the file to "'output_dir'_summaryTable.txt". Best viewed in a text editor with text wrapping turned off. |
| -q | quiet | Runs the program in quiet mode, with no running feedback. |
| -h | help | Prints the help message and exits. |
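
For readers who want to see how the options fit together, the usage line can be mirrored with `argparse` as below. This is a reconstruction from the table, not the script's actual parser: the attribute names, the `parse_args` helper, and the way the default output name is derived (appending the suffix to `file_dir` unchanged) are assumptions.

```python
import argparse


def build_parser():
    """Assemble a parser matching the documented positionals and flags."""
    parser = argparse.ArgumentParser(prog="quickOrtho.py")
    parser.add_argument("file_dir", help="path to the NCBI BLAST .xml file")
    parser.add_argument("database", choices=["protein", "nucleotide"],
                        help="database to direct the Entrez query to")
    parser.add_argument("email", help="contact email passed to NCBI")
    parser.add_argument("-n", dest="number_unique_gids", type=int, default=50,
                        help="unique gids to extract per query")
    parser.add_argument("-e", dest="e_value_threshold", type=float, default=1e-20,
                        help="maximum e-value allowed in screening")
    parser.add_argument("-o", dest="output_dir", default=None,
                        help="name of the output FASTA file")
    parser.add_argument("-t", dest="table", action="store_true",
                        help="also write a summary table of top hits")
    parser.add_argument("-q", dest="quiet", action="store_true",
                        help="suppress running feedback")
    return parser


def parse_args(argv):
    """Parse argv and fill in the documented default output name."""
    args = build_parser().parse_args(argv)
    if args.output_dir is None:
        # Assumption: the suffix is appended to the input name as given.
        args.output_dir = args.file_dir + "_quickOrthoResults.fas"
    return args
```

Calling `parse_args(["myMultiBlast.xml", "protein", "user@example.org", "-n", "40", "-e", "1e-5"])` (the email here is a placeholder) would yield a limit of 40 gids per query and an e-value cutoff of 1e-5, with the output name falling back to the default.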