https://github.com/benhg/extract-phylogenetic-marker-homologs

Host: GitHub
URL: https://github.com/benhg/extract-phylogenetic-marker-homologs
Owner: benhg
Created: 2020-03-25T18:23:06.000Z (almost 5 years ago)
Default Branch: master
Last Pushed: 2020-03-27T22:17:41.000Z (almost 5 years ago)
Last Synced: 2024-11-08T21:41:29.568Z (2 months ago)
Language: Python
Size: 148 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# extract-phylogenetic-marker-homologs

1. For each phylogenetic marker, search the subset of the transcriptome whose transcripts are >= 80% of the length of the query, then

2. Search those hits against the transcriptomes using the programs you and Ellen developed for mapping against the reads and building confidence in the taxon source.

For step 1, am I essentially creating a new BLAST database made up of only the transcripts that are at least 80% of the length of each phylogenetic marker, then running BLAST with that marker against that modified database?

Then taking those hits and plugging them into STAR (step 2)?
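The length filter described for step 1 could be sketched in Python as follows. This is only an illustration under stated assumptions, not the repository's actual code: the function names `read_fasta` and `filter_by_length`, the `min_fraction` parameter, and the file paths are hypothetical, and the marker length is assumed to be known in advance.

```python
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(transcripts_path: str, marker_length: int,
                     out_path: str, min_fraction: float = 0.8) -> int:
    """Write transcripts at least min_fraction * marker_length long.

    Returns the number of transcripts kept. The 0.8 default mirrors the
    80% cutoff from the notes; everything else here is an assumption.
    """
    threshold = min_fraction * marker_length
    kept = 0
    with open(out_path, "w") as out:
        for header, seq in read_fasta(transcripts_path):
            if len(seq) >= threshold:
                out.write(f">{header}\n{seq}\n")
                kept += 1
    return kept
```

The output of a filter like this would play the role of `sufficient_length.fasta` in the commands below, with one filtered file per marker.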

NB: `awk 'BEGIN{RS=">"}NR>1{sub("\n","\t"); gsub("\n",""); print RS$0}' sufficient_length.fasta | awk '!seen[$1]++' | awk -v OFS="\n" '{print $1,$2}' > deduped.fasta` deduplicates the FASTA file, keeping the first record for each sequence ID.
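For readers less comfortable with awk, roughly the same deduplication can be sketched in Python. This is an approximation rather than a drop-in replacement: `iter_fasta` and `dedupe_fasta` are hypothetical names, and the sketch assumes (like the awk pipeline) that the first whitespace-delimited token of each header is the sequence ID and that the first record seen for an ID is the one to keep.

```python
from typing import Iterator, Tuple


def iter_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs; multi-line sequences are joined."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_fasta(in_path: str, out_path: str) -> int:
    """Keep the first record per sequence ID; return the count written."""
    seen = set()
    written = 0
    with open(out_path, "w") as out:
        for header, seq in iter_fasta(in_path):
            fields = header.split()
            seq_id = fields[0] if fields else ""
            if seq_id in seen:
                continue  # a record with this ID was already written
            seen.add(seq_id)
            # Only the ID is written, so header descriptions are dropped.
            out.write(f">{seq_id}\n{seq}\n")
            written += 1
    return written
```

Calling `dedupe_fasta("sufficient_length.fasta", "deduped.fasta")` would produce one two-line record per unique ID.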

NB: `makeblastdb -in deduped.fasta -out 16S_Periegops -parse_seqids -dbtype nucl` makes the BLAST database.

NB: `blastn -db 12s_Drymusa -query ../../input_sequences.fasta -out results.out` is the basic BLAST command.
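Since the database name passed to `blastn -db` has to match the one given to `makeblastdb -out`, it may help to build both commands from a single name. The sketch below uses only the flags that appear in the commands above; the helper names `makeblastdb_cmd`, `blastn_cmd`, and `run` are hypothetical, and actually running the commands assumes BLAST+ is installed and on the `PATH`.

```python
import shlex
import subprocess
from typing import List


def makeblastdb_cmd(fasta: str, db_name: str) -> List[str]:
    """Build the makeblastdb call for a nucleotide database."""
    return ["makeblastdb", "-in", fasta, "-out", db_name,
            "-parse_seqids", "-dbtype", "nucl"]


def blastn_cmd(db_name: str, query: str, out_path: str) -> List[str]:
    """Build the basic blastn call against a named database."""
    return ["blastn", "-db", db_name, "-query", query, "-out", out_path]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    db = "16S_Periegops"  # one name shared by both steps
    run(makeblastdb_cmd("deduped.fasta", db))
    run(blastn_cmd(db, "../../input_sequences.fasta", "results.out"))
```

With `dry_run=True` (the default here) the script only prints the two command lines, which makes it easy to check them before running anything.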