https://github.com/benhg/extract-phylogenetic-marker-homologs
- Host: GitHub
- URL: https://github.com/benhg/extract-phylogenetic-marker-homologs
- Owner: benhg
- Created: 2020-03-25T18:23:06.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-03-27T22:17:41.000Z (almost 5 years ago)
- Last Synced: 2024-11-08T21:41:29.568Z (2 months ago)
- Language: Python
- Size: 148 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# extract-phylogenetic-marker-homologs
1. For each of the phylogenetic markers, search the transcriptome subsets that are at least 80% of the length of the query, then
2. search those hits against the transcriptomes using the programs you and Ellen developed for mapping against the reads and building confidence in the taxon source.
For step 1, am I essentially creating a new BLAST database made up of only transcripts longer than 80% of the length of each phylogenetic marker, then running BLAST with that marker against that modified database? Then taking those hits and plugging them into STAR (step 2)?
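The length filter in step 1 can be sketched in Python as follows. This is only an illustration, not the repository's code: the function names `read_fasta` and `filter_by_length` are hypothetical, and the cutoff is assumed to be inclusive (at least 80% of the marker length), following the wording of step 1.

```python
from typing import Iterator, List, Tuple


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    transcripts: str, marker_length: int, min_fraction: float = 0.8
) -> List[Tuple[str, str]]:
    """Keep transcripts at least min_fraction of marker_length long.

    The inclusive comparison (>=) is an assumption based on step 1.
    """
    cutoff = min_fraction * marker_length
    return [(h, s) for h, s in read_fasta(transcripts) if len(s) >= cutoff]


if __name__ == "__main__":
    example = ">t1\nACGTACGTAC\n>t2\nACGT\n>t3\nACGTACGT\n"
    # With a 10 bp marker the cutoff is 8 bp.
    for header, seq in filter_by_length(example, marker_length=10):
        print(f">{header}\n{seq}")
```

The surviving records would then be written to a file such as `sufficient_length.fasta`, which is the input name used in the commands below.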
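NB: `awk 'BEGIN{RS=">"}NR>1{sub("\n","\t"); gsub("\n",""); print RS$0}' sufficient_length.fasta | awk '!seen[$1]++' | awk -v OFS="\n" '{print $1,$2}' > deduped.fasta` to deduplicate a FASTA file by sequence ID. This assumes the headers contain no whitespace, because the last `awk` prints only the first two whitespace-separated fields.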
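For comparison, here is a rough Python equivalent of the deduplication step. It is a sketch rather than the repository's method, and the function name `dedupe_fasta` is hypothetical. Like the `awk` pipeline, it keeps the first record for each sequence ID (the first whitespace-delimited token of the header). Unlike it, it keeps the full header line, so headers with descriptions are not truncated.

```python
def dedupe_fasta(text: str) -> str:
    """Return FASTA text keeping only the first record for each sequence ID."""
    seen = set()
    kept = []
    # Split on ">" as the awk one-liner does with RS=">"; the first piece is
    # whatever precedes the first record (normally empty) and is skipped.
    for block in text.split(">")[1:]:
        lines = block.splitlines()
        if not lines:
            continue
        header = lines[0].strip()
        # Join wrapped sequence lines into a single line.
        sequence = "".join(line.strip() for line in lines[1:])
        parts = header.split()
        seq_id = parts[0] if parts else ""
        if seq_id in seen:
            continue
        seen.add(seq_id)
        kept.append(f">{header}\n{sequence}\n")
    return "".join(kept)


if __name__ == "__main__":
    example = ">a desc one\nACGT\nAC\n>b\nGGGG\n>a other\nTTTT\n"
    print(dedupe_fasta(example), end="")
```

Duplicate IDs matter here because `makeblastdb` is run with `-parse_seqids` in the next step.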
NB: `makeblastdb -in deduped.fasta -out 16S_Periegops -parse_seqids -dbtype nucl` to make the BLAST database
NB: `blastn -db 12s_Drymusa -query ../../input_sequences.fasta -out results.out` is the basic BLAST command
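The two BLAST commands above could be chained from Python roughly as below. This is a sketch under assumptions: the helper names are hypothetical, and the same database name is passed to both commands, whereas the two notes above use different example names (`16S_Periegops` and `12s_Drymusa`). Only the flags that already appear in those notes are used.

```python
import subprocess
from typing import Callable, List


def makeblastdb_cmd(fasta: str, db_name: str) -> List[str]:
    """Build the makeblastdb argument list for a nucleotide database."""
    return ["makeblastdb", "-in", fasta, "-out", db_name,
            "-parse_seqids", "-dbtype", "nucl"]


def blastn_cmd(db_name: str, query: str, out: str) -> List[str]:
    """Build the blastn argument list for querying db_name."""
    return ["blastn", "-db", db_name, "-query", query, "-out", out]


def build_and_search(
    fasta: str,
    db_name: str,
    query: str,
    out: str,
    runner: Callable = subprocess.run,
) -> None:
    """Make the database, then search it; stop on the first failure.

    The runner argument is there so the commands can be inspected or
    mocked without BLAST+ installed.
    """
    for cmd in (makeblastdb_cmd(fasta, db_name),
                blastn_cmd(db_name, query, out)):
        runner(cmd, check=True)
```

With BLAST+ on the `PATH`, a call such as `build_and_search("deduped.fasta", "16S_Periegops", "../../input_sequences.fasta", "results.out")` would run both steps in order.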