https://github.com/sjackman/abyss-drosophila-melanogaster
:microscope: Assemble Drosophila melanogaster with ABySS
- Host: GitHub
- URL: https://github.com/sjackman/abyss-drosophila-melanogaster
- Owner: sjackman
- License: mit
- Created: 2017-08-31T22:01:27.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2017-11-30T20:12:13.000Z (over 7 years ago)
- Last Synced: 2025-01-21T20:48:40.406Z (4 months ago)
- Topics: abyss, assembly, drosophila, genome
- Language: HTML
- Homepage: http://sjackman.ca/abyss-drosophila-melanogaster/dmelanogaster.samtobreak.nb.html
- Size: 645 KB
- Stars: 3
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Assemble *Drosophila melanogaster* with ABySS
I assembled an Illumina short-read sequencing data set of *Drosophila melanogaster* with ABySS 2.0.1. The data set includes a 2x101 paired-end library and a 2x110 mate-pair library. I assembled the reads both with and without adapter trimming of the mate-pair library using [NxTrim (doi:10.1101/007666)](https://github.com/sequencing/NxTrim), and compared the results of the two methods. I optimized *k* and *N* and selected the best assembly (see Methods). I aligned the assembled scaftigs to the reference (BDGP6) using [BWA-MEM](https://github.com/lh3/bwa) and calculated assembly metrics using `abyss-samtobreak` and `abyss-fac`. I find that trimming the mate-pair reads with NxTrim yields an assembly that is both more contiguous (higher scaffold NGA50) and more correct (fewer breakpoints compared to the reference).
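For readers less familiar with the contiguity metrics named above, here is a minimal sketch of how an N50 or NG50 can be computed from a list of sequence lengths. The `ng50` helper and the toy lengths are illustrative assumptions and not part of this repository. NGA50 follows the same arithmetic but is computed over aligned block lengths (sequences broken at misassemblies) rather than raw sequence lengths; the reported numbers come from `abyss-samtobreak` and `abyss-fac`, not from this helper.

```sh
# ng50 G: read one sequence length per line on stdin and print the NG50
# for genome size G. Passing the total assembly length as G gives the N50.
# (Hypothetical helper for illustration only.)
ng50() {
    sort -rn | awk -v G="$1" '
        { sum += $1 }                                 # running total, longest first
        sum >= G / 2 { print $1; found = 1; exit }    # first length reaching half of G
        END { if (!found) print "NA" }                # sequences cover less than G/2
    '
}

# Toy lengths, invented for illustration (not taken from this assembly).
# The lengths sum to 150, so G=150 yields the N50.
printf '%s\n' 50 40 30 20 10 | ng50 150   # prints 40
```

With the genome size used in this project, the call would be `ng50 143725995`.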
# Data
* Paired-end:
* Mate-pair:
# Results
+ [Report of assembly metrics](http://sjackman.ca/abyss-drosophila-melanogaster/dmelanogaster.samtobreak.nb.html) using RMarkdown
+ Assembly metrics of `abyss-samtobreak`: [dmelanogaster.samtobreak.tsv](dmelanogaster.samtobreak.tsv)
+ Assembly metrics of `abyss-fac`: [dmelanogaster.abyss-fac.tsv](dmelanogaster.abyss-fac.tsv)
# Methods
I optimized *k* by trying every value between 32 and 64 with a step size of 8 and selected the assembly with the best NGA50.
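The sweep over *k* described above can be sketched as a dry run that prints one `abyss-pe` command per value of *k*, following the one-directory-per-*k* pattern suggested by the `k48` directory used in this repository. The `sweep_k` function name and the `in=../reads.fq.gz` input are placeholders of mine; the actual assemblies take the paired-end and mate-pair libraries as input, with parameters defined in the Makefile.

```sh
# Dry-run sketch: print (rather than execute) the commands for each k.
# Remove the leading "echo" on both lines to actually run the assemblies.
sweep_k() {
    for k in $(seq 32 8 64); do    # k = 32 40 48 56 64
        echo mkdir -p "k$k"
        # in=../reads.fq.gz is a placeholder, not the repository's real input.
        echo abyss-pe -C "k$k" name=dmelanogaster k="$k" in=../reads.fq.gz
    done
}

sweep_k
```

Each assembly would then be aligned to the reference and scored, and the *k* with the best NGA50 kept.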
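I optimized *N* by trying every value between 10 and 25 (plus 5, as in the loop below) and selecting the assembly with the best N50.
```sh
cd nxtrim/abyss/k48
# Sweep abyss-scaffold's -n (minimum number of pairs) over 5 and 10..25.
for n in 5 {10..25}; do
    echo "n=$n"
    # Discard stdout and keep only the last line of stderr.
    abyss-scaffold -k48 -G143725995 -s1000-10000 -n"$n" dmelanogaster-6.dot mp1-6.dist.dot 2>&1 >/dev/null | tail -n1
done
```
# Pipeline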
[Makefile.pdf](Makefile.pdf)