https://github.com/bigdatagenomics/pacmin
Assembler for PacBio reads. Apache 2 licensed.
- Host: GitHub
- URL: https://github.com/bigdatagenomics/pacmin
- Owner: bigdatagenomics
- License: apache-2.0
- Created: 2014-08-30T00:51:37.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2015-03-14T04:51:10.000Z (almost 10 years ago)
- Last Synced: 2024-03-27T11:27:48.720Z (9 months ago)
- Language: Scala
- Homepage: http://www.bdgenomics.org
- Size: 247 KB
- Stars: 3
- Watchers: 7
- Forks: 3
- Open Issues: 4
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
PacMin
======

Assembler for PacBio reads.

# Methods
We'll overlap the PacBio reads using the MinHash sketch method proposed in:
```
Berlin, Konstantin, et al. "Assembling Large Genomes with Single-Molecule Sequencing and Locality Sensitive Hashing." bioRxiv (2014): 008003.
```

Once the reads are overlapped, we will assemble the reads into a string graph. String graphs are described in:
```
Myers, Eugene W. "The fragment assembly string graph." Bioinformatics 21.suppl 2 (2005): ii79-ii85.
```

We do not assume that reads are "correct"; instead, we will maintain "probabilistic" overlaps between the fragments in the string graph.
Once we have obtained these probabilistic overlaps, we can estimate the ploidy of each overlap by normalizing the overlap coverage by length,
and can then apply traditional genotyping methods (e.g., the likelihood estimation stages used in SAMtools) to find the consensus sequences at each overlap.

# Getting Started
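To make the MinHash overlap step from the Methods section more concrete, here is a minimal sketch of the general technique. It is an illustration only, not PacMin's implementation: the object name `MinHashSketch`, the parameters `k` and `numHashes`, and the use of the standard library's `MurmurHash3` as the seeded hash family are all assumptions made for the example.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical helper, not part of PacMin: MinHash sketches over read k-mers.
object MinHashSketch {

  /** All distinct k-mers (length-k substrings) of a read. */
  def kmers(read: String, k: Int): Set[String] =
    read.sliding(k).filter(_.length == k).toSet

  /**
   * Builds a sketch of `numHashes` values. Each position holds the minimum
   * hash of any k-mer in the read under one seeded hash function.
   */
  def sketch(read: String, k: Int, numHashes: Int): Array[Int] = {
    val ks = kmers(read, k)
    require(ks.nonEmpty, s"read must contain at least $k bases")
    Array.tabulate(numHashes) { seed =>
      ks.iterator.map(kmer => MurmurHash3.stringHash(kmer, seed)).min
    }
  }

  /**
   * Fraction of sketch positions that agree. This estimates the Jaccard
   * similarity of the two reads' k-mer sets.
   */
  def similarity(a: Array[Int], b: Array[Int]): Double = {
    require(a.length == b.length, "sketches must have the same length")
    a.zip(b).count { case (x, y) => x == y }.toDouble / a.length
  }
}
```

Two reads would be compared by sketching each with the same `k` and `numHashes` and passing the results to `similarity`. Read pairs scoring above some chosen threshold become candidate overlaps. The threshold and parameter values would need tuning for noisy PacBio data and are not specified here.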
## Building PacMin
PacMin uses [Maven](http://maven.apache.org/) to build. To build PacMin, `cd` into the repository and run `mvn package`.
## Running PacMin
PacMin is packaged via [appassembler](http://mojo.codehaus.org/appassembler/appassembler-maven-plugin/) and includes all necessary
dependencies.

You might want to add the following to your `.bashrc` to make running `pacmin` easier:
```
alias pacmin=". $PACMIN_HOME/pacmin-cli/target/appassembler/bin/pacmin"
```

`$PACMIN_HOME` should be the path to where you have checked PacMin out on your local filesystem.
To change any Java options (e.g., to set the maximum heap size with `-Xmx4g`, or to pass Java properties),
set the `$JAVA_OPTS` environment variable. Additional details about customizing the appassembler
runtime can be found [here](http://mojo.codehaus.org/appassembler/appassembler-maven-plugin/usage-script.html).

Once this alias is in place, you can run PacMin by simply typing `pacmin` at the command line.
# Getting In Touch
# License
PacMin is released under an [Apache 2.0 license](https://github.com/bigdatagenomics/PacMin/blob/master/LICENSE).
# Distribution
Snapshots of PacMin are available from the [Sonatype OSS](https://oss.sonatype.org) repository:
```
<groupId>org.bdgenomics.pacmin</groupId>
<artifactId>pacmin-core</artifactId>
<version>0.0.1-SNAPSHOT</version>
```

Once we've got a release, we will publish to [Maven Central](http://search.maven.org).
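
For a project built with sbt rather than Maven, the snapshot coordinates listed above could be consumed roughly as follows. This is a sketch under assumptions: the resolver name is arbitrary, the URL is the usual Sonatype OSS snapshots location, and it is not confirmed here that a `0.0.1-SNAPSHOT` artifact is still published.

```scala
// build.sbt of a hypothetical consumer project

// Assumption: snapshots are served from the standard Sonatype OSS snapshots URL.
resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"

// Single % because the artifact ID as listed carries no Scala-version suffix.
libraryDependencies += "org.bdgenomics.pacmin" % "pacmin-core" % "0.0.1-SNAPSHOT"
```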