https://github.com/lczech/genesis

A library for working with phylogenetic and population genetic data.

c-plus-plus evolutionary-placement phylogenetic-data phylogenetic-placements phylogenetic-trees phylogenetics placement pool-sequencing population-genetics


[![genesis](/doc/logo/logo_readme.png?raw=true "genesis")](http://genesis-lib.org/)

A library for working with phylogenetic and population genetic data.

[![CI](https://github.com/lczech/genesis/workflows/CI/badge.svg?branch=master)](https://github.com/lczech/genesis/actions)
[![Softwipe Score](https://img.shields.io/badge/softwipe-9.0/10.0-brightgreen)](https://github.com/adrianzap/softwipe/wiki/Code-Quality-Benchmark)
[![License](https://img.shields.io/badge/license-GPLv3-blue.svg)](http://www.gnu.org/licenses/gpl.html)
![Language](https://img.shields.io/badge/language-C%2B%2B11-lightgrey.svg)
[![Platforms](https://img.shields.io/conda/pn/bioconda/gappa)](https://anaconda.org/bioconda/gappa)


[![Release](https://img.shields.io/github/v/release/lczech/genesis.svg)](https://github.com/lczech/genesis/releases)
[![DOI](https://img.shields.io/badge/doi-10.1093%2Fbioinformatics%2Fbtaa070-blue)](https://doi.org/10.1093/bioinformatics/btaa070)

Features
-------------------

Genesis is a C++ library for working with phylogenetic and population genetic data:

* **Trees**
    * Read, annotate and write trees in various formats.
    * Versatile tree data structure that can store any data on the edges and nodes.
    * Easily iterate trees with different policies (e.g., postorder, preorder).
    * Directly draw trees with colored branches to SVG files.
* **Placements**
    * Read, manipulate and write `jplace` files from phylogenetic placement analyses.
    * Manipulate placement data: extract, filter, merge, and much more.
    * Calculate distance measures (e.g., KR distance, EDPL).
    * Run analyses like k-means Clustering, Squash Clustering, Edge PCA.
    * Visualize aspects like read abundances or correlation with meta-data on the branches of the tree.
* **Populations**
    * Read and work with genome mapping and variant formats such as `sam`/`bam`/`cram`, `pileup`, `sync`, and `vcf`, as well as auxiliary formats such as `gff`/`gtf`, `bim`/`map`, and `bed`.
    * Iterate positions in a genome, individually or in different types of windows.
    * Compute statistics such as Tajima's D and F_ST for pool sequencing data.
* **Sequences** and **Taxonomies**
    * Read, filter, manipulate and write sequences in `fasta`, `fastq`, and `phylip` format.
    * Calculate consensus sequences with different methods.
    * Work with taxonomic paths and build a taxonomic hierarchy.
* **Utilities**
    * Math tools (matrices, histograms, statistics functions, etc.).
    * Color support (color lists, gradients, etc., for making colored trees).
    * Various supportive file formats (bmp, csv, json, xml and more).
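
To illustrate what the preorder and postorder iteration policies mentioned in the list mean, here is a minimal, self-contained C++11 sketch. It does not use the Genesis API: the `Node` struct and the functions `make_node`, `preorder`, `postorder`, and `example_tree` are hypothetical names made up for this example only. See the API reference for the actual tree classes and iterators.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal tree node for illustration; NOT the Genesis tree data structure.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Create a node with the given name and no children.
std::unique_ptr<Node> make_node(std::string const& name) {
    std::unique_ptr<Node> node(new Node());
    node->name = name;
    return node;
}

// Preorder: record a node before visiting its children.
void preorder(Node const& node, std::vector<std::string>& out) {
    out.push_back(node.name);
    for (auto const& child : node.children) {
        assert(child);
        preorder(*child, out);
    }
}

// Postorder: record a node after all of its children have been visited.
void postorder(Node const& node, std::vector<std::string>& out) {
    for (auto const& child : node.children) {
        assert(child);
        postorder(*child, out);
    }
    out.push_back(node.name);
}

// Build the small example tree ((A,B)X,C)R; in Newick notation.
std::unique_ptr<Node> example_tree() {
    auto root = make_node("R");
    auto inner = make_node("X");
    inner->children.push_back(make_node("A"));
    inner->children.push_back(make_node("B"));
    root->children.push_back(std::move(inner));
    root->children.push_back(make_node("C"));
    return root;
}
```

For the example tree, `preorder` records the names in the order R, X, A, B, C, while `postorder` records A, B, X, C, R. Postorder is the natural choice when a value at a node depends on values already computed for its children.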

This is just an overview of the more prominent features.
See the [API reference](http://doc.genesis-lib.org/namespaces.html) for more.
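
As a rough illustration of one of the listed features, consensus sequences, the following self-contained C++11 sketch computes a simple majority-rule consensus of aligned sequences. This is only one of several possible consensus methods and is not taken from Genesis: the function name `majority_consensus` is hypothetical, gaps are treated like any other character, and ties are broken by the smaller character code, which is an arbitrary choice for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the Genesis API.
// Returns, for each alignment column, the most frequent character.
// Assumes a non-empty set of sequences that all have the same length.
// Ties are broken in favor of the character with the smaller code.
std::string majority_consensus(std::vector<std::string> const& seqs) {
    assert(!seqs.empty());
    std::size_t const len = seqs.front().size();

    std::string result;
    result.reserve(len);
    for (std::size_t pos = 0; pos < len; ++pos) {
        // Count how often each possible character occurs in this column.
        std::array<std::size_t, 256> counts{};
        for (auto const& seq : seqs) {
            assert(seq.size() == len);
            ++counts[static_cast<unsigned char>(seq[pos])];
        }

        // Pick the character with the highest count; strict '>' keeps
        // the smaller character code on ties.
        std::size_t best = 0;
        for (std::size_t c = 1; c < counts.size(); ++c) {
            if (counts[c] > counts[best]) {
                best = c;
            }
        }
        result.push_back(static_cast<char>(best));
    }
    return result;
}
```

For the three aligned sequences `ACGT`, `ACGA`, and `TCGA`, this sketch yields `ACGA`. Methods that handle ambiguity codes or frequency thresholds would need additional logic.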

Genesis is intended for researchers and developers who want to build their own
tools and methods, or run their own custom analyses. If you simply want to analyze your
data with our methods, have a look at our command line tool [Gappa](https://github.com/lczech/gappa),
which covers many common phylogenetic placement analyses.

Setup and Getting Started
-------------------

For download and build instructions, see **[Setup](http://doc.genesis-lib.org/setup.html)**.

You can find all the information needed to get started with Genesis in the
**[documentation](http://doc.genesis-lib.org/)**.
It contains a user manual with setup instructions and tutorials, as well as the full API reference.

For **bug reports and feature requests** concerning Genesis, please
[open an issue on our GitHub page](https://github.com/lczech/genesis/issues).

For **user support** of the phylogenetic placement parts of the library, please see our
[Phylogenetic Placement Google Group](https://groups.google.com/forum/#!forum/phylogenetic-placement).
It is intended for discussions about phylogenetic placement,
and for user support for our software tools, such as [EPA-ng](https://github.com/Pbdas/epa-ng)
and [Gappa](https://github.com/lczech/gappa).

Showcases
-------------------

A main focus of the library is working with phylogenetic placements.
The following figure summarizes the placement positions of 7.5 million short reads on a
reference tree with 190 taxa. The color code indicates the number of reads placed
on each branch.

![Phylogenetic tree with coloured branches.](/doc/png/placement/visualize_placements.png?raw=true "Phylogenetic tree with coloured branches.")

This and other methods are presented in our manuscripts

> Methods for Inference of Automatic Reference Phylogenies and Multilevel Phylogenetic Placement.

> Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.

> Bioinformatics, 2018. https://doi.org/10.1093/bioinformatics/bty767

and

> Scalable Methods for Analyzing and Visualizing Phylogenetic Placement of Metagenomic Samples.

> Lucas Czech and Alexandros Stamatakis.

> PLOS One, 2019. https://doi.org/10.1371/journal.pone.0217050

See these papers for more on what Genesis can do.

Citation
-------------------

When using Genesis, please cite

> Genesis and Gappa: processing, analyzing and visualizing phylogenetic (placement) data.

> Lucas Czech, Pierre Barbera, and Alexandros Stamatakis.

> Bioinformatics, 2020. https://doi.org/10.1093/bioinformatics/btaa070

Also, see [Gappa](https://github.com/lczech/gappa) for our command line tool to run your own analyses.