https://github.com/zsailer/phylopandas

Pandas DataFrames for phylogenetics
biopython pandas phylogenetics python

Last synced: about 2 months ago

Host: GitHub
URL: https://github.com/zsailer/phylopandas
Owner: Zsailer
License: bsd-3-clause
Created: 2017-10-24T19:38:59.000Z (about 8 years ago)
Default Branch: master
Last Pushed: 2021-06-06T13:21:02.000Z (over 4 years ago)
Last Synced: 2025-09-25T05:31:58.216Z (about 2 months ago)
Topics: biopython, pandas, phylogenetics, python
Language: Python
Homepage:
Size: 982 KB
Stars: 74
Watchers: 5
Forks: 21
Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE

README

[![Gitter chat](https://badges.gitter.im/gitterHQ/gitter.png)](https://gitter.im/phylopandas/Lobby)

[![Documentation Status](http://readthedocs.org/projects/phylopandas/badge/?version=latest)](http://phylopandas.readthedocs.io/en/latest/?badge=latest)

[![Build Status](https://travis-ci.org/Zsailer/phylopandas.svg?branch=master)](https://travis-ci.org/Zsailer/phylopandas)

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/Zsailer/phylopandas/master?filepath=examples%2Fintro-notebook.ipynb)

**Bringing the [Pandas](https://github.com/pandas-dev/pandas) `DataFrame` to phylogenetics.**

PhyloPandas provides a Pandas-like interface for reading sequence and phylogenetic tree data into pandas DataFrames. This enables easy manipulation of phylogenetic data using familiar Python/Pandas functions. Finally, phylogenetics for humans!



## How does it work?

Don't worry, we didn't reinvent the wheel. **PhyloPandas** is simply a [DataFrame](https://github.com/pandas-dev/pandas) (great for human-accessible data storage) interface on top of [Biopython](https://github.com/biopython/biopython) (great for parsing/writing sequence data) and [DendroPy](https://github.com/jeetsukumaran/DendroPy) (great for reading tree data).

PhyloPandas does two things:

1. It offers new `read` functions to read sequence/tree data directly into a DataFrame.

2. It attaches a new `phylo` **accessor** to the Pandas DataFrame. This accessor provides writing methods for sequence/tree data (powered by Biopython and DendroPy).
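
To illustrate the accessor idea in point 2, here is a minimal sketch that uses pandas' own `register_dataframe_accessor` API. It is not PhyloPandas's implementation: the accessor name `demo_phylo`, the `to_fasta` method shown here, and the column names `id` and `sequence` are all assumptions made for the example.

```python
import pandas as pd


# "demo_phylo" is a hypothetical accessor name, chosen so it does not
# clash with the real `phylo` accessor that PhyloPandas registers.
@pd.api.extensions.register_dataframe_accessor("demo_phylo")
class DemoPhyloAccessor:
    def __init__(self, pandas_obj):
        # pandas passes the DataFrame the accessor is attached to.
        self._df = pandas_obj

    def to_fasta(self, id_col="id", sequence_col="sequence"):
        """Return the rows as a FASTA-formatted string.

        The column names are assumptions for this sketch.
        """
        lines = []
        for seq_id, seq in zip(self._df[id_col], self._df[sequence_col]):
            lines.append(f">{seq_id}")
            lines.append(str(seq))
        return "\n".join(lines) + "\n"


df = pd.DataFrame({"id": ["seq1", "seq2"], "sequence": ["ACGT", "GGCA"]})
fasta_text = df.demo_phylo.to_fasta()
print(fasta_text)
```

Once registered, the accessor is available on every DataFrame as `df.demo_phylo`, which is the same pattern that makes `df.phylo` available after importing PhyloPandas.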

## Basic Usage

**Sequence data:**

Read in a sequence file.

```python
import phylopandas as ph

df1 = ph.read_fasta('sequences.fasta')
df2 = ph.read_phylip('sequences.phy')
```

Write to various sequence file formats.

```python
df1.phylo.to_clustal('sequences.clustal')
```

Convert between formats.

```python
# Read a format.
df = ph.read_fasta('sequences.fasta')

# Write to a different format.
df.phylo.to_phylip('sequences.phy')
```
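
Because the data lives in an ordinary DataFrame, it can be filtered or annotated with standard pandas operations between reading and writing. The sketch below uses a hand-built DataFrame in place of the result of `ph.read_fasta`; the column names `id` and `sequence` are assumptions for illustration and may not match the columns PhyloPandas produces.

```python
import pandas as pd

# Hand-built stand-in for a DataFrame returned by ph.read_fasta(...).
# The column names "id" and "sequence" are assumptions for this sketch.
df = pd.DataFrame({
    "id": ["seq1", "seq2", "seq3"],
    "sequence": ["ACGTACGT", "ACG", "ACGTAC"],
})

# Keep only sequences with at least 6 characters.
long_enough = df[df["sequence"].str.len() >= 6].reset_index(drop=True)

# Add a derived column using ordinary pandas operations.
long_enough = long_enough.assign(length=long_enough["sequence"].str.len())
print(long_enough)
```

With PhyloPandas imported, a filtered DataFrame like this could then be written out through the `phylo` accessor, for example with `to_phylip` as in the conversion example above.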

**Tree data:**

Read Newick tree data.

```python
df = ph.read_newick('tree.newick')
```
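
One way to picture how a tree can fit in a table is one row per node, each pointing to its parent. The sketch below flattens a small hand-built tree into such rows using only the standard library. It is an illustration of the idea, not PhyloPandas's parser or schema: the function `flatten_tree`, the nested-dict input, and the row keys `id`, `parent`, `length`, and `type` are all hypothetical.

```python
def flatten_tree(node, parent=None, rows=None):
    """Flatten a nested-dict tree into a list of row dicts (pre-order).

    Each row records the node id, its parent's id, its branch length,
    and whether it is a leaf or an internal node.
    """
    if rows is None:
        rows = []
    children = node.get("children", [])
    rows.append({
        "id": node["id"],
        "parent": parent,
        "length": node.get("length"),
        "type": "leaf" if not children else "node",
    })
    for child in children:
        flatten_tree(child, parent=node["id"], rows=rows)
    return rows


# Hand-built equivalent of the Newick string (A:0.1,(B:0.3,C:0.4)n1:0.2)root;
tree = {"id": "root", "children": [
    {"id": "A", "length": 0.1},
    {"id": "n1", "length": 0.2, "children": [
        {"id": "B", "length": 0.3},
        {"id": "C", "length": 0.4},
    ]},
]}

rows = flatten_tree(tree)
for row in rows:
    print(row)
```

A list of rows like this can be passed straight to `pandas.DataFrame(rows)`, after which ancestors and descendants become ordinary column lookups.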

Visualize the phylogenetic data (powered by [phylovega](https://github.com/Zsailer/phylovega)).

```python
df.phylo.display(
    height=500,
)
```



## Contributing

If you have ideas for the project, please share them on the project's [Gitter chat](https://gitter.im/phylopandas/Lobby).

It's *easy* to create new read/write functions and methods for PhyloPandas. If you have a format you'd like to add, please submit PRs! There are many more formats in Biopython that I haven't had the time to add myself, so please don't be afraid to add them! I thank you ahead of time!

## Testing

PhyloPandas includes a small [pytest](https://docs.pytest.org/en/latest/) suite. Run these tests from the base directory.

```
$ cd phylopandas
$ pytest
```

## Install

Install from PyPI:

```
pip install phylopandas
```

Install from source:

```
git clone https://github.com/Zsailer/phylopandas
cd phylopandas
pip install -e .
```

## Dependencies

- [Biopython](https://github.com/biopython/biopython): Library for managing and manipulating biological data.

- [DendroPy](https://github.com/jeetsukumaran/DendroPy): Library for phylogenetic scripting, simulation, data processing and manipulation.

- [Pandas](https://github.com/pandas-dev/pandas): Flexible and powerful data analysis / manipulation library for Python.

- [pandas_flavor](https://github.com/Zsailer/pandas_flavor): Flavor pandas objects with new accessors using pandas' new register API (with backwards compatibility).
