https://github.com/borgwardtlab/proteinshake

Protein structure datasets for machine learning.
Important note: ProteinShake has broadened its scope and is continued in other repositories



Check out our latest work on the bioverse (all biomolecules) and rnaglib (for RNA).

We will continue to maintain existing functionality in this repository. Please submit issues or PRs for any bugs or suggestions.

---






![build](https://img.shields.io/github/actions/workflow/status/borgwardtlab/proteinshake/build.yml?color=%2303A9F4&style=for-the-badge)
[![pypi](https://img.shields.io/pypi/v/proteinshake?color=%2303A9F4&style=for-the-badge)](https://pypi.org/project/proteinshake/)
[![docs](https://img.shields.io/readthedocs/proteinshake?color=%2303A9F4&style=for-the-badge)](https://proteinshake.readthedocs.io/en/latest/?badge=latest)
[![downloads](https://img.shields.io/pypi/dm/proteinshake?color=%2303A9F4&style=for-the-badge)](https://pypi.org/project/proteinshake/)
[![codecov](https://img.shields.io/codecov/c/gh/BorgwardtLab/proteinshake?color=%2303A9F4&style=for-the-badge&token=0NL6CQZ6MB)](https://codecov.io/gh/BorgwardtLab/proteinshake)

     

### ProteinShake provides one-liner imports of large-scale, preprocessed protein structure datasets and tasks for various model types and frameworks.

We provide a collection of preprocessed and cleaned protein 3D structure datasets from RCSB and AlphaFoldDB, including annotations. Structures are easily converted to graphs, voxels, or point clouds and loaded natively into PyTorch, TensorFlow, NumPy, JAX, PyTorch Geometric, DGL and NetworkX. The task API enables standardized benchmarking on a variety of tasks at the protein and residue level.
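To make the graph and voxel representations concrete, here is a minimal pure-Python sketch of what such conversions mean conceptually. It is not ProteinShake's implementation or API: the function names `epsilon_graph` and `voxelize`, the toy coordinates, and the 8 Å and 10 Å parameters are all assumptions chosen for illustration. See the Documentation for the actual conversion methods.

```python
import math
from itertools import combinations

# Hypothetical toy input: one (x, y, z) coordinate in Angstrom per residue.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (20.0, 0.0, 0.0),
]


def epsilon_graph(points, eps):
    """Return edges (i, j) with i < j for residue pairs closer than eps."""
    return [
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if math.dist(p, q) < eps
    ]


def voxelize(points, voxel_size):
    """Map each occupied grid cell to the number of residues it contains."""
    grid = {}
    for p in points:
        # Integer cell index along each axis.
        cell = tuple(int(math.floor(c / voxel_size)) for c in p)
        grid[cell] = grid.get(cell, 0) + 1
    return grid


# Illustrative parameters (assumed): 8 Angstrom cutoff, 10 Angstrom voxels.
edges = epsilon_graph(coords, eps=8.0)
voxels = voxelize(coords, voxel_size=10.0)

print(edges)
print(voxels)
```

The first three residues lie within the cutoff of each other and so form a connected triangle, while the distant fourth residue stays isolated. A point cloud, in this picture, is simply the raw `coords` list.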

Find more information on the Website and the Documentation, or check out the Tutorials. The results of the paper and the baseline models can be found in the Evaluation Repository. If you would like to create your own release, see the Release Repository.


**Installation:**



```
pip install proteinshake
```

*Code in this repository is licensed under [BSD-3](https://github.com/BorgwardtLab/proteinshake/blob/main/LICENSE), the dataset files on Zenodo are licensed under [CC-BY-4.0](https://creativecommons.org/licenses/by/4.0/legalcode).*

*To build ProteinShake, we obtained and modified data from various sources. Please see the documentation of the respective dataset classes for a reference to the original data, license, and paper.*