https://github.com/dillondaudert/proteindatasets

Creating and manipulating various protein sequence-structure datasets using Python, Julia, and other tools.
bioinformatics biopython blast dataset dssp fasta julia jupyter jupyter-notebook pandas protein psiblast python3 secondary structure tensorflow uniref50

# Protein sequence and structure datasets

This repo contains scripts for creating various protein sequence and structure
datasets, as well as guides on how to use them.

## Contents

### proteinfeatures
Protein amino acid features.

### cpdb
Working with the cullPDB dataset created in [Zhou & Troyanskaya,
2014](https://arxiv.org/abs/1403.1347).

### cpdb2
Creating a new protein sequence-structure dataset following the methods used for
the cullPDB dataset, referred to as cpdb2.

### psiblast
Scripts for calling NCBI BLAST+ `psiblast` from Biopython on large FASTA files
and handling the results using multiprocessing.
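
As a rough illustration of the idea (not the scripts in this repo), the sketch below splits a FASTA file into single-record queries and runs `psiblast` on them in parallel with `multiprocessing`. It assumes the BLAST+ `psiblast` binary is on your `PATH` and that a protein database has already been built. The function names (`read_fasta`, `psiblast_command`, `run_one`, `run_all`) and the default of 3 iterations are hypothetical choices made for this example. To stay dependency-free it parses FASTA with the standard library and calls the binary through `subprocess` rather than through Biopython.

```python
import subprocess
from multiprocessing import Pool
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, one record at a time."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def psiblast_command(query_path, db, out_path, num_iterations=3):
    """Build the psiblast argument list (hypothetical helper for this sketch)."""
    return [
        "psiblast",
        "-query", str(query_path),
        "-db", str(db),
        "-num_iterations", str(num_iterations),
        "-out", str(out_path),
    ]


def run_one(job):
    """Write one record to its own query file and run psiblast on it."""
    index, header, sequence, db, workdir = job
    query_path = Path(workdir) / f"query_{index}.fasta"
    out_path = Path(workdir) / f"query_{index}.out"
    query_path.write_text(f">{header}\n{sequence}\n")
    result = subprocess.run(
        psiblast_command(query_path, db, out_path),
        capture_output=True,
        text=True,
    )
    # Return the exit code so the caller can see which queries failed.
    return header, result.returncode


def run_all(fasta_path, db, workdir, processes=4):
    """Run psiblast on every record of fasta_path using a process pool."""
    jobs = [
        (i, header, seq, db, workdir)
        for i, (header, seq) in enumerate(read_fasta(fasta_path))
    ]
    with Pool(processes) as pool:
        return pool.map(run_one, jobs)
```

A call such as `run_all("proteins.fasta", "uniref50", "psiblast_out")` (placeholder paths, with the output directory already created) returns a list of `(header, returncode)` pairs. Writing one query file per record keeps the jobs independent, at the cost of many small files.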