{"id":37737892,"url":"https://github.com/drewschaub/protein-design-tools","last_synced_at":"2026-01-16T14:02:55.716Z","repository":{"id":238745483,"uuid":"797452321","full_name":"drewschaub/protein-design-tools","owner":"drewschaub","description":"A library of tools for protein design","archived":false,"fork":false,"pushed_at":"2025-07-15T01:47:12.000Z","size":2066,"stargazers_count":3,"open_issues_count":1,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-29T03:24:55.323Z","etag":null,"topics":["protein-design","structural-bioinformatics"],"latest_commit_sha":null,"homepage":"https://protein-design-tools.readthedocs.io","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/drewschaub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-05-07T21:31:21.000Z","updated_at":"2025-09-20T11:49:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"26dfa40a-4246-4b73-89d2-86abe86582aa","html_url":"https://github.com/drewschaub/protein-design-tools","commit_stats":null,"previous_names":["drewschaub/protein-design-tools"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/drewschaub/protein-design-tools","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drewschaub%2Fprotein-design-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drewschaub%2Fprotein-design-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drewschaub%2Fprotein-design-tools/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drewschaub%2Fprotein-design-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/drewschaub","download_url":"https://codeload.github.com/drewschaub/protein-design-tools/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drewschaub%2Fprotein-design-tools/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28479086,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["protein-design","structural-bioinformatics"],"created_at":"2026-01-16T14:02:55.653Z","updated_at":"2026-01-16T14:02:55.701Z","avatar_url":"https://github.com/drewschaub.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Protein-Design Tools\n\n![Banner](assets/banner.png)\n\n[![PyPI version](https://badge.fury.io/py/protein-design-tools.svg)](https://badge.fury.io/py/protein-design-tools)\n![License](https://img.shields.io/badge/license-MIT-blue.svg) \n![Python Version](https://img.shields.io/pypi/pyversions/protein-design-tools)\n\n## Table of Contents\n\n- [Overview](#overview)\n- [Features](#features)\n- [Installation](#installation)\n- [Quick 
Start](#quick-start)\n- [Detailed Usage](#detailed-usage)\n  - [Reading Protein Structures](#reading-protein-structures)\n  - [Analyzing Sequences](#analyzing-sequences)\n  - [Computing Structural Metrics](#computing-structural-metrics)\n  - [Generating Idealized Structures](#generating-idealized-structures)\n- [Examples](#examples)\n- [Contributing](#contributing)\n- [License](#license)\n- [Contact](#contact)\n\n## Overview\n\n**Protein-Design Tools** is a Python library tailored for structural bioinformatics, with a specific focus on protein design and engineering. It provides a suite of tools for analyzing and manipulating protein structures, enabling researchers and practitioners to perform complex structural comparisons, design new proteins, and engineer existing ones with ease.\n\nWhether you're conducting research in protein folding, designing novel enzymes, or engineering therapeutic proteins, Protein-Design Tools offers the functionalities you need to advance your projects.\n\n## Features\n\n### **Protein Structure Representation**\n- **Core Classes**:\n  - `ProteinStructure`: Represents the entire protein structure.\n  - `Chain`: Represents individual chains within the protein.\n  - `Residue`: Represents residues within chains.\n  - `Atom`: Represents individual atoms within residues.\n- **File Parsing**:\n  - **PDB Support**: Parse and read PDB files seamlessly.\n  - **CIF Support**: Future support planned for CIF files.\n- **Programmatic Construction**:\n  - Build idealized protein structures (e.g., alpha helices) programmatically.\n\n### **Structural Metrics**\nCalculate structural metrics across multiple computational frameworks for flexibility and performance optimization:\n- **RMSD (Root Mean Square Deviation)**: Measure the average distance between atoms of superimposed proteins.\n- **TM-score**: Assess structural similarity normalized by protein length.\n- **GDT-TS (Global Distance Test - Total Score)**: Evaluate global structural similarity using 
multiple distance thresholds.\n- **LDDT (Local Distance Difference Test)**: Measure local structural accuracy.\n\n### **Utilities**\n- **Radius of Gyration**: Compute the radius of gyration for protein structures to assess compactness.\n- **Sequence Analysis**: Extract and manipulate amino acid sequences from structures.\n\n### **Input/Output Support**\n- **File Operations**:\n  - Read and write protein structures in PDB format.\n  - Write FASTA sequences derived from 3D structure files.\n- **Data Export**:\n  - Export coordinates and other structural data in various formats, including HDF5.\n\n### **Extensible Architecture**\n- **Modular Design**: Easily add new metrics, file formats, and functionalities without disrupting existing components.\n- **Multiple Frameworks**: Leverage the strengths of NumPy, PyTorch, and JAX for computational tasks.\n\n## Installation\n\n### 1. Choose the right requirements file\n\nTo keep the repo platform-agnostic, dependencies are split into small files in  \n`requirements/`. Pick the one that matches your hardware/accelerator:\n\n| File | When to use it | Key extra deps |\n|------|----------------|----------------|\n| **`requirements/cpu.txt`**   | CPU-only | `jax[cpu]` |\n| **`requirements/cuda12.txt`**| NVIDIA GPU, CUDA 12 toolchain | `jax[cuda12]` (installs a CUDA-enabled `jaxlib` wheel) |\n| **`requirements/tpu.txt`**   | Google Cloud TPU VMs | `jax[tpu]` + `libtpu` link |\n\nAll three files include `-r requirements/base.txt`, which lists NumPy 1.26,  \nPyTorch ( CPU wheel by default ), FreeSASA, etc.\n\n### 2. Create a virtual environment (recommended)\n\n```bash\npython -m venv .venv\nsource .venv/bin/activate         # macOS/Linux\n# .venv\\Scripts\\activate.bat      # Windows CMD\n# .\\.venv\\Scripts\\Activate.ps1    # Windows PowerShell\n```\n\n### 3. 
Install\n\nCPU-only:\n\n```bash\npip install -r requirements/cpu.txt\n```\n\nNVIDIA GPU:\n\n```bash\npip install -r requirements/cuda12.txt\n```\n\nTPU VM:\n\n```bash\npip install -r requirements/tpu.txt\n```\n\n### 4. Verify\n\n```python\nimport numpy, torch, jax, jaxlib, freesasa\nprint(\"NumPy:\", numpy.__version__)\nprint(\"Torch:\", torch.__version__, \"| CUDA:\", torch.cuda.is_available())\nprint(\"JAX :\", jax.__version__,   \"| jaxlib:\", jaxlib.__version__)\n```\n\n\n## Quick Start\n\nHere's a quick example to get you started with Protein-Design Tools:\n\n```python\nfrom protein_design_tools.core.protein_structure import ProteinStructure\nfrom protein_design_tools.io.pdb_io import read_pdb\nfrom protein_design_tools.metrics import compute_rmsd_numpy, compute_gdt_pytorch\nfrom protein_design_tools.utils.coordinate_utils import get_coordinates, get_masses\n\n# Reading a PDB file\nprotein = read_pdb(\"path/to/file.pdb\", chains=['A', 'B'], name=\"Sample_Protein\")\n\n# Getting sequences\nsequences = protein.get_sequence_dict()\nprint(sequences)\n\n# Getting coordinates of the backbone atoms of residues 1-20 in chain A\ncoords = get_coordinates(protein, atom_type=\"backbone\", chains={'A': range(1, 21)})\n\n# Getting masses of all non-hydrogen atoms\nmasses = get_masses(protein, atom_type=\"non-hydrogen\")\n```\n\nSeveral structural metrics are available, each implemented across multiple computational frameworks:\n\n```python\nimport numpy as np\nimport torch\n\nfrom protein_design_tools.metrics import compute_rmsd_numpy, compute_gdt_pytorch\n\n# Computing RMSD using NumPy\nP = np.random.rand(1000, 3)\nQ = np.random.rand(1000, 3)\nrmsd = compute_rmsd_numpy(P, Q)\nprint(f\"RMSD (NumPy): {rmsd:.4f}\")\n\n# Computing GDT-TS using PyTorch\nP_pt = torch.tensor(P)\nQ_pt = torch.tensor(Q)\ngdt = compute_gdt_pytorch(P_pt, Q_pt)\nprint(f\"GDT-TS (PyTorch): {gdt:.2f}\")\n```\n\n## Detailed Usage\n\n### Reading Protein Structures\n\nProtein-Design Tools supports reading and parsing 
protein structures from PDB files. Future updates will include CIF file support.\n\n```python\nfrom protein_design_tools.io.pdb_io import read_pdb\n\n# Read all chains\nprotein = read_pdb(\"path/to/file.pdb\")\n\n# Read specific chains\nprotein = read_pdb(\"path/to/file.pdb\", chains=['A', 'B'], name=\"My_Protein\")\n```\n\n### Analyzing Sequences\n\nExtract amino acid sequences from the protein structure.\n\n```python\n# Get the sequence of each chain in the protein\nsequence_dict = protein.get_sequence_dict()\nfor chain_id, sequence in sequence_dict.items():\n    print(f\"Chain {chain_id}: {sequence}\")\n```\n\n### Computing Structural Metrics\n\nLeverage multiple frameworks to compute various structural metrics.\n\n```python\nfrom protein_design_tools.metrics import compute_rmsd_numpy, compute_gdt_pytorch\n\n# Example data\nimport numpy as np\nimport torch\n\nP = np.random.rand(1000, 3)\nQ = np.random.rand(1000, 3)\n\n# Compute RMSD using NumPy\nrmsd = compute_rmsd_numpy(P, Q)\nprint(f\"RMSD (NumPy): {rmsd:.4f}\")\n\n# Compute GDT-TS using PyTorch\nP_pt = torch.tensor(P)\nQ_pt = torch.tensor(Q)\ngdt = compute_gdt_pytorch(P_pt, Q_pt)\nprint(f\"GDT-TS (PyTorch): {gdt:.2f}\")\n```\n\n### Generating Idealized Structures\n\nCreate idealized protein structures programmatically, such as an alpha helix.\n\n```python\nfrom protein_design_tools.io.builder import build_ideal_alpha_helix\n\n# Build an idealized alpha helix with 10 residues\nideal_helix = build_ideal_alpha_helix(sequence_length=10, chain_id='A', start_res_seq=1)\n\n# Display sequence\nsequence_dict = ideal_helix.get_sequence_dict()\nprint(sequence_dict)\n```\n\n## Examples\n\n### Calculating the Radius of Gyration\n\nCalculate the radius of gyration for a protein and compare it to an idealized alpha helix.\n\n```python\nfrom protein_design_tools.core.protein_structure import ProteinStructure\nfrom protein_design_tools.io.pdb_io import read_pdb\nfrom protein_design_tools.metrics import compute_radgyr, 
compute_radgyr_ratio\n# Assumption: compute_radgyr_alanine_helix is exported by the same metrics module\nfrom protein_design_tools.metrics import compute_radgyr_alanine_helix\n\n# Read the protein structure\nprotein = read_pdb(\"example.pdb\")\n\n# Display the amino acid sequence of the protein\nsequence_dict = protein.get_sequence_dict()\nfor chain_id, sequence in sequence_dict.items():\n    print(f\"Chain {chain_id}: {sequence}\")\n\n# Calculate the radius of gyration of the backbone of chain A\nrgA = compute_radgyr(protein, chains={'A'}, atom_type=\"backbone\")\nprint(f\"Protein Structure Chain A Radius of Gyration: {rgA:.4f}\")\n\n# Calculate the radius of gyration of an ideal alanine helix\nideal_helix_seq_length = len(sequence_dict['A'])\nrg_ideal_helix = compute_radgyr_alanine_helix(ideal_helix_seq_length, atom_type=\"backbone\")\nprint(f\"Ideal Alanine Helix Radius of Gyration: {rg_ideal_helix:.4f}\")\n\n# Calculate the radius of gyration ratio\nrg_ratio = compute_radgyr_ratio(protein, chains={'A'}, atom_type=\"backbone\")\nprint(f\"Radius of Gyration Ratio: {rg_ratio:.4f}\")\n```\n\n### Comparing Protein Structures Using TM-score\n\nAssess the structural similarity between two protein structures.\n\n```python\nimport numpy as np\n\nfrom protein_design_tools.metrics import compute_tmscore_numpy\n\n# Assume P and Q are NumPy arrays of shape (N, 3) representing atom coordinates\nP = np.random.rand(1000, 3)\nQ = np.random.rand(1000, 3)\n\n# Compute TM-score using NumPy\ntm_score = compute_tmscore_numpy(P, Q)\nprint(f\"TM-score (NumPy): {tm_score:.4f}\")\n```\n\n## Contributing\n\nContributions are welcome! Whether you're fixing bugs, improving documentation, or adding new features, your help is greatly appreciated.\n\n1. **Fork the Repository**: Click the \"Fork\" button at the top right of the repository page.\n2. **Clone Your Fork**:\n    ```bash\n    git clone https://github.com/your-username/protein-design-tools.git\n    ```\n3. **Create a New Branch**:\n    ```bash\n    git checkout -b feature/YourFeatureName\n    ```\n4. **Make Your Changes**: Implement your feature or fix.\n5. 
**Commit Your Changes**:\n    ```bash\n    git commit -m \"Add feature: YourFeatureName\"\n    ```\n6. **Push to Your Fork**:\n    ```bash\n    git push origin feature/YourFeatureName\n    ```\n7. **Create a Pull Request**: Go to the original repository and create a pull request from your fork.\n\nFor major changes, please open an issue first to discuss what you would like to change.\n\n### Development Guidelines\n\n- Follow PEP 8 style guidelines.\n- Write clear and concise docstrings for all functions and classes.\n- Include unit tests for new features or bug fixes.\n- Ensure that existing tests pass before submitting a pull request.\n\n## License\n\nThis project is licensed under the MIT License.\n\n## Contact\n\nFor any questions, suggestions, or contributions, please reach out:\n- **Author**: Andrew Schaub\n- **LinkedIn**: https://www.linkedin.com/in/andrewjschaub\n- **GitHub**: https://github.com/drewschaub/protein-design-tools\n\n---\n\nThank you for using Protein-Design Tools! We hope it serves as a valuable resource in your structural bioinformatics and protein engineering endeavors.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrewschaub%2Fprotein-design-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdrewschaub%2Fprotein-design-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrewschaub%2Fprotein-design-tools/lists"}