https://github.com/kavrakilab/ape-gen



# APE-Gen: Anchored Peptide-MHC Ensemble Generator

## Description:

- APE-Gen is a tool that generates multiple clash-free conformations of a peptide bound to an MHC

- All that is required for input is the sequence of the peptide and the MHC allotype name (if supported)

- Minimal example: `python APE_Gen.py QFKDNVILL HLA-A*24:02`

- For a peptide with "n" residues, APE-Gen uses a template of another peptide with "n" residues to place the anchor residues in the correct pockets of the MHC
    - to change the template, modify `n-mer-templates.txt` for the corresponding n-mer, then add the template PDB file (peptide and MHC) to the `templates/` folder

- A receptor model is also needed to run APE-Gen
    - if the allotype of the MHC is supported (found in `receptor-class-templates.txt`), simply pass its name as input to APE-Gen
    - if the allotype is not supported, but you have a PDB file of the receptor:
        - place the PDB file inside the `templates/` folder
        - modify `receptor-class-templates.txt` to include an appropriate name for it
        - then call APE-Gen with the chosen name
    - one way to obtain a model of the receptor is to use the provided scripts in `modeller_scripts/` for homology modelling (see below)

- PDB file preparation:
    - For PDB files to be used as input anywhere in the script, the chains must be labelled in a particular way
    - chain A is the heavy alpha chain, chain B is the light chain (beta-2-microglobulin), chain C is the peptide

- Specifying receptor degrees of freedom
    - modify `flex_res.txt` to add/remove residues that are allowed to be flexible during the SMINA minimization step

- `dunbrack.bin` and `loco.score` are files required for the RCD step
- `align.py` is required for using PYMOL for alignment of PDB files
- `minimize.py` is needed for further minimization using OpenMM
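
The chain-labelling convention above can be checked before a PDB file is handed to APE-Gen. The following is a minimal sketch, not part of APE-Gen itself: the function names are hypothetical, and it assumes a standard fixed-column PDB file in which the chain ID sits in column 22 of each `ATOM` record.

```python
# Hypothetical helper (not shipped with APE-Gen): checks that a PDB file
# uses the chain labels A (heavy chain), B (light chain), C (peptide).
EXPECTED_CHAINS = ("A", "B", "C")


def residues_per_chain(pdb_lines):
    """Count distinct residues per chain ID, using ATOM records only."""
    residues = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain_id = line[21]      # PDB column 22: chain identifier
        residue_key = line[22:27]  # columns 23-27: residue number + insertion code
        residues.setdefault(chain_id, set()).add(residue_key)
    return {chain: len(keys) for chain, keys in residues.items()}


def check_chain_labels(pdb_lines):
    """Return (missing, unexpected) chain IDs relative to the A/B/C convention."""
    found = residues_per_chain(pdb_lines)
    missing = [c for c in EXPECTED_CHAINS if c not in found]
    unexpected = sorted(c for c in found if c not in EXPECTED_CHAINS)
    return missing, unexpected
```

To use it, read a file with `open(path).read().splitlines()` and pass the lines in; two empty lists mean the labels match the convention. The residue count for chain C can also be compared against the expected peptide length.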

## Output

- Each round of APE-Gen is saved within a folder with the index of the round (counting from 0)

- `full_system_confs/` contains the ensemble of conformations of peptide and MHC after energy refinement and filtering
    - each conformation is named by the index of the loop as generated by RCD
    - not every conformation makes it past the energy refinement and filtering step
- `peptide_confs.pdb` contains only the peptide conformations
- `filtered_energies.npz` is a NumPy file that contains the energies of the ensemble according to the SMINA scoring function
    - it contains two arrays of the same size: `filtered_indices`, which contains the index of each conformation, and `filtered_energies`, which contains the corresponding energies
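
As an illustration of how `filtered_energies.npz` might be consumed, here is a small sketch that pairs each conformation index with its energy and sorts from lowest to highest. The array names `filtered_indices` and `filtered_energies` come from the description above; the function names are hypothetical, and the sketch assumes that a lower score means a better conformation.

```python
def rank_conformations(indices, energies):
    """Return (index, energy) pairs sorted from lowest to highest energy."""
    if len(indices) != len(energies):
        raise ValueError("indices and energies must have the same length")
    pairs = [(index, float(energy)) for index, energy in zip(indices, energies)]
    return sorted(pairs, key=lambda pair: pair[1])


def load_filtered_energies(npz_path):
    """Load the two arrays stored in filtered_energies.npz."""
    # Imported here so rank_conformations stays usable without NumPy.
    import numpy as np

    with np.load(npz_path) as data:
        return data["filtered_indices"], data["filtered_energies"]
```

For example, `rank_conformations(*load_filtered_energies("0/filtered_energies.npz"))[0]` would give the index and energy of the lowest-energy conformation of round 0, assuming the file sits directly inside the round folder.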

## Helper scripts

- `get_pMHC_pdb.py`
    - usage: `python get_pMHC_pdb.py `
    - assumes pdb code is of a peptide-MHC structure
    - adds missing atoms/residues, removes all waters and ions, labels chains as A,B,C where chain C is the peptide

- `mutate.py`
    - usage: `~/pymol/bin/pymol -qc mutate.py `
    - example: `~/pymol/bin/pymol -qc mutate.py 0.pdb C/1/ ALA 0_mutated.pdb`

## Options help:

```
usage: APE_Gen.py [-h] [-n NUM_CORES] [-l NUM_LOOPS] [-t RCD_DIST_TOL] [-r]
                  [-d] [-p] [-a ANCHOR_TOL] [-o] [-g NUM_ROUNDS]
                  [-b {receptor_only,pep_and_recept}] [-s] [--use_gpu]
                  [--no_progress] [--clean_rcd]
                  peptide_input receptor_class

Anchored Peptide-MHC Ensemble Generator

positional arguments:
  peptide_input         Sequence of peptide to dock or pdbfile of crystal
                        structure
  receptor_class        Class descriptor of MHC receptor. Use REDOCK along
                        with crystal input to perform redocking. Or pass a PDB
                        file with receptor

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_CORES, --num_cores NUM_CORES
                        Number of cores to use for RCD and smina computations.
                        (default: 8)
  -l NUM_LOOPS, --num_loops NUM_LOOPS
                        Number of loops to generate with RCD. (Note that the
                        final number of sampled conformations may be less due
                        to steric clashes. (default: 100)
  -t RCD_DIST_TOL, --RCD_dist_tol RCD_DIST_TOL
                        RCD tolerance (in angstroms) of inner residues when
                        performing IK (default: 1.0)
  -r, --rigid_receptor  Disable sampling of receptor degrees of freedom
                        specified in flex_res.txt (default: False)
  -d, --debug           Print extra information for debugging (default: False)
  -p, --save_only_pep_confs
                        Disable saving full conformations (peptide and MHC)
                        (default: False)
  -a ANCHOR_TOL, --anchor_tol ANCHOR_TOL
                        Anchor tolerance (in angstroms) of first and last
                        backbone atoms of peptide when filtering (default:
                        2.0)
  -o, --score_with_openmm
                        Rescore full conformations with openmm (AMBER)
                        (default: False)
  -g NUM_ROUNDS, --num_rounds NUM_ROUNDS
                        Number of rounds to perform. (default: 1)
  -b {receptor_only,pep_and_recept}, --pass_type {receptor_only,pep_and_recept}
                        When using multiple rounds, pass best scoring
                        conformation across different rounds (choose either
                        'receptor_only' or 'pep_and_recept') (default:
                        receptor_only)
  -s, --min_with_smina  Minimize with SMINA instead of the default Vinardo
                        (default: False)
  --use_gpu             Use GPU for OpenMM Minimization step (default: False)
  --no_progress         Do not print progress bar (default: False)
  --clean_rcd           Remove RCD folder at the end of each round (default:
                        False)
```
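
When APE-Gen is driven from another Python script, the options above can be assembled into an argument list. The sketch below is one possible wrapper, not part of APE-Gen: the function names and the `script` parameter are hypothetical, and only a few of the documented flags (`-n`, `-l`, `-g`, `-r`) are covered.

```python
import subprocess
import sys


def build_ape_gen_command(peptide, receptor_class, num_cores=8, num_loops=100,
                          num_rounds=1, rigid_receptor=False, script="APE_Gen.py"):
    """Build the argument list for an APE_Gen.py call (defaults mirror the help text)."""
    command = [
        sys.executable, script, peptide, receptor_class,
        "-n", str(num_cores),
        "-l", str(num_loops),
        "-g", str(num_rounds),
    ]
    if rigid_receptor:
        command.append("-r")  # disable receptor flexibility from flex_res.txt
    return command


def run_ape_gen(command):
    """Run the command and raise CalledProcessError if APE-Gen exits with an error."""
    return subprocess.run(command, check=True)
```

Passing the arguments as a list (rather than one shell string) means the `*` in an allotype name such as `HLA-A*24:02` is never interpreted by a shell. A call might look like `run_ape_gen(build_ape_gen_command("QFKDNVILL", "HLA-A*24:02", num_rounds=2))`.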

## Installation instructions:

1) install Miniconda
    - https://conda.io/miniconda.html

2) using conda, install the following
    - `conda install -c bioconda smina`
    - `conda install -c omnia pdbfixer`
    - `conda install -c conda-forge mdtraj`
    - `conda install -c schrodinger pymol`
    - `conda install -c bioconda autodock-vina`
    - `conda install -c omnia -c conda-forge openmm`

3) install RCD (v1.40)
    - http://chaconlab.org/modeling/rcd/rcd-download
    - make sure RCD is added to the path so that `rcd` is a command in the terminal
    - Intel MKL may be needed (`conda install -c intel mkl`) and added to the library path
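
After installation, a quick check that the external tools are reachable can save a failed run. This is a rough sketch using only the Python standard library. Only `rcd` is named as a command in the steps above; the other executable names (`smina`, `pymol`, `vina`) and the module names (`mdtraj`, `pdbfixer`) are assumptions about how the conda packages expose themselves and may need adjusting.

```python
import importlib.util
import shutil

# Assumed names; only `rcd` is stated explicitly in the installation steps.
ASSUMED_EXECUTABLES = ("rcd", "smina", "pymol", "vina")
ASSUMED_MODULES = ("mdtraj", "pdbfixer")


def missing_executables(names=ASSUMED_EXECUTABLES):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names=ASSUMED_MODULES):
    """Return the top-level Python modules that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing executables:", missing_executables())
    print("missing modules:", missing_modules())
```

Two empty lists suggest the environment is ready; anything listed points to a package that still needs installing or a path that still needs to be set.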

## Using Modeller script

```
usage: model_receptor.py [-h] [-n NUM_MODELS] alpha_chain_seq template

Homology Modeling of HLAs using Modeller

positional arguments:
  alpha_chain_seq       Fasta file (.fasta) containing the sequence of the
                        alpha chain of HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, the program will
                        try to download the sequence from EMBL-EBI.
  template              PDB of the template HLA or name of HLA allele (ex.
                        HLA-A*02:01). If allele name given, a template based
                        on the allele's supertype (as defined in
                        supertype_templates.csv) will be chosen.

optional arguments:
  -h, --help            show this help message and exit
  -n NUM_MODELS, --num_models NUM_MODELS
                        Number of models to sample with Modeller (default: 10)
```

- Requires Modeller, Biopython, BeautifulSoup4, and Requests
    - `conda install -c salilab modeller`
    - `conda install -c conda-forge biopython`
    - `conda install -c anaconda beautifulsoup4`
    - `conda install -c anaconda requests`
- The model with the best DOPE score is saved as `best_model.pdb`
- Example: `python model_receptor.py P01892.fasta 3I6L.pdb`
    - Modeller requires a license key
    - models HLA-A*02:01 using 3I6L as a template
    - 3I6L contains a model of HLA-A*24:02
- Even simpler example: `python model_receptor.py HLA-A*02:01 HLA-A*02:01`
    - the first argument says that the sequence of `HLA-A*02:01` will be downloaded
    - the second argument says that a template PDB will be downloaded based on a representative allele from the same supertype classification
## Instructions to run Minimal Example with Docker

- Open a terminal
- Pull the image from Docker Hub: `docker pull jayab867/apegen:v2.0`
- Go to the directory where you would like the APE-Gen results to be saved
- Create a container that links the current working directory to a directory in the container called `/data`
    - `docker run -it --rm -v $(pwd):/data --workdir "/data" jayab867/apegen:v2.0`
- Run APE-Gen: `python /APE-Gen/APE_Gen.py QFKDNVILL HLA-A*24:02`
- Exit the container with Ctrl-D
- There will be a number of folders which contain the results for each round of APE-Gen
    - The default is one round, so this example produces a single folder called `0/`
- In each folder:
    - The best scoring conformation for each round is called `min_energy_system.pdb`
    - The whole ensemble generated is located in `full_system_confs/`
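
Once a run has finished, the round folders described above can be inspected with a few lines of Python. The sketch below is a hypothetical helper, not part of APE-Gen. It relies on the folder and file names given in this README (`0/`, `min_energy_system.pdb`, `full_system_confs/`) and assumes that each conformation in the ensemble is stored as a `.pdb` file.

```python
from pathlib import Path


def summarize_rounds(results_dir):
    """Map each round index to whether its best structure exists and its ensemble size."""
    summary = {}
    for folder in sorted(Path(results_dir).iterdir()):
        # Round folders are named by their index (0, 1, 2, ...); skip anything else.
        if not (folder.is_dir() and folder.name.isdigit()):
            continue
        ensemble_dir = folder / "full_system_confs"
        # Assumption: one .pdb file per conformation in the ensemble folder.
        ensemble_size = len(list(ensemble_dir.glob("*.pdb"))) if ensemble_dir.is_dir() else 0
        summary[int(folder.name)] = {
            "has_best": (folder / "min_energy_system.pdb").is_file(),
            "ensemble_size": ensemble_size,
        }
    return summary
```

Calling `summarize_rounds(".")` from the directory that was mounted as `/data` would report, for the minimal example, a single entry for round `0`.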