https://github.com/oaklight/protein-score-server
a score inference server using protein folding models and FastAPI
- Host: GitHub
- URL: https://github.com/oaklight/protein-score-server
- Owner: Oaklight
- License: apache-2.0
- Created: 2024-09-04T22:54:21.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2025-05-15T20:42:05.000Z (about 1 year ago)
- Last Synced: 2025-09-18T14:13:28.453Z (9 months ago)
- Topics: alphafold3, async, computational-biology, esmfold, protein-structure, server
- Language: Python
- Homepage: https://oaklight.github.io/protein-score-server/
- Size: 74.7 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Protein Structure Score Server
## 1. Introduction
This server is a protein structure prediction tool that processes prediction requests from users and returns various scores for protein sequences.
## 2. Installation
To install the environment, follow these steps:
```bash
git clone https://github.com/Oaklight/protein-score-server.git
cd protein-score-server
conda env create -f environment.yaml
conda activate esm
pip install -r requirements.txt
```
## 3. Server Configuration
**Configuration File**:
- Copy `server.yaml.sample` to `server.yaml`:
```bash
cp server.yaml.sample server.yaml
```
- Edit `server.yaml` with your settings.
The server uses the `server.yaml` file for configuration. Currently configurable items include:
* `api_key`: API key for Hugging Face Hub login.
* `history_path`: History result storage path.
* `intermediate_pdb_path`: Intermediate PDB file storage path.
* `model`: Model configuration.
  - `name`: Model name, either `esmfold` or `protenix` (ByteDance's AlphaFold3 implementation).
  - `replica`: Mapping between GPU devices and replica counts, in `: ` format. For `esmfold`, use `_: ` instead.
* `task_queue_size`: Task queue size; defaults to 50.
* `timeout`: Timeout for async prediction result retrieval; defaults to 15 seconds.
* `backbone_pdb`:
  - `reversed_index`: Path to the reverse index from PDB ID to PDB file path.
  - `parquet_prefix`: Path prefix for Parquet files.
  - `pdb_prefix`: Path prefix for PDB files.
For an example, see [`server.yaml`](./server.yaml).
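As a rough illustration of how the documented defaults could be applied, the sketch below merges an already-parsed configuration with `task_queue_size = 50` and `timeout = 15` and checks the model name. The helper `with_defaults` and the plain-dict input are assumptions made for illustration; the server's actual loading code may work differently.

```python
# Defaults documented above; all other keys must come from server.yaml.
DEFAULTS = {"task_queue_size": 50, "timeout": 15}
SUPPORTED_MODELS = {"esmfold", "protenix"}


def with_defaults(config: dict) -> dict:
    """Return a copy of `config` with the documented defaults filled in.

    `config` stands in for the parsed contents of server.yaml.
    This is a hypothetical helper, not part of the server's code.
    """
    merged = {**DEFAULTS, **config}
    model_name = merged.get("model", {}).get("name")
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model name: {model_name!r}")
    return merged
```

Values given in the configuration take precedence over the defaults, so an explicit `timeout` is preserved.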
After the configuration is set, run these commands inside the project folder:
```bash
conda activate esm
uvicorn main:app --host 0.0.0.0 --port 8000
```
## 4. Usage
### 4.1. Request Prediction
Users can send `POST` requests to `http://your-host:8000/predict/` to get predictions. The request body comprises these fields: `seq`, `name`, `type`, `seq2`.
* `seq`: String, representing the protein sequence.
* `name`: String, representing the name of the reference protein.
* `type`: String, representing the task type, currently supports **"plddt", "tmscore", "sc-tmscore", "pdb"**.
* `seq2`: String, representing the sequence of the reference protein. **Used only for the `sc-tmscore` task, where you may provide either `seq2` or `name`.**
1. **pLDDT**
```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "plddt"
}
```
2. **TMscore** (`name` is required for tasks that need a reference structure)
```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "name": "1a0a.A",
  "type": "tmscore"
}
```
3. **sc-TMscore** (provide either `seq2` or `name`)
```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "seq2": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "sc-tmscore"
}
```
or
```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "name": "1a0a.A",
  "type": "sc-tmscore"
}
```
4. **pdb**
```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "pdb"
}
```
The server returns a JSON response containing two fields: `job_id` and `prediction`.
* `job_id`: String, representing the task ID (a UUID4 hex string).
* `prediction`: String; at this stage it only indicates that the prediction is being processed and is not otherwise useful.
```json
{
  "job_id": "0a98a981748c4b7eacfd5e0957905ced",
  "prediction": ...
}
```
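The request rules above can be sketched as a small client using only the Python standard library. The function names `build_payload` and `submit_job` are hypothetical, and the client-side validation simply mirrors the field rules described in this section; the server may enforce them differently.

```python
import json
import urllib.request

TASK_TYPES = {"plddt", "tmscore", "sc-tmscore", "pdb"}


def build_payload(seq, task_type, name=None, seq2=None):
    """Build the JSON body for POST /predict/ following the field rules above."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unknown task type: {task_type!r}")
    if task_type == "tmscore" and name is None:
        raise ValueError("tmscore requires `name` (reference structure)")
    if task_type == "sc-tmscore" and name is None and seq2 is None:
        raise ValueError("sc-tmscore requires either `seq2` or `name`")
    payload = {"seq": seq, "type": task_type}
    if name is not None:
        payload["name"] = name
    if seq2 is not None:
        payload["seq2"] = seq2
    return payload


def submit_job(base_url, payload):
    """POST the payload and return the `job_id` from the JSON response."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/predict/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

For example, `submit_job("http://your-host:8000", build_payload(seq, "plddt"))` would submit a pLDDT task, assuming the server is reachable at that address.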
### 4.2. Result Retrieval
Users can send `GET` requests to `http://your-host:8000/result/{job_id}` to retrieve prediction results. The request headers should include `Content-Type: application/json`.
The server returns a JSON response containing two fields: `job_id` (the UUID4 hex string returned at submission) and `prediction`.
```json
{
  "job_id": "0a98a981748c4b7eacfd5e0957905ced",
  "prediction": 0.983124
}
```
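A minimal sketch of the retrieval call, again using only the standard library. The helper names `result_url` and `fetch_result` are assumptions for illustration. Note that `urllib` raises `HTTPError` for any status outside the 2xx range, so the sketch catches it and returns the status code instead.

```python
import json
import urllib.error
import urllib.request


def result_url(base_url, job_id):
    """Build the GET /result/{job_id} URL."""
    return f"{base_url.rstrip('/')}/result/{job_id}"


def fetch_result(base_url, job_id):
    """Return (status_code, parsed_body_or_None) for a result query."""
    request = urllib.request.Request(
        result_url(base_url, job_id),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            status = response.status
            body = response.read()
    except urllib.error.HTTPError as error:
        # Statuses outside 2xx arrive here; report the code without a body.
        return error.code, None
    return status, (json.loads(body) if body else None)
```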
### 4.3. Error Handling
When querying for results, use the following guidelines based on the status code:
- **102 Processing**: The task is queued. Wait a few seconds before checking again.
- **202 Accepted**: The task is being processed. Wait a few seconds before checking again.
- **200 OK**: The task is complete. The result is available in the response.
- **404 Not Found**: The task ID is invalid. Check the ID or resubmit the task.
- **429 Too Many Requests**: The server is busy. Wait and try again later.
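The guidelines above can be expressed as a lookup from status code to client action. This is only a sketch: the `Action` enum and `next_action` function are hypothetical names, and treating any undocumented status code as an error is an assumption.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"          # check again in a few seconds
    DONE = "done"          # result is in the response
    RESUBMIT = "resubmit"  # check the job ID or submit again
    BACK_OFF = "back_off"  # server busy, try again later


# Status codes documented in the list above.
STATUS_ACTIONS = {
    102: Action.WAIT,
    202: Action.WAIT,
    200: Action.DONE,
    404: Action.RESUBMIT,
    429: Action.BACK_OFF,
}


def next_action(status_code):
    """Map a result-query status code to the suggested client action."""
    try:
        return STATUS_ACTIONS[status_code]
    except KeyError:
        raise ValueError(f"undocumented status code: {status_code}") from None
```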
### 4.4. Retry Strategy
* Use an exponential backoff strategy with a base of 3 when querying for results.
* An example of querying is available in [`test.py`](test.py).
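One way to read "exponential backoff with a base of 3" is that each wait is three times the previous one. The sketch below follows that reading; the 1-second first delay, the retry limit, and the names `backoff_delays` and `poll_result` are assumptions, and it is not taken from `test.py`. The `fetch` argument is any callable that returns `(status_code, body)` for a job ID.

```python
import time


def backoff_delays(base=3, max_retries=5, first_delay=1.0):
    """Yield first_delay * base**k for k = 0 .. max_retries - 1."""
    for attempt in range(max_retries):
        yield first_delay * base**attempt


def poll_result(fetch, job_id, base=3, max_retries=5, sleep=time.sleep):
    """Query until the job completes, waiting longer after each attempt.

    `fetch(job_id)` must return (status_code, body); it is injected so the
    HTTP layer can be swapped out (hypothetical interface).
    """
    for delay in backoff_delays(base, max_retries):
        status, body = fetch(job_id)
        if status == 200:
            return body["prediction"]
        if status == 404:
            raise LookupError(f"unknown job id: {job_id}")
        # 102, 202, and 429 all mean "not ready yet": wait, then retry.
        sleep(delay)
    raise TimeoutError(f"job {job_id} not finished after {max_retries} attempts")
```

With the defaults, the waits between attempts grow as 1, 3, 9, 27, and 81 seconds.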
## 5. Server Shutdown
To stop the server, use `Ctrl+C` in the terminal where the server is running.
## 6. License
This server is licensed under the Apache License 2.0.