https://github.com/oaklight/protein-score-server

a score inference server using protein folding models and FastAPI

alphafold3 async computational-biology esmfold protein-structure server

Host: GitHub
URL: https://github.com/oaklight/protein-score-server
Owner: Oaklight
License: apache-2.0
Created: 2024-09-04T22:54:21.000Z (almost 2 years ago)
Default Branch: master
Last Pushed: 2025-05-15T20:42:05.000Z (about 1 year ago)
Last Synced: 2025-09-18T14:13:28.453Z (10 months ago)
Topics: alphafold3, async, computational-biology, esmfold, protein-structure, server
Language: Python
Homepage: https://oaklight.github.io/protein-score-server/
Size: 74.7 MB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Protein Structure Score Server

## 1. Introduction

This server is a protein structure prediction tool. It processes prediction requests from users and returns various scores for protein sequences.

## 2. Installation

To install the environment, follow these steps:

```bash
git clone https://github.com/Oaklight/protein-score-server.git
cd protein-score-server
conda env create -f environment.yaml
conda activate esm
pip install -r requirements.txt
```

## 3. Server Configuration

**Configuration File**:
- Copy `server.yaml.sample` to `server.yaml`:

```bash
cp server.yaml.sample server.yaml
```

- Edit `server.yaml` with your settings.

The server reads its configuration from `server.yaml`. The configurable items are:

* `api_key`: API key for Hugging Face Hub login.
* `history_path`: Storage path for result history.
* `intermediate_pdb_path`: Storage path for intermediate PDB files.
* `model`: Model configuration.
  - `name`: Model name, either `esmfold` or `protenix` (ByteDance's AlphaFold 3 implementation).
  - `replica`: Mapping of GPU devices to replica counts, in `: ` format. For `esmfold`, use `_: ` instead.
* `task_queue_size`: Task queue size; defaults to 50.
* `timeout`: Timeout for asynchronous prediction result retrieval; defaults to 15 seconds.
* `backbone_pdb`: Reference structure lookup settings.
  - `reversed_index`: Path to the reverse index from PDB ID to PDB file path.
  - `parquet_prefix`: Path prefix for Parquet files.
  - `pdb_prefix`: Path prefix for PDB files.

For an example, see [`server.yaml`](./server.yaml).

Once the configuration is set, run these commands inside the project folder:

```bash
conda activate esm
uvicorn main:app --host 0.0.0.0 --port 8000
```

## 4. Usage

### 4.1. Request Prediction

Users can send `POST` requests to `http://your-host:8000/predict/` to get predictions. The request body comprises these fields: `seq`, `name`, `type`, `seq2`.

* `seq`: String, the protein sequence.
* `name`: String, the name of the reference protein.
* `type`: String, the task type. Currently supported: **"plddt", "tmscore", "sc-tmscore", "pdb"**.
* `seq2`: String, the sequence of the reference protein. **Used only for the `sc-tmscore` task. You may provide either `seq2` or `name`.**

1. **pLDDT**

```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "plddt"
}
```

2. **TMscore**

`name` is required for tasks that need a reference structure.

```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "name": "1a0a.A",
  "type": "tmscore"
}
```

3. **sc-TMscore**

Provide either `seq2`:

```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "seq2": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "sc-tmscore"
}
```

or `name`:

```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "name": "1a0a.A",
  "type": "sc-tmscore"
}
```

4. **pdb**

```json
{
  "seq": "MKRESHKHAEQARRNRLAVALHELASLIPAEWKQQNVSAAPSKATTVEAACRYIRHLQQNGST",
  "type": "pdb"
}
```

The server returns a JSON response containing two fields, `job_id` and `prediction`.

* `job_id`: String, the task ID (a UUID4 hex string).
* `prediction`: String; at this stage it only indicates that the prediction is being processed, so it is not very useful yet.

```json
{
  "job_id": "0a98a981748c4b7eacfd5e0957905ced",
  "prediction": ...
}
```
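As a rough illustration of the request format above, the following Python sketch assembles a request body and posts it to `/predict/` using only the standard library. The helper names `build_payload` and `submit` are hypothetical, and the client-side checks simply mirror the field rules described above; the server may enforce them differently.

```python
import json
import urllib.request

# Task types listed in the field description above.
TASK_TYPES = {"plddt", "tmscore", "sc-tmscore", "pdb"}


def build_payload(seq, task_type, name=None, seq2=None):
    """Assemble a request body for POST /predict/ (hypothetical helper)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    # Assumption: these checks mirror the README rules on the client side.
    if task_type == "tmscore" and name is None:
        raise ValueError("tmscore needs a reference protein name")
    if task_type == "sc-tmscore" and name is None and seq2 is None:
        raise ValueError("sc-tmscore needs either name or seq2")
    payload = {"seq": seq, "type": task_type}
    if name is not None:
        payload["name"] = name
    if seq2 is not None:
        payload["seq2"] = seq2
    return payload


def submit(base_url, payload):
    """POST the payload and return the job_id from the JSON response."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/predict/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["job_id"]
```

With a server running, `submit("http://your-host:8000", build_payload(seq, "plddt"))` would return the job ID to use when retrieving the result.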

### 4.2. Result Retrieval

Users can send `GET` requests to `http://your-host:8000/result/{job_id}` to retrieve prediction results. The request header should contain `Content-Type: application/json`.

The server returns a JSON response containing two fields, `job_id` (a UUID4 hex string) and `prediction`.

```json
{
  "job_id": "0a98a981748c4b7eacfd5e0957905ced",
  "prediction": 0.983124
}
```
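A single retrieval attempt might look like the sketch below, again using only the Python standard library. The helper names `result_url` and `fetch_result` are hypothetical. The sketch assumes the body is parsed only on `200`, since the body returned for other status codes is not documented here.

```python
import json
import urllib.error
import urllib.request


def result_url(base_url, job_id):
    """Build the GET /result/{job_id} URL (hypothetical helper)."""
    return f"{base_url.rstrip('/')}/result/{job_id}"


def fetch_result(base_url, job_id):
    """Return (status_code, body); body is parsed JSON on 200, else None."""
    request = urllib.request.Request(
        result_url(base_url, job_id),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(request) as response:
            status = response.status
            # Assumption: only a 200 response carries the final prediction.
            body = json.load(response) if status == 200 else None
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for codes outside 2xx, such as 404 or 429.
        return error.code, None
    return status, body
```

Returning the status code alongside the body lets the caller decide whether to wait, retry, or resubmit.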

### 4.3. Error Handling

When querying for results, use the following guidelines based on the status code:
- **102 Processing**: The task is queued. Wait a few seconds before checking again.
- **202 Accepted**: The task is being processed. Wait a few seconds before checking again.
- **200 OK**: The task is complete. The result is available in the response.
- **404 Not Found**: The task ID is invalid. Check the ID or resubmit the task.
- **429 Too Many Requests**: The server is busy. Wait and try again later.
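The guidelines above can be condensed into a small lookup, sketched here in Python. The function name `next_action` and the returned labels are illustrative only, and treating any unlisted status code as an error is an assumption.

```python
# Status codes that mean "not ready yet, check again later".
WAIT_CODES = {102, 202, 429}


def next_action(status_code):
    """Map a result-query status code to a client action (hypothetical helper)."""
    if status_code == 200:
        return "done"      # result is available in the response
    if status_code in WAIT_CODES:
        return "wait"      # queued, processing, or server busy
    if status_code == 404:
        return "resubmit"  # invalid task ID
    return "error"         # assumption: anything else is unexpected
```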

### 4.4. Retry Strategy

* We recommend exponential backoff with a base of 3 when querying for results.
* An example query is available in [`test.py`](test.py).
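One way to read "exponential backoff with a base of 3" is to wait 1, 3, 9, 27, ... seconds between attempts. The Python sketch below follows that reading; it is not the code in `test.py`. The names `backoff_delays` and `poll`, the attempt limit, and the 60-second cap are all assumptions. `fetch` is any callable that returns a `(status_code, body)` pair.

```python
import time


def backoff_delays(base=3.0, max_attempts=5, cap=60.0):
    """Yield waits of base**0, base**1, ... seconds, each capped at `cap`."""
    for attempt in range(max_attempts):
        yield min(cap, base ** attempt)


def poll(fetch, delays, sleep=time.sleep):
    """Call fetch() until it reports 200, sleeping between attempts."""
    for delay in delays:
        status, body = fetch()
        if status == 200:
            return body
        if status == 404:
            raise LookupError("unknown job ID; check it or resubmit the task")
        # 102, 202, and 429 all mean "try again later".
        sleep(delay)
    raise TimeoutError("no result after the final retry")
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.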

## 5. Server Shutdown

To stop the server, use `Ctrl+C` in the terminal where the server is running.

## 6. License

This server is licensed under the Apache License 2.0.
