https://github.com/jianzhnie/proteintransformer
ProteinTransformer is a toolkit using deep learning for protein function annotation
- Host: GitHub
- URL: https://github.com/jianzhnie/proteintransformer
- Owner: jianzhnie
- License: apache-2.0
- Created: 2022-04-15T09:50:49.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-14T08:47:30.000Z (over 2 years ago)
- Last Synced: 2025-02-14T11:53:05.313Z (9 months ago)
- Topics: biology, computing
- Language: Python
- Homepage:
- Size: 9.29 MB
- Stars: 3
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# ProteinTransformer
ProteinTransformer is a toolkit that uses deep learning for protein function annotation.
## Installation
The source code for ProteinTransformer is available from the GitHub repo.
Clone the public repository and install its dependencies:
```bash
# Clone the project and enter its directory
git clone https://github.com/jianzhnie/X-DeepGO.git
cd X-DeepGO
# Install dependencies
pip install -r requirements.txt
```
Once you have a copy of the source, you can install it with:
```bash
python setup.py install
```
## How to use
### Single GPU Training
```sh
python main.py \
--data_path ./protein \
--output-dir ./work_dir \
--lr 0.0001 \
--epochs 10 \
--batch-size 2 \
--log_wandb \
--workers 4
```
### Distributed Training
```sh
torchrun --nnodes=1 --nproc_per_node=2 --rdzv_id=0 main.py \
--data_path ./protein \
--output-dir ./work_dir \
--lr 0.0001 \
--epochs 10 \
--batch-size 2 \
--log_wandb \
--workers 4
```
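When a script is launched with `torchrun`, each worker process receives its position in the job through the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables. The sketch below shows one way a training script could read them, falling back to single-process defaults when they are absent. How `main.py` actually handles this is not shown here; `read_dist_info` and `global_batch_size` are hypothetical helper names, and treating `--batch-size` as a per-process value is an assumption.

```python
import os


def read_dist_info(env=None):
    """Read the variables torchrun sets; default to a single-process run."""
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),              # global index of this process
        "local_rank": int(env.get("LOCAL_RANK", 0)),  # index on this node (GPU to use)
        "world_size": int(env.get("WORLD_SIZE", 1)),  # total number of processes
    }


def global_batch_size(per_process_batch, world_size):
    """Assumption: the batch size flag applies to each process separately."""
    return per_process_batch * world_size


if __name__ == "__main__":
    info = read_dist_info()
    print(info, "global batch:", global_batch_size(2, info["world_size"]))
```

Under that assumption, the command above (`--nproc_per_node=2`, `--batch-size 2`) would process 4 samples per optimization step across both workers.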
### Evaluate
```sh
# Evaluate DIAMOND scores
python evaluate_diamondscore.py \
--train-data-file ./protein/train_data.pkl \
--test-data-file ./protein/test_data.pkl \
--diamond-scores-file ./protein/test_diamond.res \
--ontology-obo-file ./protein/go.obo \
--output_dir ./work_dir
# Evaluate the deep model's predictions
python evaluate_deepmodel.py \
--train-data-file ./protein/train_data.pkl \
--test-data-file ./protein/predictions.pkl \
--terms-file ./protein/terms.pkl \
--ontology-obo-file ./protein/go.obo \
--output_dir ./work_dir
```
### Inference
```sh
## inference
python inference_embedding.py \
--data_path ./protein \
--output-dir ./work_dir \
--resume ./work_dir/ProtLM_esm_embedding_mean/model_best.pth.tar \
--model esm_embedding \
--pool_mode mean \
--batch-size 128 \
--workers 4
```
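The `--pool_mode mean` flag suggests that per-residue embeddings are averaged into one fixed-size vector per protein. The repository's own pooling code is not shown here, so the following is only a minimal sketch of mean pooling under that assumption; `mean_pool` is a hypothetical helper, and the mask marks which positions are real residues rather than padding.

```python
def mean_pool(token_embeddings, mask):
    """Average the vectors at positions where mask is 1; padding is ignored."""
    if len(token_embeddings) != len(mask):
        raise ValueError("embeddings and mask must have the same length")
    kept = [vec for vec, keep in zip(token_embeddings, mask) if keep]
    if not kept:
        raise ValueError("mask selects no positions")
    dim = len(kept[0])
    # Average each embedding dimension over the kept positions.
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


if __name__ == "__main__":
    # Two real residues followed by one padding position.
    embeddings = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]
    mask = [1, 1, 0]
    print(mean_pool(embeddings, mask))  # [2.0, 3.0]
```

Masking matters because sequences in a batch are padded to a common length; averaging over padding positions would pull short proteins' vectors toward zero.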
### Extract Embeddings
```sh
python extract_embeddings.py \
--data_path ./protein \
--split "test" \
--batch-size 32
```
## License
This library is licensed under the Apache 2.0 License.
## Contributing to X-DeepGO
We are actively accepting code contributions to the X-DeepGO project. If you are interested in contributing to X-DeepGO, please contact me.