{"id":44964495,"url":"https://github.com/bioinfomachinelearning/transpro","last_synced_at":"2026-02-18T14:09:41.675Z","repository":{"id":104968779,"uuid":"501879985","full_name":"BioinfoMachineLearning/TransPro","owner":"BioinfoMachineLearning","description":"1D transformer for predicting protein structural features (secondary structure, solvent accessibility)","archived":false,"fork":false,"pushed_at":"2022-07-12T23:18:53.000Z","size":35028,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-09-09T16:34:17.743Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BioinfoMachineLearning.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-06-10T02:55:16.000Z","updated_at":"2023-01-16T07:45:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"81b66099-6f02-426e-9979-907c18a228d0","html_url":"https://github.com/BioinfoMachineLearning/TransPro","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/BioinfoMachineLearning/TransPro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioinfoMachineLearning%2FTransPro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioinfoMachineLearning%2FTransPro/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositori
es/BioinfoMachineLearning%2FTransPro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioinfoMachineLearning%2FTransPro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BioinfoMachineLearning","download_url":"https://codeload.github.com/BioinfoMachineLearning/TransPro/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BioinfoMachineLearning%2FTransPro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29581626,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T13:56:48.962Z","status":"ssl_error","status_checked_at":"2026-02-18T13:54:34.145Z","response_time":162,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-18T14:09:40.697Z","updated_at":"2026-02-18T14:09:41.670Z","avatar_url":"https://github.com/BioinfoMachineLearning.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n# TransPross: 1D transformer for protein secondary structure prediction\n\n![TransPross Architecture](https://github.com/BioinfoMachineLearning/TransPro/blob/main/img/TransPross_Architecture.png)\n\n\u003c/div\u003e\n\n## Description\n1D transformer for predicting protein structural features (secondary structure)\n\n\n## Installation\n```bash\ngit clone 
https://github.com/BioinfoMachineLearning/TransPro.git\ncd TransPro\nmkdir env\npython3.6 -m venv env/ss_virenv\nsource env/ss_virenv/bin/activate\npip install --upgrade pip\npip install -r requirments.txt\n```\n\n## Training data\nThe training protein targets were extracted from the Protein Data Bank (PDB) before May 2019 with sequence identity \u003c 90%. The sequence length range is [50, 500].\n\nAll the data required for training are listed below and available at [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6762376.svg)](https://doi.org/10.5281/zenodo.6762376):\n* Protein sequences in fasta file (fasta.tar.gz)\n* Target id list for training\n* MSA in a3m file (a3m.tar.gz is too large, stored at /bml/TransPro/a3m.tar.gz)\n* True ss labels in 3 states (ss_3.tar.gz)\n* True 3D structures in pdb file (atom.tar.gz)\n* 5 trained TransPross models (model.tar.gz)\n\n## Testing data\nAll the testing data for evaluation are listed below:\n* CASP test sets (CASP13, CASP14)\n\n## Training\n```bash\npython MSA_transformer2_train.py --model_num 1 --N 6 --max_positions 1500 --BATCH_SIZE 5 --data_dir \u003ctrain\u003e --dataset \u003ccustom\u003e\n\nmodel_num: training list model\nN: number of attention layers\nmax_positions: maximum number of sequences allowed in the input MSA\nBATCH_SIZE: batch size\ndata_dir: folder path for storing data\ndataset: training set name\n```\n## Inference\n**Predicting with a single a3m file as the input:**\n```bash\npython MSA_transformer2_predict_batch.py -i \u003ca3m_file\u003e\ne.g. 
python MSA_transformer2_predict_batch.py -i T1026.a3m\n```\n\n**Predicting multiple targets at once:**\n```bash\npython MSA_transformer2_predict_batch.py --data_dir \u003ctest\u003e --dataset \u003ccasp13\u003e\n\nTo predict multiple targets, create a test.lst file at /data_dir/dataset/test.lst in the format: \u003ctarget_id\u003e length\ne.g. test/casp13/test.lst\n\ndata_dir: folder path for storing data\ndataset: testing set name\n```\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbioinfomachinelearning%2Ftranspro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbioinfomachinelearning%2Ftranspro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbioinfomachinelearning%2Ftranspro/lists"}