{"id":43977686,"url":"https://github.com/bzhanglab/deeprescore","last_synced_at":"2026-02-07T08:34:03.621Z","repository":{"id":45461244,"uuid":"236129121","full_name":"bzhanglab/DeepRescore","owner":"bzhanglab","description":"DeepRescore: rescore PSMs leveraging deep learning-derived peptide features","archived":false,"fork":false,"pushed_at":"2024-05-10T20:35:04.000Z","size":81104,"stargazers_count":7,"open_issues_count":4,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-09-09T20:57:50.971Z","etag":null,"topics":["deep-learning","machine-learning","msms","peptide-identification","proteomics"],"latest_commit_sha":null,"homepage":"","language":"Nextflow","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bzhanglab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-25T05:18:54.000Z","updated_at":"2025-05-14T12:26:18.000Z","dependencies_parsed_at":"2022-07-14T14:17:19.832Z","dependency_job_id":null,"html_url":"https://github.com/bzhanglab/DeepRescore","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bzhanglab/DeepRescore","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bzhanglab%2FDeepRescore","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bzhanglab%2FDeepRescore/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bzhanglab%2FDeepRescore/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bzhanglab%2FDeepRescore/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bzhanglab","download_url
":"https://codeload.github.com/bzhanglab/DeepRescore/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bzhanglab%2FDeepRescore/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29190258,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-07T07:37:03.739Z","status":"ssl_error","status_checked_at":"2026-02-07T07:37:03.029Z","response_time":63,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","machine-learning","msms","peptide-identification","proteomics"],"created_at":"2026-02-07T08:34:02.975Z","updated_at":"2026-02-07T08:34:03.615Z","avatar_url":"https://github.com/bzhanglab.png","language":"Nextflow","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [DeepRescore](https://doi.org/10.1002/pmic.201900334)\n## Overview\n**DeepRescore** is an immunopeptidomics data analysis tool that leverages deep learning-derived peptide features to rescore peptide-spectrum matches (PSMs). DeepRescore takes as input MS/MS data in MGF format and identification results from a search engine. 
The current version supports four search engines: [MS-GF+](https://github.com/MSGFPlus/msgfplus), [Comet](http://comet-ms.sourceforge.net/), [X!Tandem](https://www.thegpm.org/TANDEM/), and [MaxQuant](https://maxquant.org/).\n\n## Installation\n1. Download DeepRescore:\n```sh\ngit clone https://github.com/bzhanglab/DeepRescore\n```\n2. Install [Docker](https://docs.docker.com/install/) (\u003e=19.03).\n\n3. Install [Nextflow](https://www.nextflow.io/docs/latest/getstarted.html). More information can be found on the Nextflow [get started](https://www.nextflow.io/docs/latest/getstarted.html) page.\n\n4. Install [nvidia-docker](https://github.com/NVIDIA/nvidia-docker) (\u003e=2.2.2) for [**AutoRT**](https://github.com/bzhanglab/AutoRT/)  and [**pDeep2**](https://github.com/pFindStudio/pDeep/tree/master/pDeep2) by following the instructions at [https://github.com/NVIDIA/nvidia-docker](https://github.com/NVIDIA/nvidia-docker). **Please note that a GPU is required to run DeepRescore**.\n\nAll other tools used by DeepRescore have been dockerized and will be installed automatically the first time DeepRescore is run on a computer. DeepRescore has been tested on Linux.\n\n## Usage\n\n```sh\n○ → nextflow run DeepRescore.nf --help\nN E X T F L O W  ~  version 19.10.0\nLaunching `deeprescore.nf` [special_hamilton] - revision: 2817bc64da\n=========================================\nDeepRescore =\u003e Rescore PSMs\n=========================================\nUsage:\nnextflow run DeepRescore.nf\nArguments:\n  --id_file              Identification result.\n  --ms_file              MS/MS data in MGF format. If the search engine is MaxQuant, this parameter is not useful.\n  --se                   The name of search engine, msgf:MS-GF+, xtandem:X!Tandem, comet:Comet or maxquant:MaxQuant.\n                         Default is \"msgf\" (MS-GF+).\n  --ms_instrument        The MS instrument used to generate the MS/MS data. 
\n                         This is used by pDeep2 for MS/MS spectrum prediction. Default is \"Lumos\".\n  --ms_energy            The energy used in MS/MS data generation. \n                         This is used by pDeep2 for MS/MS spectrum prediction. Default is 0.34.\n  --out_dir              Output folder, default is \"./output\"\n  --prefix               The prefix of output file(s).\n  --decoy_prefix         The prefix of decoy proteins. Default is \"XXX_\".\n  --cpu                  The number of CPUs\n  --mem                  The memory for processing the data, default is 8. The unit is G.\n  --help                 Print help message\n\n```\n\n### Input\nThe main inputs to DeepRescore are the identification result from one of the four supported search engines (MS-GF+, X!Tandem, Comet and MaxQuant) and the MS/MS data used for searching. If the search engine is MaxQuant, the MS/MS data is not needed because it is included in the MaxQuant search result (the ``combined`` folder; **mqpar.xml** must also be present in the ``combined`` folder). The table below shows the search result format and MS/MS data format supported for each search engine. For MS-GF+, X!Tandem or Comet, raw MS/MS data must be converted to MGF format using [ProteoWizard](http://www.proteowizard.org/). Multiple MGF files (different fractions) from the same sample or the same TMT/iTRAQ experiment should be combined into one MGF file. **Only oxidation of M is supported as a variable modification**. 
Please note that if DeepRescore is used to rescore MaxQuant results, the FDR cutoff should be set to 100% when performing the MaxQuant search; otherwise target PSMs may be filtered out by MaxQuant's FDR calculation before DeepRescore rescores them.\n\nFor MGF file conversion, we recommend using the following command line:\n\n```sh\nmsconvert --filter \"peakPicking true 1-2\" --mgf *.raw\n```\n\n| Search engine | Identification format | MS/MS data format |\n|---|---|---|\n| Comet | .pepxml | MGF |\n| MS-GF+ | .mzid | MGF |\n| X!Tandem | .xml | MGF |\n| MaxQuant | /combined/ | - |\n\nBelow is an example:\n```sh\nnextflow run DeepRescore.nf --id_file ./example_data/A1101.pep.xml \\\n\t--ms_file ./example_data/A1101.mgf \\\n\t--se comet \\\n\t--ms_instrument Lumos \\\n\t--ms_energy 0.34 \\\n\t--out_dir out \\\n\t--prefix d2 \\\n\t--decoy_prefix XXX_ \\\n\t--cpu 4 \\\n\t--mem 8\n```\nThe example took about one and a half hours to run on a Linux server (12 threads, 64 GB RAM, GPU: TITAN Xp). The example data can be downloaded through this link: [test_data](http://pdv.zhang-lab.org/data/download/deeprescore/example_data.tar.gz).\n\n### Output\nThe final output can be found in the folder `out_dir/DeepRescore_results`, where `out_dir` is the output directory specified through the `--out_dir` parameter. There are two files in this folder: `*_psm_final.tsv` and `*_pep_final.tsv`. The first contains results with FDR controlled at the PSM level and the second contains results with FDR controlled at the peptide level. Below is an example of `*_psm_final.tsv`. The format of `*_pep_final.tsv` is the same as that of `*_psm_final.tsv`. Users can filter the results based on the column `q_value` (for example, q_value \u003c= 0.01). 
The result files (`*_psm_final.tsv` or `*_pep_final.tsv` + `the MS/MS data in MGF format`) can be imported into [**PDV**](https://github.com/wenbostar/PDV) for visualization.\n\n| spectrum_title                                           | Percolator_score | q_value     | modification                  | Mod_Sequence             | Label | RT     | Mass             | Abs_Mass_Error | Ln_Total_Intensity | Match_Ions_Intensity | Rel_Match_Ions_Intensity | Max_Match_Ion_Intensity | Score  | Pep         | Delta_Score | charge | peptide                  | Proteins              | Delta_RT          | SA                | mz               |\n|----------------------------------------------------------|------------------|-------------|-------------------------------|--------------------------|-------|--------|------------------|----------------|--------------------|----------------------|--------------------------|-------------------------|--------|-------------|-------------|--------|--------------------------|-----------------------|-------------------|-------------------|------------------|\n| YE_20180517_SK_HLA_A1101_3Ips_a50mio_R1_02.25098.25098.2 | 1.43274          | 5.43478e-05 | Carbamidomethyl of C@23[0.0]; | QVADEGDALVAGGVSQTPSYLSCK | 1     | 59.187 | 2451.16256793088 | 0              | 14.0435918544821   | 11.9698824268126     | 0.125718571751381        | 24026.38671875          | 339.58 | 4.3576e-114 | 283.38      | 2      | QVADEGDALVAGGVSQTPSYLSCK | uc003kfu.4            | 0.130932000000001 | 0.775038384865853 | 1226.58908396544 |\n| YE_20180517_SK_HLA_A1101_3IPs_a50mio_R1_01.21936.21936.3 | 1.36464          | 5.43478e-05 | -                             | PLFVNVNDQTNEGIMHESK      | 1     | 52.926 | 2171.03328724    | 0              | 16.183782828437    | 15.7096998670651     | 0.622455610654202        | 520227.46875            | 268.55 | 1.5772e-35  | 235.24      | 3      | PLFVNVNDQTNEGIMHESK      | uc010fur.3;uc002vee.4 | 0.23695           | 0.669018716842519 | 
724.685562413335 |\n| YE_20180517_SK_HLA_A1101_3Ips_a50mio_R2_01.21952.21952.3 | 1.34864          | 5.43478e-05 | -                             | PLFVNVNDQTNEGIMHESK      | 1     | 52.495 | 2171.03151690128 | 0              | 14.6959014960537   | 14.1948714045837     | 0.605906199334659        | 119562.108398438        | 284.15 | 1.4821e-46  | 248.85      | 3      | PLFVNVNDQTNEGIMHESK      | uc010fur.3;uc002vee.4 | 0.855877999999997 | 0.677688435645182 | 724.684972300427 |\n| AC20171011_Broad_HLA_A1101_R1_Rep01.3055.3055.3          | 1.32707          | 5.43478e-05 | -                             | RTLDAKMPRK               | 1     | 11.999 | 1214.69196592591 | 0              | 15.3894885906454   | 14.7015468520804     | 0.502609506923564        | 815314.75               | 201.84 | 0.0048553   | 201.84      | 3      | RTLDAKMPRK               | uc003lvo.4;uc021ygh.2 | 0.168488          | 0.884456468503026 | 405.905121975303 |\n| YE_20180517_SK_HLA_A1101_3IPs_a50mio_R1_01.19078.19078.2 | 1.29334          | 5.43478e-05 | -                             | GILAADESVGTMGNR          | 1     | 46.712 | 1489.71904591213 | 0              | 15.0714146779268   | 14.7645814265636     | 0.735773279870115        | 423094.538085938        | 305.7  | 1.1819e-36  | 222.7       | 2      | GILAADESVGTMGNR          | uc004bbk.2            | 0.180505999999994 | 0.862802817911802 | 745.867322956064 |\n\n\n\n## How to cite:\n\nKai Li, Antrix Jain, Anna Malovannaya, Bo Wen, Bing Zhang (2020), **DeepRescore: Leveraging Deep Learning to Improve Peptide Identification in Immunopeptidomics**. *Proteomics*. [doi:10.1002/pmic.201900334](https://doi.org/10.1002/pmic.201900334)\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbzhanglab%2Fdeeprescore","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbzhanglab%2Fdeeprescore","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbzhanglab%2Fdeeprescore/lists"}