{"id":13774198,"url":"https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR","last_synced_at":"2025-05-11T06:32:27.683Z","repository":{"id":37412382,"uuid":"135313177","full_name":"TranslationalBioinformaticsIGTP/CNVbenchmarkeR","owner":"TranslationalBioinformaticsIGTP","description":"Framework to benchmark algorithms when detecting germline copy number variations (CNVs) from NGS data","archived":false,"fork":false,"pushed_at":"2020-11-13T11:30:36.000Z","size":70,"stargazers_count":13,"open_issues_count":1,"forks_count":3,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-02-15T09:33:32.047Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TranslationalBioinformaticsIGTP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-05-29T15:03:05.000Z","updated_at":"2022-06-21T08:48:05.000Z","dependencies_parsed_at":"2022-08-18T20:20:57.461Z","dependency_job_id":null,"html_url":"https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TranslationalBioinformaticsIGTP%2FCNVbenchmarkeR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TranslationalBioinformaticsIGTP%2FCNVbenchmarkeR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TranslationalBioinformaticsIGTP%2FCNVbenchmarkeR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TranslationalBioinformaticsIG
TP%2FCNVbenchmarkeR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TranslationalBioinformaticsIGTP","download_url":"https://codeload.github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225022020,"owners_count":17408539,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T17:01:24.574Z","updated_at":"2025-05-11T06:32:27.676Z","avatar_url":"https://github.com/TranslationalBioinformaticsIGTP.png","language":"R","funding_links":[],"categories":["Variant Callers"],"sub_categories":["CNV Callers"],"readme":"**NOTE**: New version supporting 12 tools (CNVbenchmarkeR2) can be found [here](https://github.com/jpuntomarcos/CNVbenchmarkeR2). \n\n# CNVbenchmarkeR #\n\nCNVbenchmarkeR is a framework to benchmark algorithms when detecting germline copy number variations (CNVs) against different NGS datasets. Current version supports DECoN, CoNVaDING, panelcn.MOPS, ExomeDepth and CODEX2 tools.\n\nIt is part of our [publication](https://www.nature.com/articles/s41431-020-0675-z) in which we performed a benchmark of germline CNV calling tools for targeted gene-panel data. Citation:\nMoreno-Cabrera, J.M., del Valle, J., Castellanos, E. et al. Evaluation of CNV detection tools for NGS panel data in genetic diagnostics. Eur J Hum Genet (2020). https://doi.org/10.1038/s41431-020-0675-z\n\n### Prerequisites ###\n\nAlgorithms have to be properly installed. 
Installation links for each algorithm:\n\n- https://github.com/bioinf-jku/panelcn.mops\n- https://molgenis.gitbooks.io/convading/\n- https://github.com/RahmanTeam/DECoN\n- https://github.com/yuchaojiang/CODEX2\n- https://cran.r-project.org/web/packages/ExomeDepth/index.html\n\n\n### How to use\n1. Get Code\n```\ngit clone https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR\n```\n\n2. **Configure algorithms.yaml** to set which algorithms will be benchmarked. To run DECoN, edit algorithms/decon/deconParams.yaml and set deconFolder to your DECoN installation folder. To run CoNVaDING, edit algorithms/convading/convadingParams.yaml and set the convadingFolder param.\n\n3. **Configure datasets.yaml** to define the datasets against which the algorithms will be executed. The files referenced in this file must follow the exact expected format (pay **special attention** to `validated_results_file` and `bed_file`, which are **tab-delimited** files). To do so, please **check the [examples](https://github.com/TranslationalBioinformaticsIGTP/CNVbenchmarkeR/tree/master/examples) folder**.\n\n\n4. Launch CNVbenchmarkeR\n```\ncd CNVbenchmarkeR\n./runBenchmark.sh\n```\n\n\n### Output ###\n\nA summary file and a .csv results file will be generated in the output/summary folder. Stats include sensitivity, specificity, no-call rate, precision (PPV), NPV, F1, MCC and kappa coefficient.\n\nStats are calculated per ROI, per gene and at whole strategy level (gene level including no-calls, i.e., low quality regions).\n\nLog files will be generated in the logs folder. Output for each algorithm and dataset will be generated in the output folder.\n\n\n### Troubleshooting ###\n\nTwo important checks to ensure that metrics are computed correctly:\n\n- The **sample names in the `validated_results_file` should match the file names of your bam files** (excluding the .bam extension). 
For example, if the `validated_results_file` contains sample names like mySample2312, your bam files should have file names like mySample2312.bam.\n- Use the same chromosome naming format everywhere; for example, do not mix \"chr5\" and \"5\" between your bed file and your `validated_results_file`.\n\n\n## Extra feature: optimizer ##\n\nAn optimizer is also included in the framework. It executes a CNV calling algorithm against a dataset with many different values for each param.\nUp to 22 values are evaluated for each param. It is implemented as a greedy algorithm that starts from each param in turn. The CNV algorithm will be executed a maximum of (n_params^2)\\*22 times. \n\nIt aims to improve sensitivity while allowing drops in specificity, as defined in optimizerParams.yaml.\n\n\n### Prerequisites ###\n\nAn SGE cluster system has to be available.\n\n### How to use\n\n1. Configure optimizers/optimizerParams.yaml by defining optimizer params, dataset and algorithm to be optimized. Note: it is recommended to optimize over a random subset (training subset) of the original dataset. Then, performance can be compared on the validation subset.\n2. Execute optimizer:\n```\ncd optimizers\nRscript optimizer.r optimizerParams.yaml\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTranslationalBioinformaticsIGTP%2FCNVbenchmarkeR","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FTranslationalBioinformaticsIGTP%2FCNVbenchmarkeR","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FTranslationalBioinformaticsIGTP%2FCNVbenchmarkeR/lists"}