{"id":19787922,"url":"https://github.com/drostlab/rdiamond","last_synced_at":"2026-03-01T16:31:05.350Z","repository":{"id":83417365,"uuid":"253858477","full_name":"drostlab/rdiamond","owner":"drostlab","description":"Running DIAMOND2 through R","archived":false,"fork":false,"pushed_at":"2023-10-17T13:29:26.000Z","size":643,"stargazers_count":7,"open_issues_count":0,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-06-26T05:06:09.189Z","etag":null,"topics":["aligner","diamond","sequence","sequence-alignment"],"latest_commit_sha":null,"homepage":"https://drostlab.github.io/rdiamond/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/drostlab.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-07T16:59:59.000Z","updated_at":"2025-02-18T14:52:20.000Z","dependencies_parsed_at":"2023-10-15T08:48:06.223Z","dependency_job_id":"b7da9a8f-8c90-485e-89e9-1b850b9e74dc","html_url":"https://github.com/drostlab/rdiamond","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/drostlab/rdiamond","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drostlab%2Frdiamond","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drostlab%2Frdiamond/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drostlab%2Frdiamond/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drostlab%2Frdiamond/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/drostlab","download_url":"https://codeload.github.com/drostlab/rdiamond/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/drostlab%2Frdiamond/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29974745,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-01T16:18:32.386Z","status":"ssl_error","status_checked_at":"2026-03-01T16:18:04.258Z","response_time":124,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aligner","diamond","sequence","sequence-alignment"],"created_at":"2024-11-12T06:25:07.372Z","updated_at":"2026-03-01T16:31:05.028Z","avatar_url":"https://github.com/drostlab.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rdiamond \u003cimg src=\"man/figures/logo.png\" align=\"right\" height=\"174\" width=\"150\" /\u003e\n\n![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Fdrostlab%2Frdiamond\u0026label=VISITORS\u0026countColor=%23263759\u0026style=flat)\n\n## Seamless Integration of [DIAMOND2](https://github.com/bbuchfink/diamond)  Sequence Searches in R\n\n### Motivation \n\nWe are excited to introduce [DIAMOND2](https://www.nature.com/articles/s41592-021-01101-x), a cutting-edge pairwise protein aligner tailored to meet the 
extensive demands of the [Earth BioGenome Project](https://www.earthbiogenome.org/) and other expansive genomics initiatives. [DIAMOND2](https://github.com/bbuchfink/diamond) is a groundbreaking software solution designed to accelerate `BLAST` searches by a factor of up to 10,000x. To offer researchers even more flexibility and integration, we provide `rdiamond`, a dedicated interface package that allows programmatic handling of [DIAMOND2](https://github.com/bbuchfink/diamond) sequence searches directly through R.\n\nThe `rdiamond` package offers streamlined interface functions, enabling users to seamlessly run [DIAMOND2](https://github.com/bbuchfink/diamond) directly within R. Notably, it's designed to handle vast outputs, processing terabytes of [DIAMOND2](https://github.com/bbuchfink/diamond) hit files directly from the disk on a local machine, bypassing memory limitations.\n\nFurthermore, when paired with the [biomartr](https://github.com/ropensci/biomartr) R package, users have the convenience of automatically fetching large-scale genomic data and subsequently searching through it using `rdiamond`.\n\n### Install `rdiamond`\n\n### For Linux Users:\n\nPlease install the `libpq-dev` library on your Linux machine by typing into the terminal:\n\n```\nsudo apt-get install libpq-dev\n```\n\n### For all systems install `rdiamond` by typing\n\n```r\n# install Bioconductor\nif (!requireNamespace(\"BiocManager\", quietly = TRUE))\n    install.packages(\"BiocManager\")\nBiocManager::install()\n\n# install Biostrings -\u003e see here for different Biostrings versions:\n# http://bioconductor.org/about/release-announcements/\nBiocManager::install(c(\"Biostrings\"))\n\n# install.packages(\"devtools\")\n# install the current version of rdiamond on your system\ndevtools::install_github(\"drostlab/rdiamond\", build_vignettes = TRUE, dependencies = 
TRUE)\n```\n\n### Citation\n\nThis R package is not formally published yet, but please cite the following paper when using this software for your research:\n\n\u003e Buchfink B, Reuter K, Drost HG, \"Sensitive protein alignments at tree-of-life scale using DIAMOND\", Nature Methods 18, 366–368 (2021). [doi:10.1038/s41592-021-01101-x](https://www.nature.com/articles/s41592-021-01101-x)\n\n\n\n### Quick start\n \n```r\n# run diamond assuming that the diamond executable is available\n# via the system path ('diamond_exec_path = NULL') and using\n# sensitivity_mode = \"ultra-sensitive\"\ndiamond_example \u003c- rdiamond::diamond_protein_to_protein(\n              query   = system.file('seqs/qry_aa.fa', package = 'rdiamond'),\n              subject = system.file('seqs/sbj_aa.fa', package = 'rdiamond'),\n              sensitivity_mode = \"ultra-sensitive\",\n              output_path = tempdir(),\n              use_arrow_duckdb_connection  = FALSE)\n\n# look at DIAMOND results\ndiamond_example\n```\n\n```\nRunning diamond with 'diamond blastp  with query: /library/rdiamond/seqs/qry_aa.fa and subject: /library/rdiamond/seqs/sbj_aa.fa using 2 core(s) ...\n\n\nAlignment Sensitivity Mode:  --ultra-sensitive and max-target-seqs: 500\n\n\nMasking of low-complexity regions: TANTAN\n\n\ndiamond v2.0.4.142 (C) Max Planck Society for the Advancement of Science\nDocumentation, support and updates available at http://www.diamondsearch.org\n\n#CPU threads: 4\nScoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)\nDatabase input file: /library/rdiamond/seqs/sbj_aa.fa\nOpening the database file...  [0.001s]\nLoading sequences...  [0s]\nMasking sequences...  [0.003s]\nWriting sequences...  [0s]\nHashing sequences...  [0s]\nLoading sequences...  [0s]\nWriting trailer...  [0s]\nClosing the input file...  [0s]\nClosing the database file...  
[0s]\nDatabase hash = e7cd8e84df51b22dd27f3c01d5765fe1\nProcessed 20 sequences, 9444 letters.\nTotal time = 0.006s\ndiamond v2.0.4.142 (C) Max Planck Society for the Advancement of Science\nDocumentation, support and updates available at http://www.diamondsearch.org\n\n#CPU threads: 2\nScoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)\nTemporary directory: /var/folders/yn/mgwl8_b56hz4v2c2vlfxj07w00076j/T//RtmpUCSsbL\nOpening the database...  [0.002s]\n#Target sequences to report alignments for: 500\nReference = /var/folders/yn/mgwl8_b56hz4v2c2vlfxj07w00076j/T//RtmpUCSsbL/sbj_aa.fa.dmnd\nSequences = 20\nLetters = 9444\nBlock size = 400000000\nOpening the input file...  [0s]\nOpening the output file...  [0s]\nLoading query sequences...  [0s]\nMasking queries...  [0.002s]\nBuilding query seed set... Algorithm: Double-indexed\n [0s]\nBuilding query histograms...  [0.008s]\nAllocating buffers...  [0s]\nLoading reference sequences...  [0s]\nMasking reference...  [0.001s]\nInitializing temporary storage...  [0.013s]\nBuilding reference histograms...  [0.007s]\nAllocating buffers...  [0s]\nProcessing query block 1, reference block 1/1, shape 1/64.\nBuilding reference seed array...  [0s]\nBuilding query seed array...  [0s]\nComputing hash join...  [0.001s]\nBuilding seed filter...  [0.001s]\nSearching alignments...  [0.002s]\n\n... \n\n\nBuilding reference seed array...  [0s]\nBuilding query seed array...  [0s]\nComputing hash join...  [0s]\nBuilding seed filter...  [0.033s]\nSearching alignments...  [0.001s]\nDeallocating buffers...  [0s]\nClearing query masking...  [0s]\nComputing alignments...  [0.065s]\nDeallocating reference...  [0s]\nLoading reference sequences...  [0s]\nDeallocating buffers...  [0s]\nDeallocating queries...  [0s]\nLoading query sequences...  [0s]\nClosing the input file...  [0s]\nClosing the output file...  [0s]\nClosing the database file...  [0s]\nDeallocating taxonomy...  
[0s]\nTotal time = 0.797s\nReported 20 pairwise alignments, 20 HSPs.\n20 queries aligned.\n\n\nDIAMOND search finished successfully! \n\n\nThe DIAMOND output file was imported into the running R session. The DIAMOND output file has been stored at: /var/folders/RtmpUCSsbL/qry_aa_sbj_aa_blastp_eval_0.001.blast_tbl\n```\n\n```\nA tibble: 20 x 20\n   query_id subject_id perc_identity num_ident_match…\n   \u003cchr\u003e    \u003cchr\u003e              \u003cdbl\u003e            \u003cint\u003e\n 1 333554|… AT1G01010…          73.2              347\n 2 470181|… AT1G01020…          91.1              224\n 3 470180|… AT1G01030…          93.3              335\n 4 333551|… AT1G01040…          93.4             1840\n 5 909874|… AT1G01050…         100                213\n 6 470177|… AT1G01060…          87.5              567\n 7 918864|… AT1G01070…          92.6              339\n 8 909871|… AT1G01080…          89.3              268\n 9 470171|… AT1G01090…          96.8              420\n10 333544|… AT1G01110…          87.7              463\n11 918858|… AT1G01120…          99.2              525\n12 470161|… AT1G01140…          98.5              446\n13 918855|… AT1G01150…          72.6              207\n14 918854|… AT1G01160…          78.8              141\n15 311317|… AT1G01170…          85.6               83\n16 909860|… AT1G01180…          92.6              287\n17 311315|… AT1G01190…          94.2              502\n18 470156|… AT1G01200…          95.8              228\n19 311313|… AT1G01210…          95.3              102\n20 470155|… AT1G01220…          96.5             1019\n# … with 16 more variables: alig_length \u003cint\u003e,\n#   mismatches \u003cint\u003e, gap_openings \u003cint\u003e, n_gaps \u003cint\u003e,\n#   pos_match \u003cint\u003e, ppos \u003cdbl\u003e, q_start \u003cint\u003e,\n#   q_end \u003cint\u003e, q_len \u003cint\u003e, qcovhsp \u003cdbl\u003e, s_start \u003cint\u003e,\n#   s_end \u003cint\u003e, s_len \u003cint\u003e, evalue \u003cdbl\u003e,\n#   
bit_score \u003cdbl\u003e, score_raw \u003cdbl\u003e\n```\n\nor \n\n```r\ndplyr::glimpse(diamond_example)\n```\n\n```\nRows: 20\nColumns: 20\n$ query_id          \u003cchr\u003e \"333554|PACid:16033839\", \"470181|PACid:16064328\", \"470180|PAC…\n$ subject_id        \u003cchr\u003e \"AT1G01010.1\", \"AT1G01020.1\", \"AT1G01030.1\", \"AT1G01040.1\", \"…\n$ perc_identity     \u003cdbl\u003e 73.2, 91.1, 93.3, 93.4, 100.0, 87.5, 92.6, 89.3, 96.8, 87.7, …\n$ num_ident_matches \u003cint\u003e 347, 224, 335, 1840, 213, 567, 339, 268, 420, 463, 525, 446, …\n$ alig_length       \u003cint\u003e 474, 246, 359, 1969, 213, 648, 366, 300, 434, 528, 529, 453, …\n$ mismatches        \u003cint\u003e 75, 22, 20, 58, 0, 71, 23, 25, 8, 65, 4, 6, 68, 30, 0, 20, 30…\n$ gap_openings      \u003cint\u003e 8, 0, 2, 7, 0, 5, 2, 2, 3, 0, 0, 1, 3, 2, 1, 1, 1, 0, 0, 0\n$ n_gaps            \u003cint\u003e 52, 0, 4, 71, 0, 10, 4, 7, 6, 0, 0, 1, 10, 8, 14, 3, 1, 0, 0,…\n$ pos_match         \u003cint\u003e 369, 231, 338, 1870, 213, 587, 342, 275, 425, 475, 527, 448, …\n$ ppos              \u003cdbl\u003e 77.8, 93.9, 94.2, 95.0, 100.0, 90.6, 93.4, 91.7, 97.9, 90.0, …\n$ q_start           \u003cint\u003e 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 16, 2, 4, 1, 6, 1, 1, 1\n$ q_end             \u003cint\u003e 466, 246, 355, 1963, 213, 640, 362, 299, 433, 528, 529, 453, …\n$ q_len             \u003cint\u003e 466, 246, 355, 1963, 213, 640, 362, 299, 433, 528, 529, 453, …\n$ qcovhsp           \u003cdbl\u003e 100.0, 100.0, 100.0, 99.9, 100.0, 100.0, 100.0, 100.0, 100.0,…\n$ s_start           \u003cint\u003e 1, 1, 1, 6, 1, 1, 1, 1, 1, 1, 1, 1, 5, 4, 2, 1, 1, 1, 1, 1\n$ s_end             \u003cint\u003e 430, 246, 359, 1910, 213, 646, 366, 294, 429, 528, 529, 452, …\n$ s_len             \u003cint\u003e 430, 246, 359, 1910, 213, 646, 366, 294, 429, 528, 529, 452, …\n$ evalue            \u003cdbl\u003e 1.83e-212, 2.79e-157, 4.66e-209, 0.00e+00, 4.51e-158, 0.00e+0…\n$ bit_score         \u003cdbl\u003e 584, 428, 568, 
3541, 427, 1041, 613, 499, 816, 841, 1029, 866…\n$ score_raw         \u003cdbl\u003e 1506, 1100, 1464, 9181, 1098, 2691, 1581, 1284, 2108, 2172, 2…\n```\n\n## Discussions and Bug Reports\n\nI would be very happy to learn more about potential improvements of the concepts and functions provided in this package.\n\nFurthermore, in case you find some bugs, need additional (more flexible) functionality of parts of this package, or want to contribute to this project please let me know:\n\nhttps://github.com/drostlab/rdiamond/issues\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrostlab%2Frdiamond","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdrostlab%2Frdiamond","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdrostlab%2Frdiamond/lists"}