{"id":24182993,"url":"https://github.com/ms609/treedist","last_synced_at":"2026-05-11T09:56:24.924Z","repository":{"id":43058412,"uuid":"196188301","full_name":"ms609/TreeDist","owner":"ms609","description":"Calculate distances between phylogenetic trees in R","archived":false,"fork":false,"pushed_at":"2025-02-06T16:44:42.000Z","size":64683,"stargazers_count":32,"open_issues_count":24,"forks_count":6,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-02T02:38:28.541Z","etag":null,"topics":["phylogenetic-trees","r","r-package","rstats","tree-distances","trees"],"latest_commit_sha":null,"homepage":"https://ms609.github.io/TreeDist/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ms609.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":"CONTRIBUTING.md","funding":null,"license":null,"code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":"codemeta.json"}},"created_at":"2019-07-10T10:55:31.000Z","updated_at":"2025-02-06T16:41:19.000Z","dependencies_parsed_at":"2024-01-05T14:30:46.308Z","dependency_job_id":"c1a77f80-b744-4f77-b71d-19b9a3c37648","html_url":"https://github.com/ms609/TreeDist","commit_stats":{"total_commits":2101,"total_committers":5,"mean_commits":420.2,"dds":"0.13755354593050928","last_synced_commit":"2d1ba57385a00c141d47908e953bb87d1af5d46e"},"previous_names":[],"tags_count":32,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ms609%2FTreeDist","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ms609%2FTreeDist/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
ms609%2FTreeDist/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ms609%2FTreeDist/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ms609","download_url":"https://codeload.github.com/ms609/TreeDist/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246785436,"owners_count":20833490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["phylogenetic-trees","r","r-package","rstats","tree-distances","trees"],"created_at":"2025-01-13T08:45:56.451Z","updated_at":"2026-02-13T15:29:39.589Z","avatar_url":"https://github.com/ms609.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TreeDist\n\n[![Project Status: The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.](http://www.repostatus.org/badges/latest/inactive.svg)](https://www.repostatus.org/#inactive)\n[![codecov](https://codecov.io/gh/ms609/TreeDist/branch/master/graph/badge.svg)](https://codecov.io/gh/ms609/TreeDist)\n[![CRAN Status Badge](http://www.r-pkg.org/badges/version/TreeDist)](https://cran.r-project.org/package=TreeDist)\n[![CRAN Downloads](http://cranlogs.r-pkg.org/badges/TreeDist)](https://cran.r-project.org/package=TreeDist)\n[![DOI](https://zenodo.org/badge/196188301.svg)](https://zenodo.org/badge/latestdoi/196188301)\n\n'TreeDist' is an R package that implements a suite of metrics that quantify the\ntopological distance between pairs of unweighted phylogenetic 
trees.\nIt also includes a simple 'Shiny' application to allow the visualization of\ndistance-based tree spaces, and functions to calculate the information content\nof trees and splits.\n\n'TreeDist' primarily employs metrics in the category of\n'generalized Robinson–Foulds distances': they are based on comparing splits\n(bipartitions) between trees, and thus reflect the relationship data within \ntrees, with no reference to branch lengths.\n\n\n## Generalized RF distances\n\nThe [Robinson-Foulds distance](https://ms609.github.io/TreeDist/articles/Robinson-Foulds.html)\nsimply tallies the number of non-trivial splits (sometimes inaccurately\ntermed clades, nodes or edges) that occur in only one of the two trees – any splits that are\nnot perfectly identical each contribute one point to the distance score, \nhowever similar or different they are.\nBy overlooking potential similarities between almost-identical splits, \nthis conservative approach has undesirable properties.\n\n['Generalized' RF metrics](https://ms609.github.io/TreeDist/articles/Generalized-RF.html)\ngenerate _matchings_ that pair splits in one tree with similar splits in\nthe other.\nEach pair of splits is assigned a similarity score; the sum of these scores in\nthe optimal matching then quantifies the similarity between two trees.\n\nDifferent ways of calculating the similarity between a pair of splits\nlead to different tree distance metrics, implemented in the functions below:\n\n* [`MutualClusteringInfo()`](https://ms609.github.io/TreeDist/reference/TreeDistance.html), [`SharedPhylogeneticInfo()`](https://ms609.github.io/TreeDist/reference/TreeDistance.html)\n    \n    Smith (2020) scores matchings based on the amount of information\n    that one partition contains about the other.  
The Shared Phylogenetic\n    Information assigns zero similarity to split pairs that cannot\n    both exist on a single tree; the Mutual Clustering Information metric is \n    more forgiving, and exhibits more desirable behaviour; it is the \n    recommended metric for tree comparison.\n    (Its complement, \n    [`ClusteringInfoDistance()`](https://ms609.github.io/TreeDist/reference/TreeDistance.html),\n    returns a tree distance.)\n    \n    [![Introduction to the Clustering Information Distance](man/figures/CID_talk.png)](https://durham.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=ca5ede19-d21a-40ce-8b9e-ac6e00d7e2c0)\n\n* [`NyeSimilarity()`](https://ms609.github.io/TreeDist/reference/NyeSimilarity.html)\n    \n    Nye _et al._ (2006) score matchings according to the size of the largest \n    split that is consistent with both of them, normalized against \n    the Jaccard index.  This approach is extended by Böcker _et al._ (2013)\n    with the Jaccard-Robinson-Foulds metric (function \n    [`JaccardRobinsonFoulds()`](https://ms609.github.io/TreeDist/reference/JaccardRobinsonFoulds.html)).\n   \n* [`MatchingSplitDistance()`](https://ms609.github.io/TreeDist/reference/MatchingSplitDistance.html)\n    \n    Bogdanowicz and Giaro (2012) and Lin _et al._ (2012) independently proposed\n    counting the number of 'mismatched' leaves in a pair of splits.\n    [`MatchingSplitInfoDistance()`](https://ms609.github.io/TreeDist/reference/TreeDistance.html)\n    provides an information-based equivalent (Smith 2020).\n    \n\nThe package also implements the variation of the path distance \nproposed by Kendall and Colijn (2016) (function\n[`KendallColijn()`](https://ms609.github.io/TreeDist/reference/KendallColijn.html)),\napproximations of the Nearest-Neighbour Interchange (NNI) distance (function\n[`NNIDist()`](https://ms609.github.io/TreeDist/reference/NNIDist.html); \nfollowing Li _et al._ (1996)), and calculates the size 
(function\n[`MASTSize()`](https://ms609.github.io/TreeDist/reference/MASTSize.html)) and \ninformation content (function\n[`MASTInfo()`](https://ms609.github.io/TreeDist/reference/MASTSize.html)) of the \nMaximum Agreement Subtree.\n\nFor an implementation of the Tree Bisection and Reconnection (TBR) distance, see \nthe package '[TBRDist](https://ms609.github.io/TBRDist/index.html)'.\n\n# Installation\n\nInstall and load the library from CRAN as follows:\n```r\ninstall.packages('TreeDist')\nlibrary('TreeDist')\n```\n\nYou can install the development version of the package with:\n```r\nif(!require(\"curl\")) install.packages(\"curl\")\nif(!require(\"remotes\")) install.packages(\"remotes\")\nremotes::install_github(\"ms609/TreeDist\")\n```\n\n# Tree space analysis\n\nConstruct tree spaces and readily visualize projected landscapes, avoiding\ncommon analytical pitfalls (Smith, 2022),\nusing the inbuilt graphical user interface (Shiny GUI):\n\n```r\nTreeDist::MapTrees()\n```\n\n![image](https://user-images.githubusercontent.com/1695515/164730749-0e4cad5e-dcd5-47c7-80ef-3464e776e0a6.png)\n\nSerious analysts should consult the\n[vignette](https://ms609.github.io/TreeDist/articles/treespace.html)\nfor a command-line interface.\n\n\n# Documentation\n\n- [Using 'TreeDist'](https://ms609.github.io/TreeDist/articles/Using-TreeDist.html)\n\n- [Package functions](https://ms609.github.io/TreeDist/reference/index.html)\n\n- [Tree spaces with 'TreeDist'](https://ms609.github.io/TreeDist/articles/treespace.html)\n\n- [All vignettes](https://ms609.github.io/TreeDist/articles/)\n\n# See also\n\nOther R packages implementing tree distance functions include:\n\n* '[ape](http://ape-package.ird.fr/)':\n    - `cophenetic.phylo()`: Cophenetic distance\n    - `dist.topo()`: Path (topological) distance, Robinson-Foulds distance.\n* '[phangorn](https://cran.r-project.org/package=phangorn)'\n    - `treedist()`: Path, Robinson-Foulds and approximate SPR distances.\n* 
'[Quartet](http://ms609.github.io/Quartet/)': Triplet and Quartet distances, \n  using the tqDist algorithm.\n* '[TBRDist](http://ms609.github.io/TBRDist/)': TBR and SPR distances on \n  unrooted trees, using the 'uspr' C library.\n* '[treespace](https://github.com/thibautjombart/treespace)': Kendall-Colijn\n  distance and tree space visualizations.\n* '[distory](https://cran.r-project.org/package=distory)' (unmaintained):\n  Geodesic distance\n\n# References\n\n- Böcker, S. _et al._ (2013) [The Generalized Robinson-Foulds\nmetric](https://dx.doi.org/10.1007/978-3-642-40453-5_13).\nAlgorithms in Bioinformatics. WABI 2013.\n_Lecture Notes in Computer Science_, 8126, 156–69.\n\n- Bogdanowicz, D. and Giaro, K. (2012) [Matching split distance for unrooted\nbinary phylogenetic trees](https://dx.doi.org/10.1109/TCBB.2011.48).\n_IEEE/ACM Transactions on Computational Biology and Bioinformatics_, 9, 150–160. \n\n- Kendall, M. and Colijn, C. (2016) [Mapping phylogenetic trees to reveal\ndistinct patterns of evolution](https://dx.doi.org/10.1093/molbev/msw124).\n_Mol Biol Evol_, 33, 2735–2743.\n\n- Li, M., Tromp, J. and Zhang, L.-X. (1996) [Some notes on the nearest neighbour\ninterchange distance](https://dx.doi.org/10.1007/3-540-61332-3_168). \n_Computing and Combinatorics_, Goos, G., Hartmanis, J., Leeuwen, J., Cai, J.-Y.,\nand Wong, C. K., eds. Springer, Berlin. 343–351.\n\n- Nye, T.M.W. _et al._ (2006) [A novel algorithm and web-based tool for\ncomparing two alternative phylogenetic\ntrees](https://dx.doi.org/10.1093/bioinformatics/bti720).\n_Bioinformatics_, 22, 117–119.\n\n- Smith, M.R. (2020) [Information theoretic Generalized Robinson-Foulds\nmetrics for comparing phylogenetic \ntrees](https://dx.doi.org/10.1093/bioinformatics/btaa614).\n_Bioinformatics_, 36, 5007–5013.\n\n- Smith, M.R. 
(2022) [Robust analysis of phylogenetic tree\nspace](https://dx.doi.org/10.1093/sysbio/syab100).\n_Systematic Biology_, 71, 1255–1270.\n\n\nPlease note that the 'TreeDist' project is released with a\n[Contributor Code of Conduct](https://ms609.github.io/TreeDist/CODE_OF_CONDUCT.html).\nBy contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fms609%2Ftreedist","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fms609%2Ftreedist","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fms609%2Ftreedist/lists"}