{"id":15644961,"url":"https://github.com/terrytangyuan/dml","last_synced_at":"2025-08-26T12:08:33.890Z","repository":{"id":56937737,"uuid":"41424588","full_name":"terrytangyuan/dml","owner":"terrytangyuan","description":"R package for Distance Metric Learning","archived":false,"fork":false,"pushed_at":"2023-07-07T16:55:45.000Z","size":260,"stargazers_count":58,"open_issues_count":10,"forks_count":30,"subscribers_count":16,"default_branch":"master","last_synced_at":"2025-07-06T22:42:15.965Z","etag":null,"topics":["dimensionality-reduction","distance-metric-learning","machine-learning","metric-learning","r","statistics"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/terrytangyuan.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":"terrytangyuan"}},"created_at":"2015-08-26T12:35:54.000Z","updated_at":"2024-10-20T17:02:13.000Z","dependencies_parsed_at":"2022-08-21T01:10:09.702Z","dependency_job_id":"272c645b-e1eb-44c8-9e62-56392a64ef65","html_url":"https://github.com/terrytangyuan/dml","commit_stats":{"total_commits":91,"total_committers":12,"mean_commits":7.583333333333333,"dds":0.5934065934065934,"last_synced_commit":"b3d9a11171663221600b6c44291c58fc1f623c71"},"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/terrytangyuan/dml","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrytangyuan%2Fdml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/terrytangyuan%2Fdml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrytangyuan%2Fdml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrytangyuan%2Fdml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/terrytangyuan","download_url":"https://codeload.github.com/terrytangyuan/dml/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/terrytangyuan%2Fdml/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264705602,"owners_count":23652156,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dimensionality-reduction","distance-metric-learning","machine-learning","metric-learning","r","statistics"],"created_at":"2024-10-03T12:03:44.890Z","updated_at":"2025-08-21T00:32:00.640Z","avatar_url":"https://github.com/terrytangyuan.png","language":"R","funding_links":["https://github.com/sponsors/terrytangyuan","https://github.com/sponsors/terrytangyuan)!"],"categories":[],"sub_categories":[],"readme":"**Note**: This package has been maintained by [@terrytangyuan](https://github.com/terrytangyuan) since 2015. 
Please [consider sponsoring](https://github.com/sponsors/terrytangyuan)!\n\n[![JOSS DOI](https://joss.theoj.org/papers/10.21105/joss.01036/status.svg)](https://doi.org/10.21105/joss.01036)\n[![CRAN Status](https://www.r-pkg.org/badges/version/dml)](https://cran.r-project.org/package=dml)\n[![Coverage Status](https://coveralls.io/repos/terrytangyuan/dml/badge.svg?branch=master)](https://coveralls.io/r/terrytangyuan/dml?branch=master)\n[![Downloads from the RStudio CRAN mirror](https://cranlogs.r-pkg.org/badges/grand-total/dml)](https://cran.r-project.org/package=dml)\n[![License](https://img.shields.io/:license-mit-blue.svg?style=flat)](https://badges.mit-license.org)\n[![Zenodo DOI](https://zenodo.org/badge/41424588.svg)](https://zenodo.org/badge/latestdoi/41424588)\n\n# dml (Distance Metric Learning in R)\n\nR package for a collection of *Distance Metric Learning* algorithms, including global and local methods such as *Relevant Component Analysis*, *Discriminative Component Analysis*, *Local Fisher Discriminant Analysis*, etc. 
These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.\n\n## Installation\n\nInstall the current release from CRAN:\n\n```r\ninstall.packages(\"dml\")\n```\n\nOr, try the latest development version from GitHub:\n\n```r\ndevtools::install_github(\"terrytangyuan/dml\")\n```\n\n## Examples\n\n### Relevant Component Analysis\n\n```r\nlibrary(\"MASS\")\n\n# generate synthetic multivariate normal data\nset.seed(42)\n\nk \u003c- 100L # sample size of each class\nn \u003c- 3L # specify how many classes\nN \u003c- k * n # total sample size\n\nx1 \u003c- mvrnorm(k, mu = c(-16, 8), matrix(c(15, 1, 2, 10), ncol = 2))\nx2 \u003c- mvrnorm(k, mu = c(0, 0), matrix(c(15, 1, 2, 10), ncol = 2))\nx3 \u003c- mvrnorm(k, mu = c(16, -8), matrix(c(15, 1, 2, 10), ncol = 2))\nx \u003c- as.data.frame(rbind(x1, x2, x3)) # predictors\ny \u003c- gl(n, k) # response\n\n# fully labeled data set with 3 classes\n# need to use a line in 2D to classify\nplot(x[, 1L], x[, 2L],\n  bg = c(\"#E41A1C\", \"#377EB8\", \"#4DAF4A\")[y],\n  pch = rep(c(22, 21, 25), each = k)\n)\nabline(a = -10, b = 1, lty = 2)\nabline(a = 12, b = 1, lty = 2)\n```\n\n\u003cimg src=\"docs/imgs/rca-example-part1.png\"/\u003e\n\n```r\n# generate synthetic chunklets\nchunks \u003c- vector(\"list\", 300)\nfor (i in 1:100) chunks[[i]] \u003c- sample(1L:100L, 10L)\nfor (i in 101:200) chunks[[i]] \u003c- sample(101L:200L, 10L)\nfor (i in 201:300) chunks[[i]] \u003c- sample(201L:300L, 10L)\n\nchks \u003c- x[unlist(chunks), ]\n\n# make \"chunklet\" vector to feed the chunks argument\nchunksvec \u003c- rep(-1L, nrow(x))\nfor (i in 1L:length(chunks)) {\n  for (j in 1L:length(chunks[[i]])) {\n    chunksvec[chunks[[i]][j]] \u003c- i\n  }\n}\n\n# relevant component analysis\nrcs \u003c- rca(x, chunksvec)\n\n# learned transformation of the data\nrcs$A\n#\u003e           [,1]       [,2]\n#\u003e [1,] -3.181484 
-0.8812647\n#\u003e [2,] -1.196200  2.3438640\n\n# learned Mahalanobis distance metric\nrcs$B\n#\u003e           [,1]     [,2]\n#\u003e [1,] 10.898467 1.740125\n#\u003e [2,]  1.740125 6.924592\n\n# whitening transformation applied to the chunklets\nchkTransformed \u003c- as.matrix(chks) %*% rcs$A\n\n# original data after applying RCA transformation\n# easier to classify - using only horizontal lines\nxnew \u003c- rcs$newX\nplot(xnew[, 1L], xnew[, 2L],\n  bg = c(\"#E41A1C\", \"#377EB8\", \"#4DAF4A\")[gl(n, k)],\n  pch = c(rep(22, k), rep(21, k), rep(25, k))\n)\nabline(a = -15, b = 0, lty = 2)\nabline(a = 16, b = 0, lty = 2)\n```\n\n\u003cimg src=\"docs/imgs/rca-example-part2.png\"/\u003e\n\n### Other Examples\n\nFor examples of Local Fisher Discriminant Analysis, please take a look at the separate package [here](https://github.com/terrytangyuan/lfda). For examples of all other implemented algorithms, please take a look at the dml [package reference manual](https://cran.r-project.org/web/packages/dml/dml.pdf). \n\n## Brief Introduction\n\nDistance metrics are widely used in the machine learning literature. Traditionally, a distance metric is chosen a priori (Euclidean distance, L1 distance, etc.) or by cross-validation within a small class of functions (e.g. choosing the order of a polynomial kernel). However, with prior knowledge of the data, we can learn a more suitable distance metric using (semi-)supervised distance metric learning techniques. dml is an R package that aims to implement a collection of algorithms for (semi-)supervised distance metric learning. 
These distance metric learning methods are widely applied in feature extraction, dimensionality reduction, clustering, classification, information retrieval, and computer vision problems.\n\n## Algorithms\n\nAlgorithms planned in the first development stage:\n\n  * Supervised Global Distance Metric Learning:\n  \n    * Relevant Component Analysis (RCA) - implemented\n    * Kernel Relevant Component Analysis (KRCA)\n    * Discriminative Component Analysis (DCA) - implemented\n    * Kernel Discriminative Component Analysis (KDCA)\n    * Global Distance Metric Learning by Convex Programming - implemented\n\n  * Supervised Local Distance Metric Learning:\n\n    * Local Fisher Discriminant Analysis - implemented\n    * Kernel Local Fisher Discriminant Analysis - implemented\n    * Information-Theoretic Metric Learning (ITML)\n    * Large Margin Nearest Neighbor Classifier (LMNN)\n    * Neighbourhood Components Analysis (NCA)\n    * Localized Distance Metric Learning (LDM)\n\nThe algorithms and routines might be adjusted during development.\n\n## Contribute \u0026 Code of Conduct\n\nTo contribute to this project, please take a look at the [Contributing Guidelines](CONTRIBUTING.md) first. Please note that this project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.\n\n## Contact\n\nContact the maintainer of this package:\nYuan Tang \u003cterrytangyuan@gmail.com\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterrytangyuan%2Fdml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fterrytangyuan%2Fdml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fterrytangyuan%2Fdml/lists"}