{"id":13423747,"url":"https://github.com/mpadge/spatialcluster","last_synced_at":"2025-03-20T01:31:44.491Z","repository":{"id":77959961,"uuid":"125113674","full_name":"mpadge/spatialcluster","owner":"mpadge","description":"spatially-constrained clustering in R","archived":false,"fork":false,"pushed_at":"2022-11-10T10:57:16.000Z","size":13098,"stargazers_count":30,"open_issues_count":11,"forks_count":6,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-08-01T00:38:41.518Z","etag":null,"topics":["cluster","clustering-algorithm","r","spatial"],"latest_commit_sha":null,"homepage":"https://mpadge.github.io/spatialcluster/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mpadge.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":"codemeta.json"}},"created_at":"2018-03-13T20:56:49.000Z","updated_at":"2023-08-23T09:09:47.000Z","dependencies_parsed_at":"2023-02-25T12:15:26.315Z","dependency_job_id":null,"html_url":"https://github.com/mpadge/spatialcluster","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mpadge%2Fspatialcluster","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mpadge%2Fspatialcluster/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mpadge%2Fspatialcluster/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mpadge%2Fspatialcluster/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mpa
dge","download_url":"https://codeload.github.com/mpadge/spatialcluster/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":219865982,"owners_count":16555921,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cluster","clustering-algorithm","r","spatial"],"created_at":"2024-07-31T00:00:41.652Z","updated_at":"2025-03-20T01:31:44.447Z","avatar_url":"https://github.com/mpadge.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set (\n    collapse = TRUE,\n    comment = \"#\u003e\",\n    fig.path = \"man/figures/README-\"\n)\n```\n\n[![R build status](https://github.com/mpadge/spatialcluster/workflows/R-CMD-check/badge.svg)](https://github.com/mpadge/spatialcluster/actions?query=workflow%3AR-CMD-check)\n[![Project Status: WIP](http://www.repostatus.org/badges/latest/wip.svg)](http://www.repostatus.org/#wip)\n[![codecov](https://codecov.io/gh/mpadge/spatialcluster/branch/master/graph/badge.svg)](https://codecov.io/gh/mpadge/spatialcluster)\n\n# spatialcluster\n\nAn **R** package for spatially-constrained clustering using either distance or\ncovariance matrices. 
\"*Spatially-constrained*\" means that the data from which\nclusters are to be formed also map on to spatial coordinates, and the\nconstraint is that clusters must be spatially contiguous.\n\nThe package includes both an implementation of the\nREDCAP collection of efficient yet approximate algorithms described in [D. Guo's\n2008 paper, \"Regionalization with dynamically constrained agglomerative\nclustering and\npartitioning.\"](https://www.tandfonline.com/doi/abs/10.1080/13658810701674970)\n(pdf available\n[here](https://pdfs.semanticscholar.org/ead1/7df8aaa1aed0e433b3ae1ec1ec5c7e785b2b.pdf)),\nwith extension to covariance matrices, and a new technique for computing\nclusters using complete data sets. The package is also designed to analyse\nmatrices of spatial interactions (counts, densities) between sets of origin and\ndestination points. The spatial structure of interaction matrices can be\nstatistically analysed to yield both global statistics for the overall spatial\nstructure, and local statistics for individual clusters.\n\n\n## Installation\n\nThe easiest way to install `spatialcluster` is by enabling the [corresponding\n`r-universe`](https://mpadge.r-universe.dev/):\n\n```{r r-univ, eval = FALSE}\noptions (repos = c (\n    mpadge = \"https://mpadge.r-universe.dev\",\n    CRAN = \"https://cloud.r-project.org\"\n))\n```\n\nThe package can then be installed as usual with:\n\n```{r install, eval = FALSE}\ninstall.packages (\"spatialcluster\")\n```\n\nAlternatively, the package can also be installed using any of the following\noptions:\n\n```{r gh-installation, eval = FALSE}\n# install.packages(\"remotes\")\nremotes::install_git (\"https://codeberg.org/mpadge/spatialcluster\")\nremotes::install_git (\"https://git.sr.ht/~mpadge/spatialcluster\")\nremotes::install_bitbucket (\"mpadge/spatialcluster\")\nremotes::install_gitlab (\"mpadge/spatialcluster\")\nremotes::install_github (\"mpadge/spatialcluster\")\n```\n\n## Usage\n\nThe two main functions, 
`scl_redcap()` and `scl_full()`, implement different\nalgorithms for spatial clustering. The former implements the REDCAP collection\nof efficient yet approximate algorithms described in [D. Guo's 2008 paper,\n\"Regionalization with dynamically constrained agglomerative clustering and\npartitioning.\"](https://www.tandfonline.com/doi/abs/10.1080/13658810701674970)\n(pdf available\n[here](https://pdfs.semanticscholar.org/ead1/7df8aaa1aed0e433b3ae1ec1ec5c7e785b2b.pdf)),\nwith extension here to apply clustering to covariance matrices. These\nalgorithms are computationally efficient yet generate only *approximate*\nestimates of underlying clusters. The second function, `scl_full()`, trades\ncomputational efficiency for accuracy, through generating clustering schemes\nusing all available data.\n\nIn short:\n\n- `scl_full()` should always be preferred as long as it returns results within\n  a reasonable amount of time\n- `scl_redcap()` should be used only where data are too large for `scl_full()`\n  to be run in a reasonable time.\n\nFor clustering a group of `n` points, both of these functions require three\nmain arguments:\n\n1. A rectangular matrix of spatial coordinates of points to be clustered (`n`\n    rows; at least 2 columns);\n2. An `n`-by-`n` square matrix quantifying relationships between those points;\n3. 
A single value (`ncl`) specifying the desired number of clusters.\n\nThe following code demonstrates usage with randomly-generated data:\n\n```{r}\nset.seed (1)\nn \u003c- 100\nxy \u003c- matrix (runif (2 * n), ncol = 2)\ndmat \u003c- matrix (runif (n^2), ncol = n)\n```\n\nThen load the package and call the function:\n\n```{r full-single, echo = TRUE, eval = TRUE}\nlibrary (spatialcluster)\nscl \u003c- scl_full (xy, dmat, ncl = 8)\nplot (scl)\n```\n\nBoth functions return a `list` with the following components:\n\n```{r list-components}\nnames (scl)\n```\n\n- `tree` details distances and cluster numbers for all pairwise comparisons\n  between objects.\n- `merges` details increasing distances at which each pair of objects was\n  merged into a single cluster.\n- `ord` provides the order of the merges (for `scl_full()` only).\n- `nodes` records the spatial coordinates of each point (node) of the input\n  data.\n- `pars` retains the parameters used to call the clustering function.\n- `statistics` returns the clustering statistics, both for individual clusters\n  and an overall global statistic for the clustering scheme as a whole.\n\nSee the \"_Get Started_\" vignette for more details.\n\n## A Cautionary Note\n\nThe following plot compares the results of applying four different clustering\nalgorithms to the same data.\n\n```{r cautionary, eval = TRUE, fig.width = 7, fig.height = 7}\nlibrary (ggplot2)\nlibrary (gridExtra)\nscl \u003c- scl_full (xy, dmat, ncl = 8, linkage = \"single\")\np1 \u003c- plot (scl) + ggtitle (\"full-single\")\nscl \u003c- scl_redcap (xy, dmat, ncl = 8, linkage = \"single\")\np2 \u003c- plot (scl) + ggtitle (\"redcap-single\")\nscl \u003c- scl_redcap (xy, dmat, ncl = 8, linkage = \"average\")\np3 \u003c- plot (scl) + ggtitle (\"redcap-average\")\nscl \u003c- scl_redcap (xy, dmat, ncl = 8, linkage = \"complete\")\np4 \u003c- plot (scl) + ggtitle (\"redcap-complete\")\n\ngrid.arrange (p1, p2, p3, p4, ncol = 2)\n```\n\n\nThis example illustrates the 
universal danger in all clustering algorithms: they\ncannot fail to produce results, even when the data fed to them are definitely\ndevoid of any information, as in this example. Clustering algorithms should only\nbe applied to reflect a very specific hypothesis for why data should be\nclustered in the first place; spatial clustering algorithms should only be\napplied to reflect two very specific hypotheses for (i) why data should be\nclustered at all, and (ii) why those clusters should manifest a spatial\npattern.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmpadge%2Fspatialcluster","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmpadge%2Fspatialcluster","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmpadge%2Fspatialcluster/lists"}