{"id":17293383,"url":"https://github.com/xuyxu/clustering","last_synced_at":"2025-08-25T07:32:55.385Z","repository":{"id":46168071,"uuid":"78267263","full_name":"xuyxu/Clustering","owner":"xuyxu","description":"Clustering / Subspace Clustering Algorithms on MATLAB","archived":false,"fork":false,"pushed_at":"2020-10-28T11:21:08.000Z","size":37,"stargazers_count":230,"open_issues_count":0,"forks_count":89,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-08-24T12:48:03.306Z","etag":null,"topics":["clustering","clustering-algorithm","subspace-clustering","subspace-kmeans"],"latest_commit_sha":null,"homepage":"","language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xuyxu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-01-07T08:13:44.000Z","updated_at":"2025-07-11T02:28:57.000Z","dependencies_parsed_at":"2022-09-24T15:01:12.631Z","dependency_job_id":null,"html_url":"https://github.com/xuyxu/Clustering","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/xuyxu/Clustering","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xuyxu%2FClustering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xuyxu%2FClustering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xuyxu%2FClustering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xuyxu%2FClustering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xuyxu","download_url":"https://codeload.github.com/xuyxu/Clustering/tar.gz/refs/heads/master","sbom_
url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xuyxu%2FClustering/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":272024541,"owners_count":24860528,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-25T02:00:12.092Z","response_time":1107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering","clustering-algorithm","subspace-clustering","subspace-kmeans"],"created_at":"2024-10-15T10:47:59.997Z","updated_at":"2025-08-25T07:32:55.355Z","avatar_url":"https://github.com/xuyxu.png","language":"MATLAB","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Clustering/Subspace Clustering Algorithms on MATLAB\n\n**This repo is no longer in active development. However, reports of problems with the implementations of existing algorithms are welcome. [Oct, 2020]**\n\n## 1. 
Clustering Algorithms\n- **K-means**\n- **K-means++**\n    - Generally speaking, this algorithm is similar to **K-means**;\n    - Unlike classic K-means, which chooses its initial centroids at random, **K-means++** integrates a better initialization procedure in which observations far from existing centroids have higher probabilities of being chosen as the next centroid;\n    - The initialization procedure can be implemented using Fitness Proportionate Selection.\n- **ISODATA (Iterative Self-Organizing Data Analysis)**\n    - To be brief, **ISODATA** introduces two additional operations: Splitting and Merging;\n    - When the number of observations within one class is less than a pre-defined threshold, **ISODATA** merges the two classes with the minimum between-class distance;\n    - When the within-class variance of one class exceeds a pre-defined threshold, **ISODATA** splits this class into two different sub-classes.\n- **Mean Shift**\n    - For each point *x*, find its neighbors, calculate their mean vector *m*, and update *x = m*, repeating until *x == m*;\n    - It is a non-parametric model, so there is no need to specify the number of classes;\n    - No prior assumption about cluster structure is required.\n- **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**\n    - Starting with pre-selected core objects, DBSCAN extends each cluster based on the connectivity between data points;\n    - DBSCAN takes noisy data into consideration and is hence robust to outliers;\n    - Choosing good parameters can be hard without prior knowledge.\n- **Gaussian Mixture Model (GMM)**\n- **LVQ (Learning Vector Quantization)**\n\n## 2. Subspace Clustering Algorithms\n- **Subspace K-means**\n    - This algorithm directly extends **K-means** to Subspace Clustering by multiplying each dimension *d\u003csub\u003ej\u003c/sub\u003e* by one weight *m\u003csub\u003ej\u003c/sub\u003e* (s.t. sum(*m\u003csub\u003ej\u003c/sub\u003e*)=1, *j*=1,2,...,*p*);\n    - It can be efficiently solved in an Expectation-Maximization (EM) fashion. 
In each E-step, it updates the weights and centroids using Lagrange multipliers;\n    - This basic algorithm tends to favor just a few dimensions when clustering sparse data.\n- **Entropy-Weighting Subspace K-means**\n    - Generally speaking, this algorithm is similar to **Subspace K-means**;\n    - In addition, it introduces a regularization term related to weight entropy into the objective function, in order to mitigate the aforementioned problem in **Subspace K-means**;\n    - Apart from its succinctness and efficiency, it works well on a broad range of real-world datasets.\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxuyxu%2Fclustering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fxuyxu%2Fclustering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fxuyxu%2Fclustering/lists"}