{"id":19140054,"url":"https://github.com/sahar-dev/clustering","last_synced_at":"2025-02-22T19:12:36.146Z","repository":{"id":209926973,"uuid":"725283223","full_name":"Sahar-dev/Clustering","owner":"Sahar-dev","description":"The code is designed to analyze a dataset using K-Means clustering and Hierarchical Agglomerative Clustering (CAH) using R.","archived":false,"fork":false,"pushed_at":"2023-11-29T20:27:17.000Z","size":321,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-03T14:50:06.098Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Sahar-dev.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-11-29T20:25:24.000Z","updated_at":"2023-11-29T20:27:21.000Z","dependencies_parsed_at":"2023-11-29T21:31:47.882Z","dependency_job_id":"d69575f9-e70c-4a1f-893d-63298e50fb2e","html_url":"https://github.com/Sahar-dev/Clustering","commit_stats":null,"previous_names":["sahar-dev/clustering"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahar-dev%2FClustering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahar-dev%2FClustering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahar-dev%2FClustering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Sahar-dev%2FClustering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Sahar-dev","download_ur
l":"https://codeload.github.com/Sahar-dev/Clustering/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240222507,"owners_count":19767458,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T07:16:06.488Z","updated_at":"2025-02-22T19:12:36.127Z","avatar_url":"https://github.com/Sahar-dev.png","language":"Jupyter Notebook","readme":"# Clustering Analysis Documentation\n\n## Overview\n\nThis documentation describes the R code used for the clustering analysis. The code applies K-Means clustering and Hierarchical Agglomerative Clustering (CAH) to a dataset. The analysis covers descriptive statistics, visualizations, and a search for the optimal number of clusters.\n\n## Instructions\n\n### 1. Loading the Data\n\nThe dataset is loaded from the \"data.csv\" file with the `read.csv2` function, which by default expects semicolon-separated fields and a comma as the decimal mark.\n\n### 2. Descriptive Statistics\n\nDescriptive statistics are computed for each variable in the dataset to summarize the data's characteristics before clustering.\n\n### 3. K-Means Clustering\n\nK-Means clustering is run with the `kmeans` function to partition the dataset into three clusters. 
The resulting clusters are shown in a scatter plot produced with the `fviz_cluster` function.\n\n### 4. Hierarchical Clustering (CAH)\n\nHierarchical Agglomerative Clustering (CAH) is applied to the dataset to produce a three-cluster configuration. The dendrogram produced with `fviz_dend` shows the hierarchical clustering results.\n\n### 5. Inertia Analysis and Elbow Method\n\nAn inertia analysis is used to choose the number of clusters. The code iterates over a range of cluster counts, computes the inertias for each, and plots the results. The elbow method is then applied to identify the point that indicates the optimal cluster count.\n\n### 6. Visualizing Inertia Analysis\n\nThe results of the inertia analysis are shown in a line plot of within-cluster, between-cluster, and total inertia for each cluster configuration.\n\n### 7. R-Squared Evolution in K-Means Clustering\n\nR-squared values are computed across the cluster configurations and plotted, which helps in interpreting clustering quality.\n\n### 8. Analysis of Variables in Clusters\n\nThe code computes the squared correlation ratio (η²) for each variable within the clusters and supports the findings with hypothesis testing.\n\n## Conclusion\n\nThis clustering analysis aims to uncover patterns and structure within the dataset. 
The resulting insights help in understanding the underlying data.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsahar-dev%2Fclustering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsahar-dev%2Fclustering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsahar-dev%2Fclustering/lists"}