{"id":19787427,"url":"https://github.com/adriacabeza/graphclustering","last_synced_at":"2026-06-10T12:31:59.367Z","repository":{"id":40974050,"uuid":"220512307","full_name":"adriacabeza/GraphClustering","owner":"adriacabeza","description":":milky_way: Method to partition large networks into communities","archived":false,"fork":false,"pushed_at":"2023-10-03T21:39:03.000Z","size":127152,"stargazers_count":2,"open_issues_count":4,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-11T03:41:52.295Z","etag":null,"topics":["clustering-methods","graphs","large-network","python3","spectral-clustering"],"latest_commit_sha":null,"homepage":"","language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adriacabeza.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-11-08T17:04:47.000Z","updated_at":"2024-04-26T13:59:28.000Z","dependencies_parsed_at":"2022-08-29T04:41:49.244Z","dependency_job_id":null,"html_url":"https://github.com/adriacabeza/GraphClustering","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adriacabeza%2FGraphClustering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adriacabeza%2FGraphClustering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adriacabeza%2FGraphClustering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adriacabeza%2FGraphClustering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/adriacabeza","download_url":"https://c
odeload.github.com/adriacabeza/GraphClustering/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241120209,"owners_count":19913019,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering-methods","graphs","large-network","python3","spectral-clustering"],"created_at":"2024-11-12T06:22:59.456Z","updated_at":"2026-06-10T12:31:59.353Z","avatar_url":"https://github.com/adriacabeza.png","language":"TeX","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ch1 align=\"center\"\u003e:milky_way: Graph Clustering into communities \u003c/h1\u003e\n\n[![HitCount](http://hits.dwyl.io/adriacabeza/object-cut.svg)](http://hits.dwyl.io/AlbertSuarez/GraphClustering)\n[![contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat)](https://github.com/adriacabeza/GraphClustering)\n[![made-with-python](https://img.shields.io/badge/Made%20with-Python-1f425f.svg)](https://www.python.org/)\n[![GitHub stars](https://img.shields.io/github/stars/adriacabeza/GraphClustering.svg)](https://GitHub.com/adriacabeza/GraphClustering/stargazers/)\n\n\n\n\nWe will be using the following graphs from the Stanford Network Analysis Project (SNAP): ca-GrQc, Oregon-1, roadNet-CA, soc-Epinions1, and web-NotreDame (http://snap.stanford.edu/data/index.html). Project description in [*project.pdf*](./project.pdf) and final report in [*report.pdf*](./report/report.pdf). 
\n\n## Initial example visualization and clustering of the graph ca-GrQc\n\u003cp float=\"center\"\u003e\n  \u003cimg src=\"docs/images/ca-GrQc_kamada_kawai_graph_colormap2clusters.png\" width=\"405\"/\u003e\n  \u003cimg src=\"docs/images/ca-GrQcSpectralClustering2D.png\" width=\"390\"/\u003e \n\u003c/p\u003e\n\u003cp align=\"center\"\u003e Kamada-Kawai visualization of the ca-GrQc graph, and its clustering using the spectral embedding. \u003c/p\u003e\n\n## Statistics of graph datasets\n| Graph         | #vertices | #edges  | #clusters |\n|---------------|-----------|---------|-----------|\n| ca-GrQc       | 4158      | 13428   | 2         |\n| Oregon-1      | 10670     | 22002   | 5         |\n| soc-Epinions1 | 75877     | 405739  | 10        |\n| web-NotreDame | 325729    | 1117563 | 20        |\n| roadNet-CA    | 1957027   | 2760388 | 50        |\n \n## Run it\n\n### Requirements\nRequires Python 3. Install the dependencies:\n```bash\npip install -r requirements.txt\n```\n\n### Recommendations\nUsing a [virtualenv](https://realpython.com/blog/python/python-virtual-environments-a-primer/) is recommended to isolate packages and the runtime.\n\n### Usage\nRun the clustering algorithm from the main Python file *graph_clustering.py*. You can read the argument help and find command examples in *EXPERIMENTS.sh*. 
List of arguments:\n\n- *seed*: Random seed.\n- *iterations*: Number of iterations, each with a different seed.\n- *file*: Path of the input graph file.\n- *outputs_path*: Path where the outputs are saved.\n- *clustering*: Clustering method: \"kmeans\", \"custom_kmeans\", \"kmeans_sklearn\", \"xmeans\" or \"agglomerative\".\n- *random_centroids*: Random centroid initialization for \"custom_kmeans\".\n- *distance_metric*: Distance metric for \"custom_kmeans\": \"MINKOWSKI\", \"CHEBYSHEV\" or \"EUCLIDEAN\".\n- *compute_eig*: Compute the eigenvectors or load them.\n- *k*: Number of desired clusters.\n- *networkx*: Use the networkx library to compute the Laplacian.\n- *eig_kept*: Number of eigenvectors kept.\n- *normalize_laplacian*: Normalize the Laplacian.\n- *invert_laplacian*: Invert the Laplacian.\n- *second*: Use only the eigenvector of the second-smallest eigenvalue.\n- *eig_normalization*: Normalize the eigenvectors by \"vertex\", \"eig\" or \"None\".\n\n## Authors\n\n👤 Álvaro Orgaz Expósito ([alvarorgaz](https://github.com/alvarorgaz))\n\n👤 Adrià Cabeza ([adriacabeza](https://github.com/adriacabeza))\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadriacabeza%2Fgraphclustering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadriacabeza%2Fgraphclustering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadriacabeza%2Fgraphclustering/lists"}