{"id":13935914,"url":"https://github.com/volfpeter/localclustering","last_synced_at":"2025-05-07T13:45:01.566Z","repository":{"id":62576549,"uuid":"106163018","full_name":"volfpeter/localclustering","owner":"volfpeter","description":"Python 3 implementation and documentation of the Hermina-Janos local graph clustering algorithm.","archived":false,"fork":false,"pushed_at":"2023-01-22T10:03:15.000Z","size":2601,"stargazers_count":21,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-26T01:41:25.069Z","etag":null,"topics":["cluster","cluster-analysis","clustering","clustering-algorithm","graph-algorithms","graph-theory","hierarchical-clustering","local-clustering","python","python3","ranking","social-network-analysis"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/volfpeter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-10-08T09:04:20.000Z","updated_at":"2024-11-05T09:25:21.000Z","dependencies_parsed_at":"2023-02-12T15:31:10.603Z","dependency_job_id":null,"html_url":"https://github.com/volfpeter/localclustering","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volfpeter%2Flocalclustering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volfpeter%2Flocalclustering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/volfpeter%2Flocalclustering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/volfpeter%2Flocalclustering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/volfpeter","download_url":"https://codeload.github.com/volfpeter/localclustering/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242980588,"owners_count":20216283,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cluster","cluster-analysis","clustering","clustering-algorithm","graph-algorithms","graph-theory","hierarchical-clustering","local-clustering","python","python3","ranking","social-network-analysis"],"created_at":"2024-08-07T23:02:11.829Z","updated_at":"2025-03-11T05:32:06.927Z","avatar_url":"https://github.com/volfpeter.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"[![DOI](http://joss.theoj.org/papers/10.21105/joss.00960/status.svg)](https://doi.org/10.21105/joss.00960) [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.1443550.svg)](https://doi.org/10.5281/zenodo.1443550)\n[![Downloads](https://pepy.tech/badge/localclustering)](https://pepy.tech/project/localclustering)\n\n# LocalClustering\n\nThe project implements multiple variations of a *local* graph clustering algorithm named the *Hermina-Janos algorithm* in memory of my beloved grandparents.\n\nGraph cluster analysis is used in a wide variety of fields. 
This project does not target one specific field; instead, it aims to be a general tool for graph cluster analysis in cases where global cluster analysis is not applicable or practical, for example because of the size of the data set or because a different (local) perspective is required.\n\nThe algorithms are independent of the cluster definition. The interface that cluster definitions must implement can be found in the `definitions` package, along with a simple connectivity-based cluster definition implementation. Besides the algorithms and the cluster definition, other utilities are also provided, most notably a module for node `ranking`.\n\n## Installation\n\nInstall the latest version of the project from the Python Package Index using `pip install localclustering`.\n\n## Getting started\n\nThis section will guide you through the basics using `SQLAlchemy` and the `IGraphWrapper` graph implementation from `graphscraper`. `IGraphWrapper` requires the `igraph` project to be installed. You can do this by following the instructions on [this page](http://igraph.org/python/).\n\nOnce everything is in place, the analyzed graph can be created:\n\n```Python\nimport igraph\nfrom graphscraper.igraphwrapper import IGraphWrapper\n\ngraph = IGraphWrapper(igraph.Graph.Famous(\"Zachary\"))\n```\n\nThe next step is the creation of the cluster definition and the preparation of the clustering algorithm:\n\n```Python\nfrom localclustering.definitions.connectivity import ConnectivityClusterDefinition\nfrom localclustering.localengine import LocalClusterEngine\n\ncluster_definition = ConnectivityClusterDefinition(1.5, 0.85)\nlocal_cluster_engine = LocalClusterEngine(\n    cluster_definition,  # The cluster definition the algorithm should use.\n    source_nodes_in_result=True,  # Ensure that source nodes are not removed from the cluster.\n    max_cluster_size=34  # Specify an upper limit for the calculated cluster's size.\n)\n```\n\nNow the source node of the clustering must be 
retrieved:\n\n```Python\nsource_node = graph.nodes.get_node_by_name(\"2\", can_validate_and_load=True)\n```\n\nAnd finally, the cluster analysis can be executed:\n\n```Python\ncluster = local_cluster_engine.cluster([source_node])\n```\n\nAdditionally, you can list the nodes inside the cluster with their rank to get an overview of the result:\n\n```Python\nrank_provider = local_cluster_engine.get_rank_provider()\nfor node in cluster.nodes:\n    print(node.igraph_index, rank_provider.get_node_rank(node))\n```\n\n![Example visualization of the result: the source node is diamond shaped, red nodes are part of the cluster, light blue nodes mark the neighborhood of the cluster, and the size of a node corresponds to its rank.](documents/Zachary_2.png \"Example visualization of the result: the source node is diamond shaped, red nodes are part of the cluster, light blue nodes mark the neighborhood of the cluster, and the size of a node corresponds to its rank.\")\n\n## Additional resources\n\nIn addition to the software, a detailed [description](documents/algorithm.rst) and an in-depth [evaluation](documents/Algorithm%20Analysis%20with%20the%20Spotify%20Related%20Artists%20Graph.ipynb) of the algorithms are also provided.\n\nFurthermore, a `demo` module showing the basic usage of the project is also available.\n\n## Related projects\n\nYou can find related projects here:\n\n- [graphscraper](https://github.com/volfpeter/graphscraper)\n\n## Community guidelines\n\nAny form of constructive contribution is welcome:\n\n- Questions, feedback, bug reports: please open an issue in the issue tracker of the project or contact the repository owner by email, whichever you feel appropriate.\n- Contribution to the software: please open an issue in the issue tracker of the project that describes the changes you would like to make to the software and open a pull request with the changes. 
The description of the pull request must reference the corresponding issue.\n\nThe following types of contribution are especially appreciated:\n\n- Implementation of new cluster definitions.\n- Result comparison with global clustering algorithms on well-known and well-analyzed graphs.\n- Analysis of how cluster definitions should be configured for graphs with different characteristics.\n- Analysis of how the weighting coefficients of the connectivity-based cluster definition corresponding to the different hierarchy levels relate to each other in different real-world graphs.\n\n## License - GNU AGPLv3\n\nThe library is open-sourced under the conditions of the GNU Affero General Public [License](https://choosealicense.com/licenses/agpl-3.0/) v3.0, which is the strongest copyleft license. The reason for using this license is that this library is the \"publication\" of the *Hermina-Janos algorithm* and it should be referenced accordingly.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvolfpeter%2Flocalclustering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvolfpeter%2Flocalclustering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvolfpeter%2Flocalclustering/lists"}