{"id":15442851,"url":"https://github.com/erdogant/clustimage","last_synced_at":"2025-09-08T00:32:09.288Z","repository":{"id":41226941,"uuid":"423822054","full_name":"erdogant/clustimage","owner":"erdogant","description":"clustimage is a python package for unsupervised clustering of images.","archived":false,"fork":false,"pushed_at":"2025-04-24T19:43:00.000Z","size":124016,"stargazers_count":107,"open_issues_count":6,"forks_count":8,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-09-04T18:56:04.595Z","etag":null,"topics":["clustering","image-analysis","image-processing","python3"],"latest_commit_sha":null,"homepage":"https://erdogant.github.io/clustimage","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/erdogant.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["erdogant"],"buy_me_a_coffee":"erdogant","ko_fi":"erdogant","custom":["https://erdogant.github.io/clustimage/pages/html/Documentation.html"],"patreon":null,"open_collective":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"lfx_crowdfunding":null}},"created_at":"2021-11-02T11:45:31.000Z","updated_at":"2025-08-08T01:45:43.000Z","dependencies_parsed_at":"2023-02-17T08:31:16.529Z","dependency_job_id":"cf555ca8-c599-46d7-ab87-a436c58cb551","html_url":"https://github.com/erdogant/clustimage","commit_stats":{"total_commits":393,"total_committers":2,"mean_com
mits":196.5,"dds":0.07633587786259544,"last_synced_commit":"59558109a81373bb9765c181a64a4492c62e349c"},"previous_names":[],"tags_count":76,"template":false,"template_full_name":null,"purl":"pkg:github/erdogant/clustimage","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fclustimage","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fclustimage/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fclustimage/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fclustimage/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/erdogant","download_url":"https://codeload.github.com/erdogant/clustimage/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/erdogant%2Fclustimage/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274117107,"owners_count":25225098,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-07T02:00:09.463Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering","image-analysis","image-processing","python3"],"created_at":"2024-10-01T19:30:46.205Z","updated_at":"2025-09-08T00:32:04.273Z","avatar_url":"https://github.com/erdogant.png","language":"Jupyter 
Notebook","funding_links":["https://github.com/sponsors/erdogant","https://buymeacoffee.com/erdogant","https://ko-fi.com/erdogant","https://erdogant.github.io/clustimage/pages/html/Documentation.html","https://www.buymeacoffee.com/erdogant)--"],"categories":[],"sub_categories":[],"readme":"# clustimage\n\n[![Python](https://img.shields.io/pypi/pyversions/clustimage)](https://img.shields.io/pypi/pyversions/clustimage)\n[![Pypi](https://img.shields.io/pypi/v/clustimage)](https://pypi.org/project/clustimage/)\n[![Docs](https://img.shields.io/badge/Sphinx-Docs-Green)](https://erdogant.github.io/clustimage/)\n[![LOC](https://sloc.xyz/github/erdogant/clustimage/?category=code)](https://github.com/erdogant/clustimage/)\n[![Downloads](https://static.pepy.tech/personalized-badge/clustimage?period=month\u0026units=international_system\u0026left_color=grey\u0026right_color=brightgreen\u0026left_text=PyPI%20downloads/month)](https://pepy.tech/project/clustimage)\n[![Downloads](https://static.pepy.tech/personalized-badge/clustimage?period=total\u0026units=international_system\u0026left_color=grey\u0026right_color=brightgreen\u0026left_text=Downloads)](https://pepy.tech/project/clustimage)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/erdogant/clustimage/blob/master/LICENSE)\n[![Forks](https://img.shields.io/github/forks/erdogant/clustimage.svg)](https://github.com/erdogant/clustimage/network)\n[![Issues](https://img.shields.io/github/issues/erdogant/clustimage.svg)](https://github.com/erdogant/clustimage/issues)\n[![Project 
Status](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)\n[![DOI](https://zenodo.org/badge/423822054.svg)](https://zenodo.org/badge/latestdoi/423822054)\n[![Medium](https://img.shields.io/badge/Medium-Blog-blue)](https://erdogant.github.io/clustimage/pages/html/Documentation.html#)\n[![Colab](https://colab.research.google.com/assets/colab-badge.svg?logo=github%20sponsors)](https://erdogant.github.io/clustimage/pages/html/Documentation.html#colab-notebook)\n[![Donate](https://img.shields.io/badge/Support%20this%20project-grey.svg?logo=github%20sponsors)](https://erdogant.github.io/clustimage/pages/html/Documentation.html#)\n\u003c!---[![BuyMeCoffee](https://img.shields.io/badge/buymea-coffee-yellow.svg)](https://www.buymeacoffee.com/erdogant)--\u003e\n\u003c!---[![Coffee](https://img.shields.io/badge/coffee-black-grey.svg)](https://erdogant.github.io/donate/?currency=USD\u0026amount=5)--\u003e\n\n\nThe aim of ``clustimage`` is to detect natural groups or clusters of images. It works using a multi-step process of carefully pre-processing the images, extracting the features, and evaluating the optimal number of clusters across the feature space.\nThe optimal number of clusters can be determined using well-known methods such as *silhouette, dbindex, and derivatives* in combination with clustering methods, such as *agglomerative, kmeans, dbscan and hdbscan*.\nWith ``clustimage`` we aim to determine the most robust clustering by efficiently searching across the parameters and evaluating the clusters.\nBesides clustering of images, the ``clustimage`` model can also be used to find the most similar images for a new unseen sample.\n\nA schematic overview is as follows:\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/schematic_overview.png\" width=\"1000\" /\u003e\n\u003c/p\u003e\n\n``clustimage`` overcomes the following challenges: \n\n    * 1. 
Robustly groups similar images.\n    * 2. Returns the unique images.\n    * 3. Finds highly similar images for a given input image.\n    * 4. Clusters on datetime or latlon coordinates when using photos.\n\n``clustimage`` is fun because:\n\n    * It does not require a learning process.\n    * It can group any set of images.\n    * It can return only the unique() images.\n    * It can find highly similar images given an input image.\n    * It can map photos on an interactive map with thumbnails and clusterlabels so that you can easily structure your photos.\n    * It provides many plots to improve understanding of the feature-space and sample-sample relationships.\n    * It is built on core statistics, such as PCA, HOG, EXIF data and many more, and therefore it does not have a dependency block.\n    * It works out of the box.\n\n\n# \n**⭐️ Star this repo if you like it ⭐️**\n#\n\n### Blogs\n\n* Read the [blog](https://towardsdatascience.com/a-step-by-step-guide-for-clustering-images-4b45f9906128) to get a structured overview of how to cluster images.\n\n# \n\n### [Documentation pages](https://erdogant.github.io/clustimage/)\n\nOn the [documentation pages](https://erdogant.github.io/clustimage/) you can find detailed information about how ``clustimage`` works, with many examples. \n\n# \n\n\n### Installation\n\n##### It is advisable to create a new environment (e.g. with Conda). 
\n```bash\nconda create -n env_clustimage python=3.8\nconda activate env_clustimage\n```\n\n##### Install clustimage from PyPI\n```bash\npip install clustimage            # new install\npip install -U clustimage         # update to latest version\n```\n\n##### Install directly from the GitHub source\n```bash\npip install git+https://github.com/erdogant/clustimage\n```  \n\n##### Import clustimage package\n\n```python\nfrom clustimage import clustimage\n```\n\n\u003chr\u003e\n\n### Examples\n\nThe results obtained from the clustimage library are stored in a dictionary containing the following keys:\n\n    * img       : image vector of the preprocessed images\n    * feat      : Features extracted for the images\n    * xycoord   : X and Y coordinates from the embedding\n    * pathnames : Absolute path location to the image file\n    * filenames : File names of the image file\n    * labels    : Cluster labels\n\n\n### Examples Mnist dataset:\n\n##### [Example: Clustering mnist dataset](https://erdogant.github.io/clustimage/pages/html/Examples.html#)\n\nIn this example we will be using a flattened grayscale image array loaded from sklearn. 
The unique detected clusters are the following:\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#scatter-plot\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_fig2_tsne.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_fig21_tsne.png\" width=\"400\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n**Click on the scatterplot below to zoom in and see ALL the images in the scatterplot**\n\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/scatter_mnist_all.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n\n#\n\n##### [Example: Plot the explained variance](https://erdogant.github.io/clustimage/pages/html/Examples.html#cluster-evaluation)\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#cluster-evaluation\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_explained_var.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_clusters.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_fig1.png\" width=\"600\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n#\n\n##### [Example: Plot the unique images](https://erdogant.github.io/clustimage/pages/html/Examples.html#detect-unique-images)\n\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#detect-unique-images\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_unique.png\" width=\"300\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n#\n\n\n##### [Example: Plot the dendrogram](https://erdogant.github.io/clustimage/pages/html/Examples.html#dendrogram)\n\n\u003cp align=\"left\"\u003e\n  
\u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#dendrogram\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/digits_dendrogram.png\" width=\"400\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n\n\u003chr\u003e \n\n\n### Examples Flower dataset:\n\n##### [Example: cluster the flower dataset](https://erdogant.github.io/clustimage/pages/html/Examples.html#id5)\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#id5\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_sil_vs_nrclusters.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_silhouette.png\" width=\"400\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n##### [Example: Make scatterplot with clusterlabels](https://erdogant.github.io/clustimage/pages/html/Examples.html#id7)\n\n\u003cp align=\"left\"\u003e\n  \u003ca href=\"https://erdogant.github.io/clustimage/pages/html/Examples.html#id7\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_scatter.png\" width=\"300\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_scatter_imgs_mean.png\" width=\"300\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_scatter_imgs.png\" width=\"300\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_predict_scatter_all.png\" width=\"300\" /\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\n##### [Example: Plot the unique images per cluster](https://erdogant.github.io/clustimage/pages/html/Examples.html#id6)\n\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_unique.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n\u003cp align=\"left\"\u003e\n  \u003cimg 
src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_unique_mean.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n\n##### [Example: Plot the images in a particular cluster](https://erdogant.github.io/clustimage/pages/html/Examples.html#id8)\n\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_cluster3.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n\n\n\n##### [Example: Make prediction for unseen input image](https://erdogant.github.io/clustimage/pages/html/Examples.html#predict-unseen-sample)\n\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_predict_1.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_predict_2.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\u003cp align=\"left\"\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/flowers_predict_scatter.png\" width=\"600\" /\u003e\n\u003c/p\u003e\n\n\n\u003chr\u003e \n\n\n#### [Example: Clustering of faces on images](https://erdogant.github.io/clustimage/pages/html/Examples.html#clustering-of-faces)\n\n\n\u003cp align=\"center\"\u003e\n\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_sil_vs_nrclusters.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_set_max_clust.png\" width=\"400\" /\u003e\n\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_unique.png\" width=\"400\" /\u003e\n\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_scatter_no_img.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_scatter.png\" width=\"400\" /\u003e\n\n  \u003cimg 
src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_cluster0.png\" width=\"400\" /\u003e\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces_cluster3.png\" width=\"400\" /\u003e\n\n  \u003cimg src=\"https://github.com/erdogant/clustimage/blob/main/docs/figs/faces1.png\" width=\"400\" /\u003e\n\u003c/p\u003e\n\n\u003chr\u003e\n\n#### [Example: Break up the steps](https://erdogant.github.io/clustimage/pages/html/Examples.html#breaking-up-the-steps)\n\n\u003chr\u003e\n\n#### [Example: Extract images belonging to clusters](https://erdogant.github.io/clustimage/pages/html/Examples.html#extract-images-belonging-to-clusters)\n\n\u003chr\u003e\n\n\n### Support\n\n\tThis project needs some love! ❤️ You can help in various ways.\n\n\t* Become a Sponsor!\n\t* Star this repo on the GitHub page.\n\t* Other contributions can be in the form of feature requests, idea discussions, reporting bugs, opening pull requests.\n\t* Read more about why becoming a sponsor is important on the Sponsor GitHub page.\n\t\n\tCheers Mate.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferdogant%2Fclustimage","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ferdogant%2Fclustimage","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ferdogant%2Fclustimage/lists"}