{"id":20243583,"url":"https://github.com/pleonard212/pix-plot","last_synced_at":"2025-07-04T20:02:26.262Z","repository":{"id":37664199,"uuid":"99263018","full_name":"pleonard212/pix-plot","owner":"pleonard212","description":"A WebGL viewer for UMAP or TSNE-clustered images","archived":false,"fork":false,"pushed_at":"2023-04-15T22:15:45.000Z","size":7822,"stargazers_count":618,"open_issues_count":37,"forks_count":140,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-06-29T20:03:40.159Z","etag":null,"topics":["data-visualization","machine-vision","visual-culture","web-app","webgl"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pleonard212.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-08-03T18:25:19.000Z","updated_at":"2025-06-29T19:36:58.000Z","dependencies_parsed_at":"2022-08-08T21:15:34.734Z","dependency_job_id":"f3c39e55-7976-41a0-a1d0-3a4adccea505","html_url":"https://github.com/pleonard212/pix-plot","commit_stats":{"total_commits":658,"total_committers":14,"mean_commits":47.0,"dds":"0.10182370820668696","last_synced_commit":"48c8e6b3732ada3291ee2f6ebffce934fe4faee2"},"previous_names":["glevyhas/pix-plot","yaledhlab/pix-plot","pleonard212/pix-plot"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/pleonard212/pix-plot","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pleonard212%2Fpix-plot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pleonard212%2Fpix-plot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/pleonard212%2Fpix-plot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pleonard212%2Fpix-plot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pleonard212","download_url":"https://codeload.github.com/pleonard212/pix-plot/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pleonard212%2Fpix-plot/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262950293,"owners_count":23389638,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-visualization","machine-vision","visual-culture","web-app","webgl"],"created_at":"2024-11-14T09:01:54.112Z","updated_at":"2025-07-04T20:02:26.209Z","avatar_url":"https://github.com/pleonard212.png","language":"JavaScript","readme":"# PixPlot\n\nThis repository contains code that can be used to visualize tens of thousands of images in a two-dimensional projection within which similar images are clustered together. 
The image analysis uses TensorFlow's Inception bindings, and the visualization layer uses a custom WebGL viewer.\n\nSee the [change log](https://github.com/YaleDHLab/pix-plot/wiki/Change-Log) for recent updates.\n\n![App preview](./pixplot/web/assets/images/preview.png?raw=true)\n\n## Installation \u0026 Dependencies\n\nWe maintain several platform-specific [installation cookbooks](https://github.com/YaleDHLab/pix-plot/wiki) online.\n\nBroadly speaking, to install the Python dependencies, we recommend you [install Anaconda](https://www.anaconda.com/products/individual#Downloads) and then create a conda environment with a Python 3.7 runtime:\n\n```bash\nconda create --name=3.7 python=3.7\nsource activate 3.7\n```\n\nThen you can install the dependencies by running:\n\n```bash\npip install https://github.com/yaledhlab/pix-plot/archive/master.zip\n```\n\nThe website that PixPlot eventually creates requires a WebGL-enabled browser.\n\n## Quickstart\n\nIf you have a WebGL-enabled browser and a directory full of images to process, you can prepare the data for the viewer by installing the dependencies above, then running:\n\n```bash\npixplot --images \"path/to/images/*.jpg\"\n```\n\nTo see the results of this process, you can start a web server by running:\n\n```bash\n# for python 3.x\npython -m http.server 5000\n\n# for python 2.x\npython -m SimpleHTTPServer 5000\n```\n\nThe visualization will then be available at `http://localhost:5000/output`.\n\n## Sample Data\n\nTo acquire some sample data with which to build a plot, feel free to use some data prepared by Yale's DHLab:\n\n```bash\npip install image_datasets\n```\n\nThen in a Python script:\n\n```python\nimport image_datasets\nimage_datasets.oslomini.download()\n```\n\nThe `.download()` command will make a directory named `datasets` in your current working directory. 
That `datasets` directory will contain a subdirectory named `oslomini`, which contains a directory of images and another directory with a CSV file of image metadata. Using that data, we can next build a plot:\n\n```bash\npixplot --images \"datasets/oslomini/images/*\" --metadata \"datasets/oslomini/metadata/metadata.csv\"\n```\n\n## Creating Massive Plots\n\nIf you need to plot more than 100,000 images but don't have an expensive graphics card with which to visualize huge WebGL displays, you might want to specify a smaller `--cell_size` value when building your plot. The `--cell_size` argument controls how large each image is in the atlas files; smaller values require fewer textures to be rendered, which decreases the GPU RAM required to view a plot:\n\n```bash\npixplot --images \"path/to/images/*.jpg\" --cell_size 10\n```\n\n## Controlling UMAP Layout\n\nThe [UMAP algorithm](https://github.com/lmcinnes/umap) is particularly sensitive to three hyperparameters:\n\n```\n--min_dist: determines the minimum distance between points in the embedding\n--n_neighbors: determines the tradeoff between local and global clusters\n--metric: determines the distance metric to use when positioning points\n```\n\nUMAP's creator, Leland McInnes, has written up a [helpful overview of these hyperparameters](https://umap-learn.readthedocs.io/en/latest/parameters.html). To specify the value for one or more of these hyperparameters when building a plot, one may use the flags above, e.g.:\n\n```bash\npixplot --images \"path/to/images/*.jpg\" --n_neighbors 2\n```\n\n## Curating Automatic Hotspots\n\nIf installed and available, PixPlot uses [Hierarchical density-based spatial clustering of applications with noise](https://hdbscan.readthedocs.io/en/latest/index.html) (HDBSCAN), a refinement of the earlier [DBSCAN](https://en.wikipedia.org/wiki/DBSCAN) algorithm, to find hotspots in the visualization. 
You may be interested in consulting this [explanation of how HDBSCAN works](https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html).\n\nTip: If you are using HDBSCAN and find that PixPlot creates too few (or only one) automatic hotspots, try lowering `--min_cluster_size` from its default of 20. This often happens with smaller datasets (fewer than a few thousand images).\n\nIf HDBSCAN is not available, PixPlot will fall back to [scikit-learn](https://scikit-learn.org/)'s implementation of [KMeans](https://scikit-learn.org/stable/modules/clustering.html#k-means).\n\n## Adding Metadata\n\nIf you have metadata associated with each of your images, you can pass in that metadata when running the data processing script. Doing so will allow the PixPlot viewer to display the metadata associated with an image when a user clicks on that image.\n\nTo specify the metadata for your image collection, you can add `--metadata path/to/metadata.csv` to the command you use to call the processing script. For example, you might specify:\n\n```bash\npixplot --images \"path/to/images/*.jpg\" --metadata \"path/to/metadata.csv\"\n```\n\nMetadata should be in a comma-separated values (CSV) file, should contain one row for each input image, and should contain headers specifying the column order. Here is a sample metadata file:\n\n| filename | category  | tags    | description   | permalink   | year     |\n| -------- | --------- | ------- | ------------- | ----------- | -------- |\n| bees.jpg | yellow    | a\\|b\\|c | bees' knees   | https://... | 1776     |\n| cats.jpg | dangerous | b\\|c\\|d | cats' pajamas | https://... 
| 1972     |\n\nThe following column labels are accepted:\n\n| *Column*         | *Description*                                           |\n| ---------------- | ------------------------------------------------------- |\n| **filename**     | the filename of the image                               |\n| **category**     | a categorical label for the image                       |\n| **tags**         | a pipe-delimited list of categorical tags for the image |\n| **description**  | a plaintext description of the image's contents         |\n| **permalink**    | a link to the image hosted on another domain            |\n| **year**         | a year timestamp for the image (should be an integer)   |\n| **label**        | a categorical label used for supervised UMAP projection |\n| **lat**          | the latitudinal position of the image                   |\n| **lng**          | the longitudinal position of the image                  |\n\n## IIIF Images\n\nIf you would like to process images that are hosted on an IIIF server, you can specify a newline-delimited list of IIIF image manifests as the `--images` argument. 
For example, the following could be saved as `manifest.txt`:\n\n```\nhttps://manifests.britishart.yale.edu/manifest/40005\nhttps://manifests.britishart.yale.edu/manifest/40006\nhttps://manifests.britishart.yale.edu/manifest/40007\nhttps://manifests.britishart.yale.edu/manifest/40008\nhttps://manifests.britishart.yale.edu/manifest/40009\n```\n\nOne could then specify these images as input by running `pixplot --images manifest.txt --n_clusters 2`.\n\n## Demonstrations (Developed with PixPlot 2.0 codebase)\n\n| Link | Image Count | Collection Info | Browse Images | Download for PixPlot |\n| ---------- | -------- | --------------- | ------------ | ------------ |\n| [NewsPlot: 1910-1912](http://pixplot.yale.edu/v2/loc/) | 24,026 | [George Grantham Bain Collection](https://www.loc.gov/pictures/collection/ggbain/) | [News in the 1910s](https://www.flickr.com/photos/library_of_congress/albums/72157603624867509/with/2163445674/) | [Images](http://pixplot.yale.edu/datasets/bain/photos.tar), [Metadata](http://pixplot.yale.edu/datasets/bain/metadata.csv) |\n| [Bildefelt i Oslo](http://pixplot.yale.edu/v2/oslo/) | 31,097 | [oslobilder](http://oslobilder.no) | [Advanced search, 1860-1924](http://oslobilder.no/search?advanced_search=1\u0026query=\u0026place=\u0026from_year=1860\u0026to_year=1924\u0026id=\u0026name=\u0026title=\u0026owner_filter=\u0026producer=\u0026depicted_person=\u0026material=\u0026technique=\u0026event_desc=) | [Images](http://pixplot.yale.edu/datasets/oslo/photos.tar), [Metadata](http://pixplot.yale.edu/datasets/oslo/metadata.csv) |\n\n## Acknowledgements\n\nThe DHLab would like to thank [Cyril Diagne](http://cyrildiagne.com/) and [Nicolas Barradeau](http://barradeau.com), lead developers of the spectacular [Google Arts Experiments TSNE viewer](https://artsexperiments.withgoogle.com/tsnemap/), for generously sharing ideas on optimization techniques used in this viewer, and [Lillianna Marie](https://github.com/lilliannamarie) for naming this viewer 
PixPlot.\n","funding_links":[],"categories":["JavaScript"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpleonard212%2Fpix-plot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpleonard212%2Fpix-plot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpleonard212%2Fpix-plot/lists"}