{"id":13688990,"url":"https://github.com/cvxgrp/pymde","last_synced_at":"2025-05-16T01:07:37.864Z","repository":{"id":38410731,"uuid":"316872250","full_name":"cvxgrp/pymde","owner":"cvxgrp","description":"Minimum-distortion embedding with PyTorch","archived":false,"fork":false,"pushed_at":"2025-04-28T06:32:19.000Z","size":49126,"stargazers_count":546,"open_issues_count":29,"forks_count":26,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-04-28T06:33:01.815Z","etag":null,"topics":["cuda","dimensionality-reduction","embedding","feature-vectors","gpu","graph-embedding","machine-learning","pytorch","visualization"],"latest_commit_sha":null,"homepage":"https://pymde.org","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cvxgrp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-11-29T04:01:10.000Z","updated_at":"2025-04-18T11:15:31.000Z","dependencies_parsed_at":"2024-06-18T16:58:36.595Z","dependency_job_id":null,"html_url":"https://github.com/cvxgrp/pymde","commit_stats":{"total_commits":155,"total_committers":11,"mean_commits":"14.090909090909092","dds":"0.34838709677419355","last_synced_commit":"073251ed8f299de1f6af5fd3e6e73affad9009dd"},"previous_names":[],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cvxgrp%2Fpymde","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cvxgrp%2Fpymde/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cvxgrp%2Fpymde/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cvxgrp%2Fpymde/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cvxgrp","download_url":"https://codeload.github.com/cvxgrp/pymde/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254448579,"owners_count":22072764,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","dimensionality-reduction","embedding","feature-vectors","gpu","graph-embedding","machine-learning","pytorch","visualization"],"created_at":"2024-08-02T15:01:29.680Z","updated_at":"2025-05-16T01:07:32.853Z","avatar_url":"https://github.com/cvxgrp.png","language":"Python","readme":"# PyMDE\n![](https://github.com/cvxgrp/pymde/workflows/Test/badge.svg) ![](https://github.com/cvxgrp/pymde/workflows/Deploy/badge.svg) [![PyPI version](https://badge.fury.io/py/pymde.svg)](https://pypi.org/project/pymde/) [![Conda Version](https://img.shields.io/conda/vn/conda-forge/pymde.svg)](https://anaconda.org/conda-forge/pymde)\n\n*The official documentation for PyMDE is available at www.pymde.org.*\n\nThis repository accompanies the monograph [*Minimum-Distortion Embedding*](https://web.stanford.edu/~boyd/papers/min_dist_emb.html).\n\nPyMDE is a Python library for computing vector embeddings for finite sets of\nitems, such as images, biological cells, nodes in a network, or any other\nabstract object.\n\nWhat sets PyMDE apart from other embedding libraries is that it provides a\nsimple but general framework for embedding, called 
_Minimum-Distortion\nEmbedding_ (MDE). With MDE, it is easy to recreate well-known embeddings and to\ncreate new ones, tailored to your particular application.\n\nPyMDE is competitive\nin runtime with more specialized embedding methods. With a GPU, it can be\neven faster.\n\n## Overview\nPyMDE can be enjoyed by beginners and experts alike. It can be used to:\n\n* visualize datasets, small or large;\n* generate feature vectors for supervised learning;\n* compress high-dimensional vector data;\n* draw graphs (in up to orders of magnitude less time than packages like NetworkX);\n* create custom embeddings, with custom objective functions and constraints (such as having uncorrelated feature columns);\n* and more.\n\nPyMDE is very young software, under active development. If you run into issues,\nor have any feedback, please reach out by [filing a GitHub\nissue](https://github.com/cvxgrp/pymde/issues).\n\nThis README gives a very brief overview of PyMDE. Make sure to read the\nofficial documentation at www.pymde.org, which has in-depth tutorials\nand API documentation.\n\n- [Installation](#installation)\n- [Getting started](#getting-started)\n- [Example notebooks](#example-notebooks)\n- [Citing](#citing)\n\n## Installation\nPyMDE is available on the Python Package Index, and on Conda Forge.\n\nTo install with pip, use\n\n```\npip install pymde\n```\n\nAlternatively, to install with conda, use\n\n```\nconda install -c pytorch -c conda-forge pymde\n```\n\nPyMDE has the following requirements:\n\n* Python \u003e= 3.7\n* numpy \u003e= 1.17.5\n* scipy\n* torch \u003e= 1.7.1\n* torchvision \u003e= 0.8.2\n* pynndescent\n* requests\n\n## Getting started\nGetting started with PyMDE is easy. 
For embeddings that work out of the box, we provide two main functions:\n\n```python3\npymde.preserve_neighbors\n```\n\nwhich preserves the local structure of the original data, and\n\n```python3\npymde.preserve_distances\n```\n\nwhich preserves pairwise distances or dissimilarity scores in the original\ndata.\n\n**Arguments.** The input to these functions is the original data, represented\neither as a data matrix in which each row is a feature vector, or as a\n(possibly sparse) graph encoding pairwise distances. The embedding dimension is\nspecified by the `embedding_dim` keyword argument, which is `2` by default.\n\n**Return value.** The return value is an `MDE` object. Calling the `embed()`\nmethod on this object returns an embedding, which is a matrix\n(`torch.Tensor`) in which each row is an embedding vector. For example, if the\noriginal input is a data matrix of shape `(n_items, n_features)`, then the\nembedding matrix has shape `(n_items, embedding_dim)`.\n\nWe give examples of using these functions below.\n\n### Preserving neighbors\nThe following code produces an embedding of the MNIST dataset (images of\nhandwritten digits), in a fashion similar to LargeVis, t-SNE, UMAP, and other\nneighborhood-based embeddings. The original data is a matrix of shape `(70000,\n784)`, with each row representing an image.\n\n```python3\nimport pymde\n\nmnist = pymde.datasets.MNIST()\nembedding = pymde.preserve_neighbors(mnist.data, verbose=True).embed()\npymde.plot(embedding, color_by=mnist.attributes['digits'])\n```\n\n![](https://github.com/cvxgrp/pymde/blob/main/images/mnist.png?raw=true)\n\nUnlike most other embedding methods, PyMDE can compute embeddings that satisfy\nconstraints. 
For example:\n\n```python3\nembedding = pymde.preserve_neighbors(mnist.data, constraint=pymde.Standardized(), verbose=True).embed()\npymde.plot(embedding, color_by=mnist.attributes['digits'])\n```\n\n![](https://github.com/cvxgrp/pymde/blob/main/images/mnist_std.png?raw=true)\n\nThe standardization constraint requires the embedding vectors to be centered\nand to have uncorrelated features.\n\n\n### Preserving distances\nThe function `pymde.preserve_distances` is useful when you're more interested\nin preserving the gross global structure than the local structure.\n\nHere's an example that produces an embedding of an academic coauthorship\nnetwork, from Google Scholar. The original data is a sparse graph on roughly\n40,000 authors, with an edge between authors who have collaborated on at least\none paper.\n\n```python3\nimport pymde\n\ngoogle_scholar = pymde.datasets.google_scholar()\nembedding = pymde.preserve_distances(google_scholar.data, verbose=True).embed()\npymde.plot(embedding, color_by=google_scholar.attributes['coauthors'], color_map='viridis', background_color='black')\n```\n\n![](https://github.com/cvxgrp/pymde/blob/main/images/scholar.jpg?raw=true)\n\nMore collaborative authors are colored brighter, and are near the center of the\nembedding.\n\n\n## Example notebooks\nWe have several [example notebooks](https://github.com/cvxgrp/pymde/tree/main/examples) that show how to use PyMDE on real (and synthetic) datasets.\n\n## Citing\nTo cite our work, please use the following BibTeX entry.\n\n```\n@article{agrawal2021minimum,\n  author  = {Agrawal, Akshay and Ali, Alnur and Boyd, Stephen},\n  title   = {Minimum-Distortion Embedding},\n  journal = {arXiv},\n  year    = {2021},\n}\n```\n\nPyMDE was designed and developed by [Akshay 
Agrawal](https://www.akshayagrawal.com/).\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcvxgrp%2Fpymde","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcvxgrp%2Fpymde","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcvxgrp%2Fpymde/lists"}