{"id":13737912,"url":"https://github.com/minimaxir/imgbeddings","last_synced_at":"2026-04-04T06:41:01.440Z","repository":{"id":51027748,"uuid":"470832353","full_name":"minimaxir/imgbeddings","owner":"minimaxir","description":"Python package to generate image embeddings with CLIP without PyTorch/TensorFlow","archived":false,"fork":false,"pushed_at":"2022-03-28T00:59:31.000Z","size":1619,"stargazers_count":147,"open_issues_count":6,"forks_count":14,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-02T12:35:59.253Z","etag":null,"topics":["ai","clip","embeddings","image-processing","images","onnx","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/minimaxir.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":"minimaxir","patreon":"minimaxir","open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"otechie":null,"custom":null}},"created_at":"2022-03-17T03:15:50.000Z","updated_at":"2025-02-13T20:20:55.000Z","dependencies_parsed_at":"2022-09-16T03:24:09.248Z","dependency_job_id":null,"html_url":"https://github.com/minimaxir/imgbeddings","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minimaxir%2Fimgbeddings","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minimaxir%2Fimgbeddings/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/minimaxir%2Fimgbeddings/releases","manifests_url":"https://rep
os.ecosyste.ms/api/v1/hosts/GitHub/repositories/minimaxir%2Fimgbeddings/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/minimaxir","download_url":"https://codeload.github.com/minimaxir/imgbeddings/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248075695,"owners_count":21043634,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","clip","embeddings","image-processing","images","onnx","transformers"],"created_at":"2024-08-03T03:02:05.566Z","updated_at":"2026-04-04T06:41:01.412Z","avatar_url":"https://github.com/minimaxir.png","language":"Python","funding_links":["https://github.com/sponsors/minimaxir","https://patreon.com/minimaxir","https://www.patreon.com/minimaxir"],"categories":["Python"],"sub_categories":[],"readme":"# imgbeddings\n\nA Python package to generate embedding vectors from images, using [OpenAI](https://openai.com)'s robust [CLIP model](https://github.com/openai/CLIP) via [Hugging Face](https://huggingface.co) [transformers](https://huggingface.co/docs/transformers/index). These image embeddings, derived from an image model that has seen the entire internet up to mid-2020, can be used for many things: unsupervised clustering (e.g. via [umap](https://umap-learn.readthedocs.io/en/latest/)), embeddings search (e.g. 
via [faiss](https://github.com/facebookresearch/faiss)), and downstream use in other framework-agnostic ML/AI tasks such as building a classifier or calculating image similarity.\n\n- The [embeddings generation models](https://huggingface.co/minimaxir/imgbeddings) are ONNX INT8-quantized, meaning they're 20-30% faster on the CPU, much smaller on disk, and don't require PyTorch or TensorFlow as a dependency!\n- Works for many different image domains thanks to CLIP's zero-shot performance.\n- Includes utilities for using [principal component analysis (PCA)](https://en.wikipedia.org/wiki/Principal_component_analysis) to reduce the dimensionality of generated embeddings without losing much info.\n\n## Real-World Demo Notebooks\n\n![](docs/umap.png)\n\nYou can read how to use imgbeddings for real-world use cases in these Jupyter Notebooks:\n\n- [Cats vs. Dogs](examples/cats_dogs.ipynb): image clustering and building a cat/dog classifier\n- [Pokémon](examples/pokemon.ipynb): most-similar image search\n- [Image Augmentation](examples/augmentation.ipynb): generated embedding resilience to altered inputs\n\n## Installation\n\nimgbeddings can be installed from PyPI:\n\n```sh\npip3 install imgbeddings\n```\n\n## Quick Example\n\nLet's say you want to generate an image embedding for a [cute cat photo](http://images.cocodataset.org/val2017/000000039769.jpg). First you can download the photo:\n\n```py3\nimport requests\nfrom PIL import Image\nurl = \"http://images.cocodataset.org/val2017/000000039769.jpg\"\nimage = Image.open(requests.get(url, stream=True).raw)\n```\n\nThen, you can load imgbeddings. 
By default, imgbeddings will load an 88MB model based on the patch32 variant of CLIP, which separates each image into 49 32x32 patches.\n\n```py3\nfrom imgbeddings import imgbeddings\nibed = imgbeddings()\n```\n\nYou can also load the patch16 model by passing `patch_size = 16` to `imgbeddings()` (more granular embeddings but takes about 3x longer to run), or the \"large\" patch14 model with `patch_size = 14` (3.5x model size, 3x longer than patch16).\n\nThen to generate embeddings, all you have to do is pass the image to `to_embeddings()`!\n\n```py3\nembedding = ibed.to_embeddings(image)\nembedding[0][0:5] # array([ 0.914541, 0.45988417, 0.0350069 , -0.9054574 , 0.08941309], dtype=float32)\n```\n\nThis returns a 768D [numpy](https://numpy.org) vector for each input, which can be used for pretty much anything in the ML/AI world. You can also pass a list of filenames and/or [PIL](https://pillow.readthedocs.io/en/stable/index.html) Images for batch embeddings generation.\n\nSee the Demo Notebooks above for more advanced parameters and real-world use cases. More formal documentation will be added soon.\n\n## Ethics\n\nThe [official paper for CLIP](https://openai.com/blog/clip/) explicitly notes that there are inherent biases in the finished model, and that CLIP shouldn't be used in production applications as a result. My perspective is that having better tools free-and-open-source to _detect_ such issues and make them more transparent is an overall good for the future of AI, especially since there are less-public ways to create image embeddings that aren't as accessible. 
At the least, this package doesn't do anything that wasn't already available when CLIP was open-sourced in January 2021.\n\nIf you do use imgbeddings for your own project, I recommend doing a strong QA pass across a diverse set of inputs for your application, which is something you should always be doing whenever you work with machine learning, biased models or not.\n\nimgbeddings is not responsible for malicious misuse of image embeddings.\n\n## Design Notes\n\n- Note that CLIP was trained on square images only, and imgbeddings will pad and resize rectangular images into a square (imgbeddings deliberately does not center crop). As a result, images that are too wide/tall (e.g. more than a 3:1 ratio of largest dimension to smallest) will not generate robust embeddings.\n- This package intentionally works only with image data, as opposed to leveraging CLIP's ability to link image and text. For downstream tasks, using your own text in conjunction with an image will likely give better results (e.g. if training a model on image embeddings + text embeddings, feed both and let the model determine the relative importance of each for your use case).\n\nFor more miscellaneous design notes, see [DESIGN.md](DESIGN.md).\n\n## Maintainer/Creator\n\nMax Woolf ([@minimaxir](https://minimaxir.com))\n\n_Max's open-source projects are supported by his [Patreon](https://www.patreon.com/minimaxir) and [GitHub Sponsors](https://github.com/sponsors/minimaxir). 
If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use._\n\n## See Also\n\n- [Sentence Transformers](https://sbert.net/index.html), which has a [wrapper around CLIP](https://sbert.net/examples/applications/image-search/README.html) that supports Image-to-Image search.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminimaxir%2Fimgbeddings","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fminimaxir%2Fimgbeddings","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fminimaxir%2Fimgbeddings/lists"}