{"id":13737019,"url":"https://github.com/rom1504/image_embeddings","last_synced_at":"2025-04-07T19:14:33.898Z","repository":{"id":44861499,"uuid":"269773439","full_name":"rom1504/image_embeddings","owner":"rom1504","description":"Using efficientnet to provide embeddings for retrieval","archived":false,"fork":false,"pushed_at":"2023-05-28T11:15:14.000Z","size":16944,"stargazers_count":157,"open_issues_count":18,"forks_count":31,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-31T18:18:46.343Z","etag":null,"topics":["computer-vision","efficientnet","embeddings","recommendation-system","representation-learning","retrieval"],"latest_commit_sha":null,"homepage":"https://rom1504.github.io/image_embeddings/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rom1504.png","metadata":{"files":{"readme":"README.md","changelog":"HISTORY.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2020-06-05T20:51:55.000Z","updated_at":"2025-03-17T23:43:26.000Z","dependencies_parsed_at":"2024-01-11T13:19:43.158Z","dependency_job_id":null,"html_url":"https://github.com/rom1504/image_embeddings","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rom1504%2Fimage_embeddings","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rom1504%2Fimage_embeddings/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rom1504%2Fimage_embeddings/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rom1504%2Fimage_embeddings/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rom1504","download_url":"https://codeload.github.com/rom1504/image_embeddings/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247713258,"owners_count":20983683,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","efficientnet","embeddings","recommendation-system","representation-learning","retrieval"],"created_at":"2024-08-03T03:01:33.658Z","updated_at":"2025-04-07T19:14:33.868Z","avatar_url":"https://github.com/rom1504.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook"],"sub_categories":[],"readme":"# image_embeddings\n[![pypi](https://img.shields.io/pypi/v/image_embeddings.svg)](https://pypi.python.org/pypi/image_embeddings)\n[![ci](https://github.com/rom1504/image_embeddings/workflows/Continuous%20integration/badge.svg)](https://github.com/rom1504/image_embeddings/actions?query=workflow%3A%22Continuous+integration%22)\n\n\nUsing efficientnet to provide embeddings for retrieval. Read the blog post at https://medium.com/@rom1504/image-embeddings-ed1b194d113e\n\nWhy this repo? Embeddings are a widely used technique that is well known in scientific circles, but they seem to be underused and not well known among most engineers. I want to show how easy it is to represent things as embeddings, and how many applications this unlocks. Check out the [demo](https://rom1504.github.io/image_embeddings/) first!\n\n![knn example](knn_example.png)\n\n## Workflow\n1. download some pictures\n2. 
run inference on them to get embeddings\n3. try a simple knn example to see the point: click on some pictures and see their nearest neighbors\n\n## Simple Install\n\nRun `pip install image_embeddings`\n\n## Example workflow\n\n1. run `image_embeddings save_examples_to_folder --images_count=1000 --output_folder=tf_flower_images`; this will retrieve 1000 image files from https://www.tensorflow.org/datasets/catalog/tf_flowers (but you can also pick any other dataset)\n2. produce tf records with `image_embeddings write_tfrecord --image_folder=tf_flower_images --output_folder=tf_flower_tf_records --shards=10`\n3. run the inference with `image_embeddings run_inference --tfrecords_folder=tf_flower_tf_records --output_folder=tf_flower_embeddings`\n4. run a random knn search on them with `image_embeddings random_search --path=tf_flower_embeddings`\n\nOptionally, if you want to use the embeddings in numpy (or in other languages), run `image_embeddings embeddings_to_numpy --input_path=tf_flower_embeddings --output_path=tf_flower_numpy`. In particular, this can be used in the [web demo](https://github.com/rom1504/image_embeddings/tree/web)\n\n```\n$ image_embeddings random_search --path=tf_flower_embeddings\nimage_roses_261\n160.83 image_roses_261\n114.36 image_roses_118\n102.77 image_roses_537\n92.95 image_roses_659\n88.49 image_roses_197\n```\n\nExplore the [Simple notebook](notebooks/using_the_lib.ipynb) for more details.\n\nYou can try it locally or [try it in colab](https://colab.research.google.com/github/rom1504/image_embeddings/blob/master/notebooks/using_the_lib.ipynb)\n\nThe [From scratch](notebooks/from_scratch.ipynb) notebook explains how to build this from scratch.\n\n## API\n\n### image_embeddings.downloader\n\nDownloader from tensorflow datasets. 
Any other set of images could be used instead.\n\n#### image_embeddings.downloader.save_examples_to_folder(output_folder, images_count=1000, dataset=\"tf_flowers\")\n\nSave images from https://www.tensorflow.org/datasets/catalog/tf_flowers to a folder.\nAlso works with other tf datasets.\n\n### image_embeddings.inference\n\nCreate tf records from image files and run inference with an efficientnet model. Other models could be used.\n\n#### image_embeddings.inference.write_tfrecord(image_folder, output_folder, num_shards=100)\n\nWrite tf records from an image folder\n\n#### image_embeddings.inference.run_inference(tfrecords_folder, output_folder, batch_size=1000)\n\nRun inference on the provided tf records and save the embeddings to a folder\n\n### image_embeddings.knn\n\nConvenience methods to read embeddings, build indices and apply search on them. These methods are provided as examples.\nUse [faiss](https://github.com/facebookresearch/faiss) directly for bigger datasets.\n\n#### image_embeddings.knn.read_embeddings(path)\n\nRead embeddings from path and return a tuple with\n* embeddings as a numpy matrix\n* an id to name dictionary\n* a name to id dictionary\n\n#### image_embeddings.knn.build_index(emb)\n\nBuild a simple faiss inner product index using the provided matrix of embeddings\n\n#### image_embeddings.knn.search(index, id_to_name, emb, k=5)\n\nSearch the index with the query embeddings and return an array of (distance, name) pairs\n\n#### image_embeddings.knn.display_picture(image_path, image_name)\n\nDisplay one picture from the given path and image name in jupyter\n\n#### image_embeddings.knn.display_results(image_path, results)\n\nDisplay the results from the search method\n\n#### image_embeddings.knn.random_search(path)\n\nLoad the embeddings, apply a random search on them and display the result\n\n#### image_embeddings.knn.embeddings_to_numpy(input_path, output_folder)\n\nLoad the embeddings from the input folder as parquet and save them as\n* json for the id -\u003e name mapping\n* numpy for the 
embeddings\n\nParticularly useful to read the embeddings from other languages\n\n## Advanced Installation\n\n### Prerequisites\n\nMake sure you use `python\u003e=3.6` and an up-to-date version of `pip` and\n`setuptools`\n\n    python --version\n    pip install -U pip setuptools\n\nIt is recommended to install `image_embeddings` in a new virtual environment. For\nexample\n\n    python3 -m venv image_embeddings_env\n    source image_embeddings_env/bin/activate\n    pip install -U pip setuptools\n    pip install image_embeddings\n\n### Using Pip\n\n    pip install image_embeddings\n\n### From Source\n\nFirst, clone the `image_embeddings` repo on your local machine with\n\n    git clone https://github.com/rom1504/image_embeddings.git\n    cd image_embeddings\n    make install\n\nTo install development tools and test requirements, run\n\n    make install-dev\n\n## Test\n\nTo run unit tests in your current environment, run\n\n    make test\n\nTo run lint + unit tests in a fresh virtual environment,\nrun\n\n    make venv-lint-test\n\n## Lint\n\nTo run `black --check`:\n\n    make lint\n\nTo auto-format the code using `black`\n\n    make black\n\n## Tasks\n\n* [x] simple downloader in python\n* [x] simple inference in python using https://github.com/qubvel/efficientnet\n* [x] build python basic knn example using https://github.com/facebookresearch/faiss\n* [x] build basic ui using lit element and some brute force knn to show what it does, put in github pages\n* [x] use to illustrate embeddings blogpost\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2From1504%2Fimage_embeddings","html_url":"https://awesome.ecosyste.ms/projects/github.com%2From1504%2Fimage_embeddings","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2From1504%2Fimage_embeddings/lists"}