{"id":15646262,"url":"https://github.com/sayakpaul/learnable-image-resizing","last_synced_at":"2025-06-17T10:43:49.872Z","repository":{"id":106648952,"uuid":"363047929","full_name":"sayakpaul/Learnable-Image-Resizing","owner":"sayakpaul","description":"TF 2 implementation Learning to Resize Images for Computer Vision Tasks (https://arxiv.org/abs/2103.09950v1).","archived":false,"fork":false,"pushed_at":"2021-10-12T01:44:12.000Z","size":3364,"stargazers_count":52,"open_issues_count":0,"forks_count":8,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-09T21:46:30.517Z","etag":null,"topics":["image-recognition","keras","learnable-resizing","tensorflow","vision"],"latest_commit_sha":null,"homepage":"https://keras.io/examples/vision/learnable_resizer/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sayakpaul.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-30T06:29:05.000Z","updated_at":"2024-11-15T03:27:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"a5c57e40-544c-43c4-8a07-601401a80a2b","html_url":"https://github.com/sayakpaul/Learnable-Image-Resizing","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayakpaul%2FLearnable-Image-Resizing","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayakpaul%2FLearnable-Image-Resizing/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayakpaul%2
FLearnable-Image-Resizing/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sayakpaul%2FLearnable-Image-Resizing/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sayakpaul","download_url":"https://codeload.github.com/sayakpaul/Learnable-Image-Resizing/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":233687590,"owners_count":18714346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["image-recognition","keras","learnable-resizing","tensorflow","vision"],"created_at":"2024-10-03T12:12:07.007Z","updated_at":"2025-01-13T03:27:11.835Z","avatar_url":"https://github.com/sayakpaul.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Learnable-Image-Resizing\nTensorFlow 2 implementation of [Learning to Resize Images for Computer Vision Tasks](https://arxiv.org/abs/2103.09950v1) by Talebi et al.\n\nAccompanying blog post on keras.io: [Learning to Resize in Computer Vision](https://keras.io/examples/vision/learnable_resizer/).\n\nThe paper proposes a simple framework for learning optimal representations for a given network architecture and a given image resolution (such as 224x224). The authors find that representations that are more coherent with the human perception system _may not always_ improve the performance of vision models. Instead, optimizing the representations to better suit the models can substantially improve their performance. 
\n\nThe diagram presents the proposed learnable resizer module (source: original paper):\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"https://i.ibb.co/gJYtSs0/image.png\" width=\"750\"\u003e\u003c/img\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\nHere's what the resized images look like after being passed through a learned resizer:\n\n\u003cdiv align=\"center\"\u003e\n\n![](figures/visualization.png)\n\n\u003c/div\u003e\n\nOn the left-hand side, we see the outputs of an untrained learnable resizer. On the right, the outputs are from the same learnable resizer after **10 epochs of training**. The images may not make sense to our eyes in terms of perceptual quality, but they help improve the recognition performance of vision models.\n\n## About the notebooks\n* `Standard_Training.ipynb`: Shows how to train a DenseNet-121 on the Cats and Dogs dataset with bilinear resizing (150 x 150).\n* `Learnable_Resizer.ipynb`: Shows how to train the same network with the learnable resizing module included. Here, the inputs are first resized to 300 x 300, and the learnable resizer module then learns optimal representations at 150 x 150. \n\nBoth notebooks use mixed-precision training along with distributed training. 
\n\n## Results\n|           Model           \t| Number of  parameters (Million) \t| Top-1 accuracy \t|\n|:-------------------------:\t|:-------------------------------:\t|:--------------:\t|\n|   With learnable resizer  \t|             7.051717            \t|      67.67%     \t|\n| Without learnable resizer \t|             7.039554            \t|      60.19%      \t|\n\nBoth the models were trained for only 10 epochs from the same initial checkpoint.\n\nYou can reproduce these results with the model weights provided [here](https://github.com/sayakpaul/Learnable-Image-Resizing/releases/tag/v1.0.0).\n\n## Paper citation\n\n```\n@InProceedings{Talebi_2021_ICCV,\n    author    = {Talebi, Hossein and Milanfar, Peyman},\n    title     = {Learning To Resize Images for Computer Vision Tasks},\n    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},\n    month     = {October},\n    year      = {2021},\n    pages     = {497-506}\n}\n```\n\n## Acknowledgements\n* [ML-GDE program](https://developers.google.com/programs/experts/) for providing GCP credit support. \n* Mark Doust (of Google) for feedback. \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayakpaul%2Flearnable-image-resizing","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsayakpaul%2Flearnable-image-resizing","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsayakpaul%2Flearnable-image-resizing/lists"}