{"id":21054537,"url":"https://github.com/johnowhitaker/imstack","last_synced_at":"2025-05-15T22:34:19.696Z","repository":{"id":53120751,"uuid":"474269231","full_name":"johnowhitaker/imstack","owner":"johnowhitaker","description":"Optimizable stack of images at different resolutions, a useful representation of images for deep learning tasks. Docs: https://johnowhitaker.github.io/imstack/","archived":false,"fork":false,"pushed_at":"2022-09-08T02:17:42.000Z","size":21200,"stargazers_count":11,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-18T12:35:10.569Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/johnowhitaker.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-26T07:01:09.000Z","updated_at":"2023-06-13T17:21:47.000Z","dependencies_parsed_at":"2022-08-29T23:40:30.966Z","dependency_job_id":null,"html_url":"https://github.com/johnowhitaker/imstack","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":"fastai/nbdev_template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnowhitaker%2Fimstack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnowhitaker%2Fimstack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnowhitaker%2Fimstack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/johnowhitaker%2Fimstack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/j
ohnowhitaker","download_url":"https://codeload.github.com/johnowhitaker/imstack/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254433551,"owners_count":22070492,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T16:14:41.453Z","updated_at":"2025-05-15T22:34:14.683Z","avatar_url":"https://github.com/johnowhitaker.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"ImStack\n================\n\n\u003c!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! --\u003e\n\nOptimizing the pixel values of an image to minimize some loss is common\nin some applications like style transfer. But because a change to any\none pixel doesn’t affect much of the image, results are often noisy and\nslow. By representing an image as a stack of layers at different\nresolutions, we get parameters that affect a large part of the image\n(low-res layers) as well as some that can encode fine detail (the\nhigh-res layers). 
There are better ways to do this, but I found myself\nusing this approach enough that I decided to turn it into a proper\nlibrary.\n\nHere’s a [Colab\nnotebook](https://colab.research.google.com/drive/10gSIlqRGom18kl8NZSytyWYciej8H46N?usp=sharing)\nshowing this in action, generating images to match a CLIP prompt.\n\n## Install\n\nThis package is available on PyPI, so installation should be as easy as:\n\n`pip install imstack`\n\n## How to use\n\nWe create a new image stack like so:\n\n``` python\nims = ImStack(n_layers=3)\n```\n\nBy default, the first layer is 32x32 pixels and each subsequent layer is\n2x larger. We can visualize the layers with:\n\n``` python\nims.plot_layers()\n```\n\n![](index_files/figure-gfm/cell-3-output-1.png)\n\nThe parameters (pixels) of the layers are set to `requires_grad=True`, so\nyou can pass the layers to an optimizer with something like\n`optimizer = optim.Adam(ims.layers, lr=0.1, weight_decay=1e-4)` to\nmodify them based on some loss. Calling the forward pass\n(`image = ims()`) returns a tensor representation of the combined image,\nsuitable for various PyTorch operations.\n\nFor convenience, you can also get a PIL Image for easy viewing with:\n\n``` python\nims.to_pil()\n```\n\n![](index_files/figure-gfm/cell-4-output-1.png)\n\n### Loading images into an ImStack\n\nYou don’t need to start from scratch: pass in a PIL image or a filename\nand the ImStack will be initialized such that the layers combine to\nre-create the input image as closely as possible.\n\n``` python\nfrom PIL import Image\n\n# Load the input image\ninput_image = Image.open('demo_image.png')\ninput_image\n```\n\n![](index_files/figure-gfm/cell-5-output-1.png)\n\nNote how the lower layers capture broad shapes while the final layer is\nmostly fine detail.\n\n``` python\n# Create an image stack with init_image=input_image and plot the layers\nims_w_init = ImStack(n_layers=3, base_size=16, scale=4, out_size=256, 
init_image=input_image)\nims_w_init.plot_layers()\n```\n\n![](index_files/figure-gfm/cell-6-output-1.png)\n\n# Examples\n\n### Text-to-image with ImStack+CLIP\n\nVery fast text-to-image, using CLIP to calculate a loss that measures\nhow well the image matches a text prompt. In this example, the prompt\nwas ‘A watercolor painting of an underwater submarine’:\n\n``` python\nImage.open('clip_eg.png')\n```\n\n![](index_files/figure-gfm/cell-7-output-1.png)\n\n[colab\nlink](https://colab.research.google.com/drive/10gSIlqRGom18kl8NZSytyWYciej8H46N?usp=sharing)\n\n[and a CLOOB\nversion](https://colab.research.google.com/drive/1PAPb2PiGHxnPwF2JaYKFnE063vXJPRfu?usp=sharing)\n\n### Style Transfer\n\nSimple style transfer, with an ImStack being optimized such that content\nloss to one image and style loss to another are minimized.\n\n``` python\nImage.open('style_tf_eg.png')\n```\n\n![](index_files/figure-gfm/cell-8-output-1.png)\n\n[colab\nlink](https://colab.research.google.com/drive/1Zh3OxXE0OWqwzrAhvUBX2VtRBgz87ahQ?usp=sharing)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnowhitaker%2Fimstack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjohnowhitaker%2Fimstack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjohnowhitaker%2Fimstack/lists"}