{"id":15028254,"url":"https://github.com/janspiry/image-super-resolution-via-iterative-refinement","last_synced_at":"2025-05-14T21:06:16.308Z","repository":{"id":37341775,"uuid":"390327676","full_name":"Janspiry/Image-Super-Resolution-via-Iterative-Refinement","owner":"Janspiry","description":"Unofficial implementation of Image Super-Resolution via Iterative Refinement by Pytorch","archived":false,"fork":false,"pushed_at":"2023-11-04T00:38:05.000Z","size":10234,"stargazers_count":3764,"open_issues_count":54,"forks_count":480,"subscribers_count":65,"default_branch":"master","last_synced_at":"2025-05-14T21:06:10.067Z","etag":null,"topics":["ddpm","diffusion-probabilistic","image-generation","pytorch","super-resolution"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Janspiry.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-28T11:32:06.000Z","updated_at":"2025-05-14T07:36:20.000Z","dependencies_parsed_at":"2023-01-30T18:45:59.508Z","dependency_job_id":"f2136e67-1fba-487a-bd65-94ebee1158a6","html_url":"https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement","commit_stats":{"total_commits":79,"total_committers":9,"mean_commits":8.777777777777779,"dds":0.430379746835443,"last_synced_commit":"01d27a7cbfa8502be1d8dbd4ee02fcbd5e44389d"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Janspiry%2FImage-Super-Resolution-via-Iterative-Refinement"
,"tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Janspiry%2FImage-Super-Resolution-via-Iterative-Refinement/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Janspiry%2FImage-Super-Resolution-via-Iterative-Refinement/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Janspiry%2FImage-Super-Resolution-via-Iterative-Refinement/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Janspiry","download_url":"https://codeload.github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254227611,"owners_count":22035669,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ddpm","diffusion-probabilistic","image-generation","pytorch","super-resolution"],"created_at":"2024-09-24T20:07:54.277Z","updated_at":"2025-05-14T21:06:11.279Z","avatar_url":"https://github.com/Janspiry.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Image Super-Resolution via Iterative Refinement\n\n[Paper](https://arxiv.org/pdf/2104.07636.pdf ) |  [Project](https://iterative-refinement.github.io/ )\n\n## Brief\n\nThis is an unofficial implementation of **Image Super-Resolution via Iterative Refinement (SR3)** in **PyTorch**.\n\nSome implementation details may differ from the paper's description, and therefore from the actual `SR3` structure, because the paper leaves some details unspecified. 
Specifically, we:\n\n- Use ResNet blocks and channel concatenation, as in vanilla `DDPM`.\n- Use the attention mechanism on low-resolution features ( $16 \\times 16$ ), as in vanilla `DDPM`.\n- Encode $\\gamma$ with a `FiLM` structure, as `WaveGrad` does, and embed it without an affine transformation.\n- Define the posterior variance as $\\dfrac{1-\\gamma_{t-1}}{1-\\gamma_{t}} \\beta_t$ rather than $\\beta_t$, which gives results similar to the original paper.\n\n**If you just want to upscale $(64 \\times 64)\\text{px} \\rightarrow (512 \\times 512)\\text{px}$ images using the pre-trained model, check out [this Google Colab script](https://colab.research.google.com/drive/1G1txPI1GKueKH0cSi_DgQFKwfyJOXlhY?usp=sharing).**\n\n## Status\n\n**★★★ NEW: The follow-up [Palette-Image-to-Image-Diffusion-Models](https://arxiv.org/abs/2111.05826) is now available; see the details [here](https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models) ★★★**\n\n### Conditional Generation (with Super Resolution)\n\n- [x] 16×16 -\u003e 128×128 on FFHQ-CelebaHQ\n- [x] 64×64 -\u003e 512×512 on FFHQ-CelebaHQ\n\n### Unconditional Generation\n\n- [x] 128×128 face generation on FFHQ\n- [ ] ~~1024×1024 face generation by a cascade of 3 models~~\n\n### Training Step\n\n- [x] log / logger\n- [x] metrics evaluation\n- [x] multi-gpu support\n- [x] resume training / pretrained model\n- [x] standalone validation script\n- [x] [Weights and Biases Logging](https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement/pull/44) 🌟 NEW\n\n\n\n## Results\n\n*Note:*  We set the maximum reverse steps budget to $2000$. We limited the model parameters to fit an `Nvidia 1080Ti`, so **image noise** and **hue deviation** occasionally appear in high-resolution images, resulting in low scores.  There is a lot of room for optimization.  
**We welcome contributions for more extensive experiments and code enhancements.**\n\n| Tasks/Metrics        | SSIM(+) | PSNR(+) | FID(-)  | IS(+)   |\n| -------------------- | ----------- | -------- | ---- | ---- |\n| 16×16 -\u003e 128×128 | 0.675       | 23.26    | - | - |\n| 64×64 -\u003e 512×512     | 0.445 | 19.87 | - | - |\n| 128×128 | - | - | | |\n| 1024×1024 | - | - |      |      |\n\n- #### 16×16 -\u003e 128×128 on FFHQ-CelebaHQ [[More Results](https://drive.google.com/drive/folders/1Vk1lpHzbDf03nME5fV9a-lWzSh3kMK14?usp=sharing)]\n\n| \u003cimg src=\"./misc/sr_process_16_128_0.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e |  \u003cimg src=\"./misc/sr_process_16_128_1.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e    |   \u003cimg src=\"./misc/sr_process_16_128_2.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e   |\n| ------------------------------------------------------------ | ---- | ---- |\n\n- #### 64×64 -\u003e 512×512 on FFHQ-CelebaHQ [[More Results](https://drive.google.com/drive/folders/1yp_4xChPSZUeVIgxbZM-e3ZSsSgnaR9Z?usp=sharing)]\n\n| \u003cimg src=\"./misc/sr_64_512_0_inf.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e | \u003cimg src=\"./misc/sr_64_512_0_sr.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e | \u003cimg src=\"./misc/sr_64_512_0_hr.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e |\n| ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ |\n| \u003cimg src=\"./misc/sr_64_512_1_sr.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e | \u003cimg src=\"./misc/sr_64_512_2_sr.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e | \u003cimg src=\"./misc/sr_64_512_3_sr.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e |\n\n- #### 128×128 face generation on FFHQ [[More Results](https://drive.google.com/drive/folders/13AsjRwDw4wMmL0bK7wPd2rP7ds7eyAMh?usp=sharing)]\n\n| \u003cimg src=\"./misc/sample_process_128_0.png\" 
alt=\"show\" style=\"zoom:90%;\" /\u003e |  \u003cimg src=\"./misc/sample_process_128_1.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e    |   \u003cimg src=\"./misc/sample_process_128_2.png\" alt=\"show\" style=\"zoom:90%;\" /\u003e   |\n| ------------------------------------------------------------ | ---- | ---- |\n\n\n\n## Usage\n### Environment\n```python\npip install -r requirement.txt\n```\n\n### Pretrained Model\n\nThe SR3 paper is based on \"Denoising Diffusion Probabilistic Models\", and we build both the DDPM and SR3 network structures, which use timesteps and gamma, respectively, as model embedding inputs. In our experiments, the SR3 model achieves better visual results with the same reverse steps and learning rate. You can select the JSON files with the corresponding suffix names to train the different models.\n\n| Tasks                             | Platform (Code: qwer)                                        | \n| --------------------------------- | ------------------------------------------------------------ |\n| 16×16 -\u003e 128×128 on FFHQ-CelebaHQ | [Google Drive](https://drive.google.com/drive/folders/12jh0K8XoM1FqpeByXvugHHAF3oAZ8KRu?usp=sharing)\\|[Baidu Yun](https://pan.baidu.com/s/1OzsGZA2Vmq1ZL_VydTbVTQ) |  \n| 64×64 -\u003e 512×512 on FFHQ-CelebaHQ | [Google Drive](https://drive.google.com/drive/folders/1mCiWhFqHyjt5zE4IdA41fjFwCYdqDzSF?usp=sharing)\\|[Baidu Yun](https://pan.baidu.com/s/1orzFmVDxMmlXQa2Ty9zY3g) |   \n| 128×128 face generation on FFHQ   | [Google Drive](https://drive.google.com/drive/folders/1ldukMgLKAxE7qiKdFJlu-qubGlnW-982?usp=sharing)\\|[Baidu Yun](https://pan.baidu.com/s/1Vsd08P1A-48OGmnRV0E7Fg ) | \n\n```python\n# Download the pretrained model, then set \"resume_state\" in [sr|sample]_[ddpm|sr3]_[resolution option].json:\n\"resume_state\": [your pretrained model's path]\n```\n\n### Data Preparation\n\n#### New Start\n\nIf you don't have the data, you can prepare it with the following steps:\n\n- [FFHQ 128×128](https://github.com/NVlabs/ffhq-dataset) 
| [FFHQ 512×512](https://www.kaggle.com/arnaud58/flickrfaceshq-dataset-ffhq)\n- [CelebaHQ 256×256](https://www.kaggle.com/badasstechie/celebahq-resized-256x256) | [CelebaMask-HQ 1024×1024](https://drive.google.com/file/d/1badu11NqxGf6qM3PTTooQDJvQbejgbTv/view)\n\nDownload the dataset and prepare it in **LMDB** or **PNG** format using the script.\n\n```python\n# Resize to get 16×16 LR_IMGS and 128×128 HR_IMGS, then prepare 128×128 Fake SR_IMGS by bicubic interpolation\npython data/prepare_data.py  --path [dataset root]  --out [output root] --size 16,128 -l\n```\n\nThen change the `datasets` config to match your data path and image resolution: \n\n```json\n\"datasets\": {\n    \"train\": {\n        \"dataroot\": \"dataset/ffhq_16_128\", // [output root] in the prepare_data.py script\n        \"l_resolution\": 16, // low resolution to be super-resolved\n        \"r_resolution\": 128, // high resolution\n        \"datatype\": \"lmdb\", // lmdb or img (path of image files)\n    },\n    \"val\": {\n        \"dataroot\": \"dataset/celebahq_16_128\", // [output root] in the prepare_data.py script\n    }\n},\n```\n\n#### Own Data\n\nYou can also use your own image data with the following steps; there are some examples in the `dataset` folder.\n\nFirst, organize the images in the layout below; `data/prepare_data.py` can do this step automatically:\n\n```shell\n# paths for the high-resolution, low-resolution, and bicubic-interpolated images\ndataset/celebahq_16_128/\n├── hr_128 # same as the sr_16_128 directory if you don't have ground-truth images\n├── lr_16 # vanilla low-resolution images\n└── sr_16_128 # images ready for super-resolution\n```\n\n```python\n# super resolution from 16 to 128\npython data/prepare_data.py  --path [dataset root]  --out celebahq --size 16,128 -l\n```\n\n*Note: The above script can be used whether or not you have the vanilla high-resolution images.*\n\nThen change the `datasets` config to match your data path and image resolution: \n\n```json\n\"datasets\": {\n 
   \"train|val\": { // train and validation part\n        \"dataroot\": \"dataset/celebahq_16_128\",\n        \"l_resolution\": 16, // low resolution to be super-resolved\n        \"r_resolution\": 128, // high resolution\n        \"datatype\": \"img\", // lmdb or img (path of image files)\n    }\n},\n```\n\n### Training/Resume Training\n\n```python\n# Use sr.py and sample.py to train the super resolution task and unconditional generation task, respectively.\n# Edit the JSON files to adjust network structure and hyperparameters\npython sr.py -p train -c config/sr_sr3.json\n```\n\n### Test/Evaluation\n\n```python\n# Edit the JSON to add the pretrained model path, then run the evaluation\npython sr.py -p val -c config/sr_sr3.json\n\n# Quantitative evaluation alone using SSIM/PSNR metrics on given result root\npython eval.py -p [result root]\n```\n\n### Inference Alone\n\nSet the image path as described in `Own Data`, then run the script:\n\n```python\n# run the script\npython infer.py -c [config file]\n```\n\n## Weights and Biases 🎉\n\nThe library now supports experiment tracking, model checkpointing and model prediction visualization with [Weights and Biases](https://wandb.ai/site). You will need to [install W\u0026B](https://pypi.org/project/wandb/) and log in using your [access token](https://wandb.ai/authorize). \n\n```\npip install wandb\n\n# get your access token from wandb.ai/authorize\nwandb login\n```\n\nW\u0026B logging functionality is added to the `sr.py`, `sample.py` and `infer.py` files. You can pass `-enable_wandb` to start logging.\n\n- `-log_wandb_ckpt`: Pass this argument along with `-enable_wandb` to save model checkpoints as [W\u0026B Artifacts](https://docs.wandb.ai/guides/artifacts). Both `sr.py` and `sample.py` support model checkpointing. \n- `-log_eval`: Pass this argument along with `-enable_wandb` to save the evaluation result as interactive [W\u0026B Tables](https://docs.wandb.ai/guides/data-vis). 
Note that only `sr.py` supports this feature. If you run `sample.py` in eval mode, the generated images will automatically be logged as an image media panel. \n- `-log_infer`: When running `infer.py`, pass this argument along with `-enable_wandb` to log the inference results as interactive W\u0026B Tables. \n\nYou can find more on using these features [here](https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement/pull/44). 🚀\n\n\n## Acknowledgements\n\nOur work is based on the following theoretical works:\n\n- [Denoising Diffusion Probabilistic Models](https://arxiv.org/pdf/2006.11239.pdf)\n- [Image Super-Resolution via Iterative Refinement](https://arxiv.org/pdf/2104.07636.pdf)\n- [WaveGrad: Estimating Gradients for Waveform Generation](https://arxiv.org/abs/2009.00713)\n- [Large Scale GAN Training for High Fidelity Natural Image Synthesis](https://arxiv.org/abs/1809.11096)\n\nFurthermore, we have benefited greatly from the following projects:\n\n- https://github.com/bhushan23/BIG-GAN\n- https://github.com/lmnt-com/wavegrad\n- https://github.com/rosinality/denoising-diffusion-pytorch\n- https://github.com/lucidrains/denoising-diffusion-pytorch\n- https://github.com/hejingwenhejingwen/AdaFM\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanspiry%2Fimage-super-resolution-via-iterative-refinement","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjanspiry%2Fimage-super-resolution-via-iterative-refinement","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjanspiry%2Fimage-super-resolution-via-iterative-refinement/lists"}