{"id":15030078,"url":"https://github.com/iceclear/stablesr","last_synced_at":"2025-05-15T08:12:02.122Z","repository":{"id":163589850,"uuid":"622599959","full_name":"IceClear/StableSR","owner":"IceClear","description":"[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution","archived":false,"fork":false,"pushed_at":"2024-07-12T03:13:49.000Z","size":11251,"stargazers_count":2429,"open_issues_count":90,"forks_count":154,"subscribers_count":26,"default_branch":"main","last_synced_at":"2025-05-15T08:11:53.907Z","etag":null,"topics":["stable-diffusion","stablesr","super-resolution"],"latest_commit_sha":null,"homepage":"https://iceclear.github.io/projects/stablesr/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/IceClear.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-02T15:34:55.000Z","updated_at":"2025-05-14T08:34:49.000Z","dependencies_parsed_at":"2023-12-20T10:14:28.495Z","dependency_job_id":"c2914b8f-e09a-448d-b311-5274a12b59e5","html_url":"https://github.com/IceClear/StableSR","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IceClear%2FStableSR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IceClear%2FStableSR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IceClear%2FStableSR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/IceClear%2FStableSR/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/IceClear","download_url":"https://codeload.github.com/IceClear/StableSR/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254301612,"owners_count":22047905,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["stable-diffusion","stablesr","super-resolution"],"created_at":"2024-09-24T20:12:23.336Z","updated_at":"2025-05-15T08:11:57.110Z","avatar_url":"https://github.com/IceClear.png","language":"Python","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/22350795/236680126-0b1cdd62-d6fc-4620-b998-75ed6c31bf6f.png\" height=40\u003e\n\u003c/p\u003e\n\n## Exploiting Diffusion Prior for Real-World Image Super-Resolution\n\n[Paper](https://arxiv.org/abs/2305.07015) | [Project Page](https://iceclear.github.io/projects/stablesr/) | [Video](https://www.youtube.com/watch?v=5MZy9Uhpkw4) | [WebUI](https://github.com/pkuliyi2015/sd-webui-stablesr) | [ModelScope](https://modelscope.cn/models/xhlin129/cv_stablesr_image-super-resolution/summary) | [ComfyUI](https://github.com/gameltb/comfyui-stablesr)\n\n\n\u003ca href=\"https://colab.research.google.com/drive/11SE2_oDvbYtcuHDbaLAxsKk_o3flsO1T?usp=sharing\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"google colab logo\"\u003e\u003c/a\u003e [![Hugging Face](https://img.shields.io/badge/Demo-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/Iceclear/StableSR) 
[![Replicate](https://img.shields.io/badge/Demo-%F0%9F%9A%80%20Replicate-blue)](https://replicate.com/cjwbw/stablesr) [![OpenXLab](https://img.shields.io/badge/Demo-%F0%9F%90%BC%20OpenXLab-blue)](https://openxlab.org.cn/apps/detail/Iceclear/StableSR) ![visitors](https://visitor-badge.laobi.icu/badge?page_id=IceClear/StableSR)\n\n\n[Jianyi Wang](https://iceclear.github.io/), [Zongsheng Yue](https://zsyoaoa.github.io/), [Shangchen Zhou](https://shangchenzhou.com/), [Kelvin C.K. Chan](https://ckkelvinchan.github.io/), [Chen Change Loy](https://www.mmlab-ntu.com/person/ccloy/)\n\nS-Lab, Nanyang Technological University\n\n\u003cimg src=\"assets/network.png\" width=\"800px\"/\u003e\n\n:star: If StableSR is helpful to your images or projects, please help star this repo. Thanks! :hugs:\n\n### Update\n- **2024.06.28**: Accepted by [IJCV](https://link.springer.com/journal/11263). See the latest [Full paper](https://github.com/IceClear/StableSR/releases/download/UncompressedPDF/StableSR_IJCV_Uncompressed.pdf).\n- **2024.02.29**: Support StableSR with [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo). Thank [Andray](https://github.com/light-and-ray) for the finding!\n\n  Now the [ComfyUI](https://github.com/gameltb/comfyui-stablesr) [![GitHub Stars](https://img.shields.io/github/stars/gameltb/comfyui-stablesr?style=social)](https://github.com/gameltb/comfyui-stablesr) of StableSR is also available. Thank [gameltb](https://github.com/gameltb) and [WSJUSA](https://github.com/WSJUSA) for the implementation!\n- **2023.11.30**: Code Update.\n  - Support DDIM and negative prompts\n  - Add CFW training scripts\n  - Add FaceSR training and test scripts\n- **2023.10.08**: Our test sets associated with the results in our [paper](https://arxiv.org/abs/2305.07015) are now available at [[HuggingFace](https://huggingface.co/datasets/Iceclear/StableSR-TestSets)] and [[OpenXLab](https://openxlab.org.cn/datasets/Iceclear/StableSR_Testsets)]. 
You can now easily compare your results with StableSR.\n- **2023.08.19**: Integrated to :hugs: [Hugging Face](https://huggingface.co/spaces). Try out the online demo! [![Hugging Face](https://img.shields.io/badge/Demo-%F0%9F%A4%97%20Hugging%20Face-blue)](https://huggingface.co/spaces/Iceclear/StableSR).\n- **2023.08.19**: Integrated to :panda_face: [OpenXLab](https://openxlab.org.cn/apps). Try out the online demo! [![OpenXLab](https://img.shields.io/badge/Demo-%F0%9F%90%BC%20OpenXLab-blue)](https://openxlab.org.cn/apps/detail/Iceclear/StableSR).\n- **2023.07.31**: Integrated to :rocket: [Replicate](https://replicate.com/explore). Try out the online demo! [![Replicate](https://img.shields.io/badge/Demo-%F0%9F%9A%80%20Replicate-blue)](https://replicate.com/cjwbw/stablesr) Thank [Chenxi](https://github.com/chenxwh) for the implementation!\n- **2023.07.16**: You may reproduce the LDM baseline used in our paper using [LDM-SRtuning](https://github.com/IceClear/LDM-SRtuning) [![GitHub Stars](https://img.shields.io/github/stars/IceClear/LDM-SRtuning?style=social)](https://github.com/IceClear/LDM-SRtuning).\n- **2023.07.14**: :whale: [**ModelScope**](https://modelscope.cn/models/xhlin129/cv_stablesr_image-super-resolution/summary) for StableSR is released!\n- **2023.06.30**: :whale: [**New model**](https://huggingface.co/Iceclear/StableSR/blob/main/stablesr_768v_000139.ckpt) trained on [SD-2.1-768v](https://huggingface.co/stabilityai/stable-diffusion-2-1) is released! Better performance with fewer artifacts!\n- **2023.06.28**: Support training on SD-2.1-768v.\n- **2023.05.22**: :whale: Improve the code to save more GPU memory; 128 --\u003e 512 now needs 8.9G. Enable starting from intermediate steps.\n- **2023.05.20**: :whale: The [**WebUI**](https://github.com/pkuliyi2015/sd-webui-stablesr) [![GitHub Stars](https://img.shields.io/github/stars/pkuliyi2015/sd-webui-stablesr?style=social)](https://github.com/pkuliyi2015/sd-webui-stablesr) of StableSR is available. 
Thank [Li Yi](https://github.com/pkuliyi2015) for the implementation!\n- **2023.05.13**: Add Colab demo of StableSR. \u003ca href=\"https://colab.research.google.com/drive/11SE2_oDvbYtcuHDbaLAxsKk_o3flsO1T?usp=sharing\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"google colab logo\"\u003e\u003c/a\u003e\n- **2023.05.11**: Repo is released.\n\n### TODO\n- [x] ~~Code release~~\n- [x] ~~Update link to paper and project page~~\n- [x] ~~Pretrained models~~\n- [x] ~~Colab demo~~\n- [x] ~~StableSR-768v released~~\n- [x] ~~Replicate demo~~\n- [x] ~~HuggingFace demo~~\n- [x] ~~StableSR-face released~~\n- [x] ~~ComfyUI support~~\n\n### Demo on real-world SR\n\n[\u003cimg src=\"assets/imgsli_1.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MTI2) [\u003cimg src=\"assets/imgsli_2.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MTE2) [\u003cimg src=\"assets/imgsli_3.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MTIw)\n[\u003cimg src=\"assets/imgsli_8.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MjUy) [\u003cimg src=\"assets/imgsli_4.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MTMy) [\u003cimg src=\"assets/imgsli_5.jpg\" height=\"223px\"/\u003e](https://imgsli.com/MTc2MTMz)\n[\u003cimg src=\"assets/imgsli_9.jpg\" height=\"214px\"/\u003e](https://imgsli.com/MTc2MjQ5) [\u003cimg src=\"assets/imgsli_6.jpg\" height=\"214px\"/\u003e](https://imgsli.com/MTc2MTM0) [\u003cimg src=\"assets/imgsli_7.jpg\" height=\"214px\"/\u003e](https://imgsli.com/MTc2MTM2) [\u003cimg src=\"assets/imgsli_10.jpg\" height=\"214px\"/\u003e](https://imgsli.com/MTc2MjU0)\n\nFor more evaluation, please refer to our [paper](https://arxiv.org/abs/2305.07015) for details.\n\n### Demo on 4K Results\n\n- StableSR is capable of achieving arbitrary upscaling in theory; below is a 4x example with a result beyond 4K (4096x6144).\n\n[\u003cimg src=\"assets/main-fig.png\" width=\"800px\"/\u003e](https://imgsli.com/MjIzMjQx)\n\n```\n# DDIM w/ 
negative prompts\npython scripts/sr_val_ddim_text_T_negativeprompt_canvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_768v.yaml --ckpt stablesr_768v_000139.ckpt --vqgan_ckpt vqgan_finetune_00011.ckpt --init-img ./inputs/test_example/ --outdir ../output/ --ddim_steps 20 --dec_w 0.0 --colorfix_type wavelet --scale 7.0 --use_negative_prompt --upscale 4 --seed 42 --n_samples 1 --input_size 768 --tile_overlap 48 --ddim_eta 1.0\n```\n\n- **More examples**.\n  - [4K Demo1](https://imgsli.com/MTc4MDg3), which is a 4x SR on the image from [here](https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111).\n  - [4K Demo2](https://imgsli.com/MTc4NDk2), which is an 8x SR on the image from [here](https://github.com/Mikubill/sd-webui-controlnet/blob/main/tests/images/ski.jpg).\n  - More comparisons can be found [here](https://github.com/IceClear/StableSR/issues/2) and [here](https://github.com/pkuliyi2015/sd-webui-stablesr).\n\n### Dependencies and Installation\n- PyTorch == 1.12.1\n- CUDA == 11.7\n- pytorch-lightning==1.4.2\n- xformers == 0.0.16 (Optional)\n- Other required packages in `environment.yaml`\n```\n# git clone this repository\ngit clone https://github.com/IceClear/StableSR.git\ncd StableSR\n\n# Create a conda environment and activate it\nconda env create --file environment.yaml\nconda activate stablesr\n\n# Install xformers\nconda install xformers -c xformers/label/dev\n\n# Install taming \u0026 clip\npip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers\npip install -e git+https://github.com/openai/CLIP.git@main#egg=clip\npip install -e .\n```\n\n### Running Examples\n\n#### Train\nDownload the pretrained Stable Diffusion models from [[HuggingFace](https://huggingface.co/stabilityai/stable-diffusion-2-1-base)].\n\n- Train Time-aware encoder with SFT: set the ckpt_path in config files ([Line 22](https://github.com/IceClear/StableSR/blob/main/configs/stableSRNew/v2-finetune_text_T_512.yaml#L22) 
and [Line 55](https://github.com/IceClear/StableSR/blob/main/configs/stableSRNew/v2-finetune_text_T_512.yaml#L55))\n```\npython main.py --train --base configs/stableSRNew/v2-finetune_text_T_512.yaml --gpus GPU_ID, --name NAME --scale_lr False\n```\n\n- Train CFW: set the ckpt_path in config files ([Line 6](https://github.com/IceClear/StableSR/blob/main/configs/autoencoder/autoencoder_kl_64x64x4_resi.yaml#L6)).\n\nYou need to first generate training data using the finetuned diffusion model in the first stage.\n```\n# General SR\npython scripts/generate_vqgan_data.py --config configs/stableSRdata/test_data.yaml --ckpt CKPT_PATH --outdir OUTDIR --skip_grid --ddpm_steps 200 --base_i 0 --seed 10000\n```\n```\n# For face data\npython scripts/generate_vqgan_data_face.py --config configs/stableSRdata/test_data_face.yaml --ckpt CKPT_PATH --outdir OUTDIR --skip_grid --ddpm_steps 200 --base_i 0 --seed 10000\n```\nThe data folder should be like this:\n```\nCFW_trainingdata/\n    └── inputs\n          └── 00000001.png # LQ images, (512, 512, 3) (resize to 512x512)\n          └── ...\n    └── gts\n          └── 00000001.png # GT images, (512, 512, 3) (512x512)\n          └── ...\n    └── latents\n          └── 00000001.npy # Latent codes (N, 4, 64, 64) of HR images generated by the diffusion U-net, saved in .npy format.\n          └── ...\n    └── samples\n          └── 00000001.png # The HR images generated from latent codes, just to make sure the generated latents are correct.\n          └── ...\n```\n\nThen you can train CFW:\n```\npython main.py --train --base configs/autoencoder/autoencoder_kl_64x64x4_resi.yaml --gpus GPU_ID, --name NAME --scale_lr False\n```\n\n#### Resume\n\n```\npython main.py --train --base configs/stableSRNew/v2-finetune_text_T_512.yaml --gpus GPU_ID, --resume RESUME_PATH --scale_lr False\n```\n\n#### Test directly\n\nDownload the Diffusion and autoencoder pretrained models from 
[[HuggingFace](https://huggingface.co/Iceclear/StableSR/blob/main/README.md) | [OpenXLab](https://openxlab.org.cn/models/detail/Iceclear/StableSR)].\nWe use the same color correction scheme introduced in the paper by default.\nYou may change ```--colorfix_type wavelet``` for better color correction.\nYou may also disable color correction with ```--colorfix_type nofix```.\n\n- **StableSR-Turbo**: Get the ckpt first from [[HuggingFace](https://huggingface.co/Iceclear/StableSR/resolve/main/stablesr_turbo.ckpt) or [OpenXLab](https://openxlab.org.cn/models/detail/Iceclear/StableSR/tree/main)]. Then you just need to modify ```--ckpt_path``` and set ```--ddpm_steps``` to 4. See examples below:\n\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_old.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt ./stablesr_turbo.ckpt --init-img LQ_PATH --outdir OUT_PATH --ddpm_steps 4 --dec_w 0.5 --seed 42 --n_samples 1 --vqgan_ckpt ./vqgan_cfw_00011.ckpt --colorfix_type wavelet\n```\n\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt ./stablesr_turbo.ckpt --init-img LQ_PATH --outdir OUT_PATH --ddpm_steps 4 --dec_w 0.5 --seed 42 --n_samples 1 --vqgan_ckpt ./vqgan_cfw_00011.ckpt --colorfix_type wavelet --upscale 4\n```\n\n- **DDIM is supported now. 
See [here](https://github.com/IceClear/StableSR/tree/main/scripts)**\n\n- Test on 128 --\u003e 512: You need at least 10G GPU memory to run this script (batch size 2 by default).\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_old.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt CKPT_PATH --vqgan_ckpt VQGANCKPT_PATH --init-img INPUT_PATH --outdir OUT_DIR --ddpm_steps 200 --dec_w 0.5 --colorfix_type adain\n```\n- Test on arbitrary size w/o chop for autoencoder (for results beyond 512): The memory cost depends on your image size, but is usually above 10G.\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt CKPT_PATH --vqgan_ckpt VQGANCKPT_PATH --init-img INPUT_PATH --outdir OUT_DIR --ddpm_steps 200 --dec_w 0.5 --colorfix_type adain\n```\n\n- Test on arbitrary size w/ chop for autoencoder: The current default setting needs at least 18G to run; you may reduce the autoencoder tile size by setting ```--vqgantile_size``` and ```--vqgantile_stride```.\nNote that the minimum tile size is 512 and the stride should be smaller than the tile size. A smaller size may introduce more border artifacts.\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_oldcanvas_tile.py --config configs/stableSRNew/v2-finetune_text_T_512.yaml --ckpt CKPT_PATH --vqgan_ckpt VQGANCKPT_PATH --init-img INPUT_PATH --outdir OUT_DIR --ddpm_steps 200 --dec_w 0.5 --colorfix_type adain\n```\n\n- To test the 768 model, you need to set ```--config configs/stableSRNew/v2-finetune_text_T_768v.yaml```, ```--input_size 768``` and ```--ckpt```. You can also adjust ```--tile_overlap```, ```--vqgantile_size``` and ```--vqgantile_stride``` accordingly. We did not finetune CFW.\n\n#### Test FaceSR\nYou need to first generate reference images using [[CodeFormer](https://github.com/sczhou/CodeFormer)] or other blind face models.   
\nPretrained Models: [[HuggingFace](https://huggingface.co/Iceclear/StableSR/blob/main/README.md) | [OpenXLab](https://openxlab.org.cn/models/detail/Iceclear/StableSR)].\n```\npython scripts/sr_val_ddpm_text_T_vqganfin_facerefersampling.py --init-img LR_PATH --ref-img REF_PATH --outdir OUTDIR --config ./configs/stableSRNew/v2-finetune_face_T_512.yaml --ckpt face_stablesr_000050.ckpt --vqgan_ckpt face_vqgan_cfw_00011.ckpt --ddpm_steps 200 --dec_w 0.0 --facesr\n```\n\n#### Test using Replicate API\n```\nimport replicate\nmodel = replicate.models.get(\u003cmodel_name\u003e)\nmodel.predict(input_image=...)\n```\nSee [here](https://replicate.com/cjwbw/stablesr/api) for more information.\n\n### Citation\nIf our work is useful for your research, please consider citing:\n\n    @article{wang2024exploiting,\n      author = {Wang, Jianyi and Yue, Zongsheng and Zhou, Shangchen and Chan, Kelvin C.K. and Loy, Chen Change},\n      title = {Exploiting Diffusion Prior for Real-World Image Super-Resolution},\n      journal = {International Journal of Computer Vision},\n      year = {2024}\n    }\n\n### License\n\nThis project is licensed under \u003ca rel=\"license\" href=\"https://github.com/IceClear/StableSR/blob/main/LICENSE.txt\"\u003eNTU S-Lab License 1.0\u003c/a\u003e. Redistribution and use should follow this license.\n\n### Acknowledgement\n\nThis project is based on [stablediffusion](https://github.com/Stability-AI/stablediffusion), [latent-diffusion](https://github.com/CompVis/latent-diffusion), [SPADE](https://github.com/NVlabs/SPADE), [mixture-of-diffusers](https://github.com/albarji/mixture-of-diffusers) and [BasicSR](https://github.com/XPixelGroup/BasicSR). 
Thanks for their awesome work.\n\n### Contact\nIf you have any questions, please feel free to reach out to me at `iceclearwjy@gmail.com`.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficeclear%2Fstablesr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ficeclear%2Fstablesr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ficeclear%2Fstablesr/lists"}