https://github.com/chaofengc/ITER

PyTorch codes for "Iterative Token Evaluation and Refinement for Real-World Super-Resolution", AAAI 2024

Host: GitHub
URL: https://github.com/chaofengc/ITER
Owner: chaofengc
License: other
Created: 2023-12-10T09:21:37.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-10-23T05:11:35.000Z (8 months ago)
Last Synced: 2025-03-22T22:38:29.688Z (3 months ago)
Language: Python
Homepage:
Size: 6.32 MB
Stars: 56
Watchers: 4
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

## [Iterative Token Evaluation and Refinement for Real-World Super-Resolution](https://arxiv.org/abs/2312.05616)

[¹Chaofeng Chen](https://chaofengc.github.io), [¹Shangchen Zhou](https://shangchenzhou.com/), [¹Liang Liao](https://liaoliang92.github.io/homepage/), [¹Haoning Wu](https://teowu.github.io/), [²Wenxiu Sun](https://scholar.google.com/citations?user=X9lE6O4AAAAJ&hl=en), [²Qiong Yan](https://scholar.google.com/citations?user=uT9CtPYAAAAJ&hl=en), [¹Weisi Lin](https://personal.ntu.edu.sg/wslin/Home.html)  

¹S-Lab, Nanyang Technological University, ²Sensetime Research

[![arXiv](https://img.shields.io/badge/arXiv-Paper-.svg)](https://arxiv.org/abs/2312.05616) ![arXiv](https://img.shields.io/badge/AAAI-2024-red.svg) ![visitors](https://visitor-badge.laobi.icu/badge?page_id=chaofengc/ITER)

![teaser_img](./assets/fig_teaser.jpg)



-----------------------------

![framework_img](assets/fig_framework.jpg)

**Pipeline of ITER.** The input $I_l$ first passes through a distortion removal network $E_l$ to obtain the initially restored tokens $S_l$, which are indices of quantized features in the VQGAN codebook. A reverse discrete diffusion process, conditioned on $S_l$, then generates textures. The process starts from completely masked tokens $S_T$. The refinement network (also called the de-masking network) $\phi_r$ generates refined outputs $S_{T-1}$ with $S_l$ as a condition. Next, $\phi_e$ evaluates $S_{T-1}$ to obtain the evaluation mask $m_{T-1}$, which determines, through a masked sampling process, which tokens to keep and which to refine at step $T-1$. This process is repeated $T$ times to obtain the de-masked outputs $S_0$, and the restored image $I_{sr}$ is then reconstructed with the VQGAN decoder $D_H$. We found that $T\leq8$ is enough to get good results with ITER, which makes it much more efficient than other diffusion-based approaches.
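The evaluate-and-refine loop described above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the repository's implementation: `MASK`, `iterative_refine`, `toy_refine`, and `toy_evaluate` are hypothetical names, the two callables merely stand in for $\phi_r$ and $\phi_e$, and keeping each token with probability equal to its score is an assumed form of the masked sampling step.

```python
import random
from typing import Callable, List

MASK = -1  # hypothetical id for a masked token (not a real codebook index)

Tokens = List[int]


def iterative_refine(
    cond: Tokens,
    refine: Callable[[Tokens, Tokens], Tokens],
    evaluate: Callable[[Tokens, Tokens], List[float]],
    steps: int = 8,
    seed: int = 0,
) -> Tokens:
    """Toy reverse process: start fully masked, then refine and evaluate `steps` times."""
    rng = random.Random(seed)
    tokens = [MASK] * len(cond)  # S_T: every token is masked
    for t in range(steps, 0, -1):
        proposal = refine(tokens, cond)  # stand-in for phi_r, conditioned on S_l
        if t == 1:
            return proposal  # S_0: final de-masked tokens
        scores = evaluate(proposal, cond)  # stand-in for phi_e, one score per token
        # Assumed masked sampling: keep a token with probability equal to its score,
        # otherwise re-mask it so that the next step refines it again.
        tokens = [tok if rng.random() < p else MASK for tok, p in zip(proposal, scores)]
    return tokens


def toy_refine(tokens: Tokens, cond: Tokens) -> Tokens:
    # Fill every masked position with the conditioning token at that position.
    return [c if tok == MASK else tok for tok, c in zip(tokens, cond)]


def toy_evaluate(tokens: Tokens, cond: Tokens) -> List[float]:
    # Pretend every token is equally trustworthy.
    return [0.5] * len(tokens)


s_l = [3, 7, 7, 1, 4]  # toy "initially restored tokens"
s_0 = iterative_refine(s_l, toy_refine, toy_evaluate, steps=8)
print(s_0)
```

In the real model the two callables would be neural networks operating on token grids; the sketch only shows the control flow of starting from a fully masked state and alternating refinement with evaluation for a small number of steps.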

## 🔧 Dependencies and Installation

```
# git clone this repository
git clone https://github.com/chaofengc/ITER.git
cd ITER

# create new anaconda env
conda create -n iter python=3.8
source activate iter

# install python dependencies
pip3 install -r requirements.txt
python setup.py develop
```

## ⚡Quick Inference

```
python inference_iter.py -s 2 -i ./testset/lrx4/frog.jpg
python inference_iter.py -s 4 -i ./testset/lrx4/frog.jpg
```
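To process several images, one option is to call the script once per file. The helper below is a minimal sketch that only relies on the `-s` and `-i` flags shown above; `build_inference_commands`, `run_all`, and the `IMAGE_SUFFIXES` set are hypothetical names introduced for illustration, and whether `-i` also accepts a directory directly is not assumed here.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed set of input formats


def build_inference_commands(input_dir: str, scale: int) -> List[List[str]]:
    """Return one `inference_iter.py` argument list per image found in `input_dir`."""
    images = sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return [
        [sys.executable, "inference_iter.py", "-s", str(scale), "-i", str(p)]
        for p in images
    ]


def run_all(input_dir: str, scale: int) -> None:
    """Run the commands one after another, stopping at the first failure."""
    for cmd in build_inference_commands(input_dir, scale):
        subprocess.run(cmd, check=True)
```

Calling `run_all("./testset/lrx4", 4)` from the repository root would then invoke the script once for each image in that folder.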

### Example results

---

**Left**: [real images](./testset) **|** **Right**: [super-resolved images with scale factor 4](./example_results_x4/)

## 👨‍💻Train the model

### ⏬ Download Datasets

The training datasets can be downloaded from [🤗 Hugging Face](https://huggingface.co/datasets/chaofengc/ITER). You may also refer to [FeMaSR](https://github.com/chaofengc/FeMaSR) to prepare your own training data.

### 🔁 Training

Below are brief examples for training the model. **Please modify the corresponding configuration files to suit your needs.** *Note that the code has been rewritten and the models retrained from scratch, so the results may differ slightly from those in the paper.*

#### Stage I: Train the Swin-VQGAN

```
accelerate launch --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_HQ_stage.yml
```

#### Stage II & III: Train the LQ encoder and the refinement network

```
accelerate launch --main_process_port=29600 --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_LQ_stage_X2.yml
accelerate launch --main_process_port=29600 --multi_gpu --num_processes=8 --mixed_precision=bf16 basicsr/train.py -opt options/train_ITER_LQ_stage_X4.yml
```
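The commands above are written for 8 GPUs. As a hedged sketch, the helper below assembles the same `accelerate launch` invocation for a different process count, using only flags that appear above. `build_train_command` is a hypothetical name, and dropping `--multi_gpu` for a single process is an assumption about the intended usage; the configuration files may still need changes (for example, batch size) that this sketch does not address.

```python
import shlex
from typing import List, Optional


def build_train_command(
    opt_path: str,
    num_gpus: int = 8,
    mixed_precision: str = "bf16",
    main_process_port: Optional[int] = None,
) -> List[str]:
    """Assemble an `accelerate launch` argument list for `basicsr/train.py`."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    cmd = ["accelerate", "launch"]
    if main_process_port is not None:
        cmd.append(f"--main_process_port={main_process_port}")
    if num_gpus > 1:
        # Assumption: the flag is only passed when more than one process is used.
        cmd.append("--multi_gpu")
    cmd += [
        f"--num_processes={num_gpus}",
        f"--mixed_precision={mixed_precision}",
        "basicsr/train.py",
        "-opt",
        opt_path,
    ]
    return cmd


# Print a shell-ready command line for a 2-GPU run of the X4 stage.
two_gpu_cmd = build_train_command(
    "options/train_ITER_LQ_stage_X4.yml", num_gpus=2, main_process_port=29600
)
print(shlex.join(two_gpu_cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run` unchanged.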

## 📝 Citation

If you find this code useful for your research, please cite our paper:

```
@inproceedings{chen2024iter,
  title={Iterative Token Evaluation and Refinement for Real-World Super-Resolution},
  author={Chaofeng Chen and Shangchen Zhou and Liang Liao and Haoning Wu and Wenxiu Sun and Qiong Yan and Weisi Lin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2024},
}
```

## ⚖️ License


This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License and [NTU S-Lab License 1.0](./LICENCE_S-Lab).

## ❤️ Acknowledgement

This project is based on [BasicSR](https://github.com/xinntao/BasicSR).
