Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/guoqiang-zhang-x/BDIA


https://github.com/guoqiang-zhang-x/BDIA

Last synced: 3 months ago
JSON representation

Awesome Lists containing this project

README

        

### An Implementation of the BDIA Technique from [Exact Diffusion Inversion via Bi-directional Integration Approximation](https://arxiv.org/abs/2307.10829), (oral), ECCV, 2024, by Guoqiang Zhang, J. P. Lewis, and W. Bastiaan Kleijn

In brief, BDIA is a time-reversible ODE solver that can be applied to improve the performance of both diffusion sampling and diffusion inversion. Suppose we would like to estimate the next diffusion state $`\boldsymbol{z}_{i-1}`$ by solving a probability ordinary differential equation (ODE)
```math
d\boldsymbol{z} = \boldsymbol{d}(\boldsymbol{z},t)dt
```
based on the recent information $`(\boldsymbol{z}_{i},t_i)`$ and $`(\boldsymbol{z}_{i+1},t_{i+1})`$. The basic idea of BDIA is to compute $`\boldsymbol{z}_{i-1}`$ by performing both the forward integration approximation $`\Delta(t_i\rightarrow t_{i-1}|\boldsymbol{z}_i)`$ $`\left(\approx \int_{t_i}^{t_{i-1}}\boldsymbol{d}(\boldsymbol{z},t)dt\right)`$ and the backbackward integration approximation $`\Delta(t_i\rightarrow t_{i+1}|\boldsymbol{z}_i)`$ $`\left(\approx -\int_{t_{i+1}}^{t_{i}}\boldsymbol{d}(\boldsymbol{z},t)dt\right)`$ conditioned on $\boldsymbol{z}_i$. With the above two integration approximations, $`\boldsymbol{z}_{i-1}`$ can be expressed as
```math
\boldsymbol{z}_{i-1} = \boldsymbol{z}_{i+1} \underbrace{- (1-\textcolor{blue}{\gamma}) (\boldsymbol{z}_{i+1}-\boldsymbol{z}_{i}) - \textcolor{blue}{\gamma}\Delta(t_i\rightarrow t_{i+1}|\boldsymbol{z}_i)}_{\approx \int_{t_{i+1}}^{t_{i}}\boldsymbol{d}(\boldsymbol{z},t)dt } + \underbrace{\Delta(t_i\rightarrow t_{i-1}|\boldsymbol{z}_i)}_{ \approx \int_{t_i}^{t_{i-1}}\boldsymbol{d}(\boldsymbol{z},t)dt }
```
where $\gamma=[0,1]$ averages the backward and forward integration approximations for the time-slot $`[t_{i+1},t_i]`$. One nice property of the above update expression is that it is invertiable. That is, $\boldsymbol{z}_{i+1}$ can be represented as an expression of $`\boldsymbol{z}_{i-1}`$ and $`\boldsymbol{z}_{i}`$, which enables exact diffusion inversion.

The BDIA technique can be applied directly to DDIM. In this case, the forward integration approximation $`\Delta(t_i\rightarrow t_{i-1}|\boldsymbol{z}_i)`$ becomes the DDIM updates, which is given by
```math
\Delta(t_i\rightarrow t_{i-1}|\boldsymbol{z}_i) = \alpha_{i-1} \left(\frac{\boldsymbol{z}_{i} \hspace{-0.3mm}-\hspace{-0.3mm} \sigma_{i}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) }{\alpha_{i}}\right)+\hspace{0.5mm}\sigma_{i-1}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) - \boldsymbol{z}_i
```
Correspondingly, the backward integration approximation $`\Delta(t_i\rightarrow t_{i}|\boldsymbol{z}_{i+1})`$ becomes the DDIM updates, which is given by
```math
\Delta(t_i\rightarrow t_{i+1}|\boldsymbol{z}_{i}) = \alpha_{i+1} \left(\frac{\boldsymbol{z}_{i} \hspace{-0.3mm}-\hspace{-0.3mm} \sigma_{i}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) }{\alpha_{i}}\right)+\hspace{0.5mm}\sigma_{i+1}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) - \boldsymbol{z}_i
```
As a result, the final update expression of BDIA-DDIM is
```math
\boldsymbol{z}_{i-1} = \gamma \boldsymbol{z}_{i+1}-\gamma\Big[\alpha_{i+1} \Big(\frac{\boldsymbol{z}_{i} \hspace{-0.3mm}-\hspace{-0.3mm} \sigma_{i}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) }{\alpha_{i}}\Big)+\hspace{0.5mm}\sigma_{i+1}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) \Big] + \Big[\alpha_{i-1} \left(\frac{\boldsymbol{z}_{i} \hspace{-0.3mm}-\hspace{-0.3mm} \sigma_{i}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) }{\alpha_{i}}\right)+\hspace{0.5mm}\sigma_{i-1}\hat{\boldsymbol{\epsilon}}_{\boldsymbol{\theta}}(\boldsymbol{z}_{i}, i) \Big]
```

One can also apply the BDIA technique to the EDM and DPM-Solver++ sampling procedures. Experiments on BDIA-EDM show that it outperforms EDM consistently over four pre-trained models in terms of FID scores. The details can be found out in the paper.

-------------------------------------------------------
#### News

Recently, the BDIA technique has also been sucessfully applied to design reversiable transformers.

Source code for round-trip image-editing has been added on Januray 8th, 2024. The code depends heavily on the original source code for EDICT.

Source code for text-to-image over StableDiffusion V2 has been added in October 2023.

-------------------------------------------------------

#### [Images generated by BDIA-DDIM and DDIM for text-to-image generation with 10 timesteps over StableDiffusion V2]

Note: For 10 timesteps, it is preferable to set $\gamma=0.5$ while for 40 timesteps, it is recommanded to set $\gamma=1.0$.

-------------------------------------------------------
#### [Demonstration of controlnet-based image editing by BDIA-DDIM with 10 timesteps]

-------------------------------------------------------
#### [Demonstration of text-based image editing by BDIA-DDIM using StableDiffusion]

### Citation

If you find our work useful in your research, please cite:

```
@MISC{guoqiang2024bdia,
title={Exact Diffusion Inversion via Bi-directional Integration Approximation},
author={G. Zhang and J. P. Lewis and W. B. Kleijn},
howpublished={ECCV},
year={2024}
}
```