{"id":17513463,"url":"https://github.com/zituitui/belm","last_synced_at":"2025-04-06T07:09:56.005Z","repository":{"id":258151912,"uuid":"871524030","full_name":"zituitui/BELM","owner":"zituitui","description":"[NeurIPS 2024] Official implementation of \"BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models\".","archived":false,"fork":false,"pushed_at":"2024-11-25T07:22:09.000Z","size":51061,"stargazers_count":122,"open_issues_count":7,"forks_count":7,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-30T06:03:57.398Z","etag":null,"topics":["diffusion-models","image-editing","neurips-2024","numerical-odes","text-to-image-generation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zituitui.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-12T07:45:16.000Z","updated_at":"2025-03-24T08:21:12.000Z","dependencies_parsed_at":"2024-10-17T21:17:12.087Z","dependency_job_id":"77589d2d-d556-4804-a5b3-d4f435483fe6","html_url":"https://github.com/zituitui/BELM","commit_stats":null,"previous_names":["zituitui/belm"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zituitui%2FBELM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zituitui%2FBELM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zituitui%2FBELM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/zituitui%2FBELM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zituitui","download_url":"https://codeload.github.com/zituitui/BELM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247445667,"owners_count":20939958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion-models","image-editing","neurips-2024","numerical-odes","text-to-image-generation"],"created_at":"2024-10-20T06:26:32.040Z","updated_at":"2025-04-06T07:09:55.981Z","avatar_url":"https://github.com/zituitui.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BELM: High-quality Exact Inversion sampler of Diffusion Models 🏆\n\n\u003cdiv align=\"center\"\u003e\n\nThis repository is the official implementation of the **NeurIPS 2024** paper:\n_\"BELM: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models\"_ \n\nKeywords: Diffusion Model, Exact Inversion, ODE Solver\n\n\u003e **Fangyikang Wang\u003csup\u003e1\u003c/sup\u003e, Hubery Yin\u003csup\u003e2\u003c/sup\u003e, Yuejiang Dong\u003csup\u003e3\u003c/sup\u003e, Huminhao Zhu\u003csup\u003e1\u003c/sup\u003e, \u003cbr\u003e Chao Zhang\u003csup\u003e1\u003c/sup\u003e, Hanbin Zhao\u003csup\u003e1\u003c/sup\u003e, Hui Qian\u003csup\u003e1\u003c/sup\u003e, Chen Li\u003csup\u003e2\u003c/sup\u003e**\n\u003e \n\u003e \u003csup\u003e1\u003c/sup\u003eZhejiang University \u003csup\u003e2\u003c/sup\u003eWeChat, Tencent Inc. 
\u003csup\u003e3\u003c/sup\u003eTsinghua University\n\n[![arXiv](https://img.shields.io/badge/arXiv%20paper-2410.07273-b31b1b.svg)](https://arxiv.org/abs/2410.07273)\u0026nbsp;\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\u0026nbsp;\n[![Zhihu](https://img.shields.io/badge/zhihu-%E7%9F%A5%E4%B9%8E-informational.svg)](https://zhuanlan.zhihu.com/p/1379396199)\u0026nbsp;\n[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fzituitui%2FBELM\u0026count_bg=%2379C83D\u0026title_bg=%23555555\u0026icon=\u0026icon_color=%23E7E7E7\u0026title=Visitors\u0026edge_flat=false)](https://hits.seeyoufarm.com)\n\n\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"assets/recon.jpg\" alt=\"Recon Results\" style=\"width:50%;\"/\u003e\n\u003c/p\u003e\n\n\n\u003c!-- \u003e Image Editing Results --\u003e\n\u003c!-- \u003e Images editing: --\u003e\n![Interpolation Results](assets/editing_show.drawio.png)\n\u003c!-- \u003cp align=\"center\"\u003e \n    \u003cimg src=\"assets/editing_show.drawio.png\" alt=\"Image Editing Results\" width=\"80%\"\u003e \n\u003cp\u003e --\u003e\n\n\u003c!-- \u003e Interpolation Results --\u003e\n![Interpolation Results](assets/belm_inter_show.drawio.png)\n\u003c!-- #### Image interpolation: --\u003e\n\u003c!-- \u003cp align=\"center\"\u003e \n    \u003cimg src=\"assets/belm_inter_show.drawio.png\" alt=\"Image Editing Results\" width=\"80%\"\u003e \n\u003cp\u003e --\u003e\n\u003c!-- ## Abstract\n\nThe inversion of diffusion model sampling, which aims to find the corresponding initial noise of a sample, plays a critical role in various tasks. Recently, several heuristic exact inversion samplers have been proposed to address the inexact inversion issue in a training-free manner. However, the theoretical properties of these heuristic samplers remain unknown and they often exhibit mediocre sampling quality. 
In this paper, we introduce a generic formulation, \\emph{Bidirectional Explicit Linear Multi-step} (BELM) samplers, of the exact inversion samplers, which includes all previously proposed heuristic exact inversion samplers as special cases. The BELM formulation is derived from the variable-stepsize-variable-formula linear multi-step method via integrating a bidirectional explicit constraint. We highlight this bidirectional explicit constraint is the key of mathematically exact inversion. We systematically investigate the Local Truncation Error (LTE) within the BELM framework and show that the existing heuristic designs of exact inversion samplers yield sub-optimal LTE. Consequently, we propose the Optimal BELM (O-BELM) sampler through the LTE minimization approach. We conduct additional analysis to substantiate the theoretical stability and global convergence property of the proposed optimal sampler. Comprehensive experiments demonstrate our O-BELM sampler establishes the exact inversion property while achieving high-quality sampling. Additional experiments in image editing and image interpolation highlight the extensive potential of applying O-BELM in varying applications.  --\u003e\n\n\n\n\n\n## 🆕 What's New?\n### 🔥 We use a bidirectional explicit constraint to enable exact inversion\n![Some edits](assets/belm_linear.drawio.png)\n\u003e **Schematic description** of DDIM (left) and BELM (right). DDIM uses $`\\mathbf{x}_i`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_i,i)`$ to calculate $`\\mathbf{x}_{i-1}`$ based on a linear relation between $`\\mathbf{x}_i`$, $`\\mathbf{x}_{i-1}`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_i,i)`$ (represented by the \u003cspan style=\"color:blue\"\u003eblue line\u003c/span\u003e). 
However, DDIM inversion uses $`\\mathbf{x}_{i-1}`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i-1},i-1)`$ to calculate $`\\mathbf{x}_{i}`$ based on a different linear relation represented by the \u003cspan style=\"color:red\"\u003ered line\u003c/span\u003e. This mismatch leads to the inexact inversion of DDIM. In contrast, BELM seeks to establish a linear relation between $`\\mathbf{x}_{i-1}`$, $`\\mathbf{x}_i`$, $`\\mathbf{x}_{i+1}`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i}, i)`$ (represented by the \u003cspan style=\"color:green\"\u003egreen line\u003c/span\u003e). BELM and its inversion are derived from this unitary relation, which facilitates the exact inversion. Specifically, BELM uses the linear combination of $`\\mathbf{x}_i`$, $`\\mathbf{x}_{i+1}`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i},i)`$ to calculate $`\\mathbf{x}_{i-1}`$, and the BELM inversion uses the linear combination of $`\\mathbf{x}_{i-1}`$, $`\\mathbf{x}_i`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i},i)`$ to calculate $`\\mathbf{x}_{i+1}`$. 
The bidirectional explicit constraint means this linear relation does not include the derivatives at the bidirectional endpoints, that is, $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i-1},i-1)`$ and $`\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_{i+1},i+1)`$.\n\n### 🔥 We introduce a generic formulation of the exact inversion samplers, BELM.\n\u003c!-- ![Some edits](assets/belm.jpg)\n![Some edits](assets/2-belm.jpg) --\u003e\nThe general k-step BELM:\n```math\n\\bar{\\mathbf{x}}_{i-1} = \\sum_{j=1}^{k} a_{i,j}\\cdot \\bar{\\mathbf{x}}_{i-1+j} +\\sum_{j=1}^{k-1}b_{i,j}\\cdot h_{i-1+j}\\cdot\\bar{\\boldsymbol{\\varepsilon}}_\\theta(\\bar{\\mathbf{x}}_{i-1+j},\\bar{\\sigma}_{i-1+j}).\n```\n\n\nThe 2-step BELM:\n```math\n\\bar{\\mathbf{x}}_{i-1} = a_{i,2}\\bar{\\mathbf{x}}_{i+1} +a_{i,1}\\bar{\\mathbf{x}}_{i} + b_{i,1} h_i\\bar{\\boldsymbol{\\varepsilon}}_\\theta(\\bar{\\mathbf{x}}_i,\\bar{\\sigma}_i).\n```\n\n### 🔥 We derive the optimal coefficients for BELM via LTE minimization.\n\u003c!-- ![Some edits](assets/o-belm.jpg) --\u003e\n\n\u003cdiv style=\"background-color: #f0f0f0; padding: 10px; border-radius: 5px;\"\u003e\n\n\u003e **Proposition**  The LTE $`\\tau_i`$ of the BELM diffusion sampler, which is given by $`\\tau_i = \\bar{\\mathbf{x}}(t_{i-1}) - a_{i,2}\\bar{\\mathbf{x}}(t_{i+1}) -a_{i,1}\\bar{\\mathbf{x}}(t_{i}) - b_{i,1} h_i\\bar{\\boldsymbol{\\varepsilon}}_\\theta(\\bar{\\mathbf{x}}(t_i),\\bar{\\sigma}_i)`$, can be accurate up to $`\\mathcal{O}\\left({(h_{i}+h_{i+1})}^3\\right)`$ when the coefficients are chosen as $`a_{i,1} = \\frac{h_{i+1}^2 - h_i^2}{h_{i+1}^2}`$, $`a_{i,2}=\\frac{h_i^2}{h_{i+1}^2}`$, $`b_{i,1}=- \\frac{h_i+h_{i+1}}{h_{i+1}} `$.\n\n\u003c/div\u003e\n\nHere $`h_i = \\frac{\\sigma_i}{\\alpha_i}-\\frac{\\sigma_{i-1}}{\\alpha_{i-1}}`$.\n\nThe Optimal-BELM (O-BELM) sampler:\n\n```math\n\\mathbf{x}_{i-1} = \\frac{h_i^2}{h_{i+1}^2}\\frac{\\alpha_{i-1}}{\\alpha_{i+1}}\\mathbf{x}_{i+1} +\\frac{h_{i+1}^2 - 
h_i^2}{h_{i+1}^2}\\frac{\\alpha_{i-1}}{\\alpha_{i}}\\mathbf{x}_{i} - \\frac{h_i(h_i+h_{i+1})}{h_{i+1}}\\alpha_{i-1}\\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_i,i).\n```\n\nThe inversion of the O-BELM diffusion sampler reads:\n\n```math\n\\mathbf{x}_{i+1}= \\frac{h_{i+1}^2}{h_i^2}\\frac{\\alpha_{i+1}}{\\alpha_{i-1}}\\mathbf{x}_{i-1} + \\frac{h_i^2-h_{i+1}^2}{h_i^2}\\frac{\\alpha_{i+1}}{\\alpha_{i}}\\mathbf{x}_{i}+\\frac{h_{i+1}(h_i+h_{i+1})}{h_i}\\alpha_{i+1} \\boldsymbol{\\varepsilon}_\\theta(\\mathbf{x}_i,i).\n```\n\n## 👨🏻‍💻 Run the code \n\n### 1) Get started\n\n* Python 3.8.12\n* CUDA 11.7\n* NVIDIA A100 40GB PCIe\n* Torch 2.0.0\n* Torchvision 0.14.0\n\nPlease follow the **[diffusers](https://github.com/huggingface/diffusers)** instructions to install diffusers.\n\n### 2) Run\nFirst, switch to the repository root directory.\n#### CIFAR10 sampling\n```shell\npython3 ./scripts/cifar10.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10\n```\n\n#### CelebA-HQ sampling\n```shell\npython3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10\n```\n\n#### FID evaluation\n```shell\npython3 ./scripts/celeba.py --test_num 10 --batch_size 32 --num_inference_steps 100 --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxx/ddpm_ema_cifar10\n```\n\n#### Interpolation\n```shell\npython3 ./scripts/interpolate.py --test_num 10 --batch_size 1 --num_inference_steps 100  --save_dir YOUR/SAVE/DIR --model_id xx\n```\n\n#### Reconstruction error calculation\n```shell\npython3 ./scripts/reconstruction.py --test_num 10 --num_inference_steps 100  --directory WHERE/YOUR/IMAGES/ARE --sampler_type belm\n```\n\n#### Image editing\n```shell\npython3 ./scripts/image_editing.py --num_inference_steps 200 --freeze_step 50 --guidance 2.0  --sampler_type belm --save_dir YOUR/SAVE/DIR --model_id xxxxx/stable-diffusion-v1-5 --ori_im_path 
images/imagenet_dog_1.jpg --ori_prompt 'A dog' --res_prompt 'A Dalmatian'\n```\n\n\n## 🪪 License\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n\n## 📝 Citation\nIf our work assists your research, feel free to give us a star ⭐ or cite us using:\n```\n@inproceedings{\nwang2024belm,\ntitle={{BELM}: Bidirectional Explicit Linear Multi-step Sampler for Exact Inversion in Diffusion Models},\nauthor={Fangyikang Wang and Hubery Yin and Yue-Jiang Dong and Huminhao Zhu and Chao Zhang and Hanbin Zhao and Hui Qian and Chen Li},\nbooktitle={The Thirty-eighth Annual Conference on Neural Information Processing Systems},\nyear={2024},\nurl={https://openreview.net/forum?id=ccQ4fmwLDb}\n}\n```\n\n## 📩 Contact me\nMy e-mail address:\n```\nwangfangyikang@zju.edu.cn\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzituitui%2Fbelm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzituitui%2Fbelm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzituitui%2Fbelm/lists"}