Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/wangjiangshan0725/RF-Solver-Edit
Taming FLUX for Image Inversion & Editing; OpenSora for Video Inversion & Editing! (Official implementation for Taming Rectified Flow for Inversion and Editing.)
diffusion-transformer flux image-editing image-inversion opensora rectified-flow video-editing video-inversion
- Host: GitHub
- URL: https://github.com/wangjiangshan0725/RF-Solver-Edit
- Owner: wangjiangshan0725
- License: apache-2.0
- Created: 2024-11-05T07:48:09.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-12-05T08:17:43.000Z (about 1 month ago)
- Last Synced: 2024-12-05T09:26:36.038Z (about 1 month ago)
- Topics: diffusion-transformer, flux, image-editing, image-inversion, opensora, rectified-flow, video-editing, video-inversion
- Language: Python
- Homepage: https://rf-solver-edit.github.io/
- Size: 36 MB
- Stars: 282
- Watchers: 10
- Forks: 7
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-diffusion-categorized - [Code
README
# Taming Rectified Flow for Inversion and Editing

[Jiangshan Wang](https://scholar.google.com/citations?user=HoKoCv0AAAAJ&hl=en)<sup>1,2</sup>, [Junfu Pu](https://pujunfu.github.io/)<sup>2</sup>, [Zhongang Qi](https://scholar.google.com/citations?hl=en&user=zJvrrusAAAAJ&view_op=list_works&sortby=pubdate)<sup>2</sup>, [Jiayi Guo](https://www.jiayiguo.net)<sup>1</sup>, [Yue Ma](https://mayuelala.github.io/)<sup>3</sup>, [Nisha Huang](https://scholar.google.com/citations?user=wTmPkSsAAAAJ&hl=en)<sup>1</sup>, [Yuxin Chen](https://scholar.google.com/citations?hl=en&user=dEm4OKAAAAAJ)<sup>2</sup>, [Xiu Li](https://scholar.google.com/citations?user=Xrh1OIUAAAAJ&hl=en&oi=ao)<sup>1</sup>, [Ying Shan](https://scholar.google.com/citations?hl=en&user=4oXBp9UAAAAJ&view_op=list_works&sortby=pubdate)<sup>2</sup>

<sup>1</sup> Tsinghua University, <sup>2</sup> Tencent ARC Lab, <sup>3</sup> HKUST
[![arXiv](https://img.shields.io/badge/arXiv-RFSolverEdit-b31b1b.svg)](https://arxiv.org/abs/2411.04746)
[![Huggingface space](https://img.shields.io/badge/🤗-Huggingface%20Space-orange.svg)](https://huggingface.co/spaces/wjs0725/RF-Solver-Edit)
[![ComfyUI](https://img.shields.io/badge/ComfyUI-Demo-blue.svg)](https://github.com/logtd/ComfyUI-Fluxtapoz)
We propose RF-Solver to solve the rectified flow ODE with less error, thus enhancing both sampling quality and inversion-reconstruction accuracy for rectified-flow-based generative models. Furthermore, we propose RF-Edit to leverage the RF-Solver for image and video editing tasks. Our methods achieve impressive performance on various tasks, including text-to-image generation, image/video inversion, and image/video editing.
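To make the "less error" claim concrete, here is a minimal sketch comparing a first-order (Euler) update with a second-order Taylor update on a toy ODE. It is an illustration only, not the code in this repository: `toy_velocity` stands in for the learned velocity network, the time derivative of the velocity is estimated with a half-step finite difference (an assumption), and all function names are hypothetical.

```python
import math
from typing import Callable

Velocity = Callable[[float, float], float]


def euler_step(v: Velocity, z: float, t: float, dt: float) -> float:
    """First-order update: z + dt * v(z, t)."""
    return z + dt * v(z, t)


def second_order_step(v: Velocity, z: float, t: float, dt: float) -> float:
    """Second-order Taylor update: z + dt * v + 0.5 * dt^2 * dv/dt.

    dv/dt along the trajectory is estimated from a half step (an assumption
    of this sketch). With this estimate the update is algebraically the
    same as the midpoint rule.
    """
    v_now = v(z, t)
    half = 0.5 * dt
    v_mid = v(z + half * v_now, t + half)
    dv_dt = (v_mid - v_now) / half
    return z + dt * v_now + 0.5 * dt * dt * dv_dt


def integrate(step, v: Velocity, z0: float, t0: float, t1: float, num_steps: int) -> float:
    """Apply `step` repeatedly to move z from time t0 to time t1."""
    dt = (t1 - t0) / num_steps
    z, t = z0, t0
    for _ in range(num_steps):
        z = step(v, z, t, dt)
        t += dt
    return z


def toy_velocity(z: float, t: float) -> float:
    """Toy velocity field dz/dt = -z, whose exact solution is z0 * exp(-t)."""
    return -z


exact = math.exp(-1.0)
euler_error = abs(integrate(euler_step, toy_velocity, 1.0, 0.0, 1.0, 10) - exact)
second_order_error = abs(integrate(second_order_step, toy_velocity, 1.0, 0.0, 1.0, 10) - exact)
print(f"Euler error:        {euler_error:.6f}")
print(f"Second-order error: {second_order_error:.6f}")
```

With the same number of steps, the second-order update lands closer to the exact solution, at the cost of one extra velocity evaluation per step.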
# 🔥 News
- [2024.11.30] Our demo is available on 🤗 [Huggingface Space](https://huggingface.co/spaces/wjs0725/RF-Solver-Edit)!
- [2024.11.18] More examples for style transfer are available!
- [2024.11.18] Gradio Demo for image editing is available!
- [2024.11.16] Thanks to @[logtd](https://github.com/logtd) for integrating RF-Solver into [ComfyUI](https://github.com/logtd/ComfyUI-Fluxtapoz)!
- [2024.11.11] The [homepage](https://rf-solver-edit.github.io/) of the project is available!
- [2024.11.08] Code for image editing is released!
- [2024.11.08] Paper released!

# 👨‍💻 ToDo
- ☑️ Release the gradio demo
- ☑️ Release scripts for more image editing cases
- ☐ Release the code for video editing

# 📖 Method
## RF-Solver
We derive the exact form of the solution to the rectified flow ODE. The non-linear part of this solution is approximated with a Taylor expansion. Using a higher-order expansion significantly reduces the approximation error, which improves both text-to-image sampling and image/video inversion.

## RF-Edit
Based on RF-Solver, we further propose RF-Edit for image and video editing. RF-Edit reuses features from the inversion process during denoising, which enables high-quality editing while preserving the structural information of the source image or video. RF-Edit contains two sub-modules, one for image editing and one for video editing.

# 🛠️ Code Setup
Our code uses the same environment as FLUX. You can follow the [official repo](https://github.com/black-forest-labs/flux/tree/main) of FLUX, or run the following commands to set up the environment.
```
conda create --name RF-Solver-Edit python=3.10
conda activate RF-Solver-Edit
pip install -e ".[all]"
```
# 🚀 Examples for Image Editing
We provide several scripts to reproduce the results in the paper, covering three types of editing: stylization, adding, and replacing. We suggest running the experiments on a single A100 GPU.

## Stylization
- Ref Style: (images)
- Editing Scripts: Trump, Marilyn Monroe, Einstein
- Edited image: (images)

- Editing Scripts: Biden, Batman, Harry Potter
- Edited image: (images)
## Adding & Replacing
- Source image: (images)
- Editing Scripts: + hiking stick, horse -> camel, + dog
- Edited image: (images)
# 🪄 Edit Your Own Image
## Gradio Demo
We provide a gradio demo for image editing, which is also available on our 🤗 [Huggingface Space](https://huggingface.co/spaces/wjs0725/RF-Solver-Edit)! You can also run the demo on your own device using the following command:
```
cd src
python gradio_demo.py
```
Here is an example of using the gradio demo to edit an image! Note that "Number of inject steps" is the number of feature-sharing steps in RF-Edit, which strongly affects the quality of the edited result. We suggest tuning this parameter and selecting the result with the best visual quality.
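To picture what this setting controls, here is a toy sketch of step-limited feature sharing. It is not the repository's implementation: `toy_feature`, `invert`, and `denoise` are hypothetical stand-ins, features are plain floats rather than network activations, and the sketch assumes sharing applies to the first `inject` denoising steps.

```python
from typing import Dict, List


def toy_feature(latent: float, t: int) -> float:
    """Stand-in for a feature computed by the network at timestep t."""
    return latent * (t + 1)


def invert(source: float, num_steps: int) -> Dict[int, float]:
    """Toy inversion pass (image -> noise) that caches one feature per timestep."""
    cache: Dict[int, float] = {}
    latent = source
    for t in range(num_steps):
        cache[t] = toy_feature(latent, t)
        latent += 0.1  # stand-in for the solver update
    return cache


def denoise(noise: float, num_steps: int, inject: int, cache: Dict[int, float]) -> List[str]:
    """Toy denoising pass (noise -> image) that records where each feature came from."""
    latent = noise
    sources: List[str] = []
    for i, t in enumerate(reversed(range(num_steps))):
        if i < inject:
            feature = cache[t]  # reuse the feature stored during inversion
            sources.append("shared")
        else:
            feature = toy_feature(latent, t)  # compute a fresh feature
            sources.append("own")
        latent -= 0.01 * feature  # stand-in for the solver update
    return sources


cache = invert(source=0.0, num_steps=30)
sources = denoise(noise=1.0, num_steps=30, inject=5, cache=cache)
print(sources.count("shared"), "shared steps,", sources.count("own"), "own steps")
```

In this picture, a larger value keeps more of the source structure, while a smaller value leaves more freedom for the target prompt, which is why the setting is worth tuning per image.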
## Command Line
You can also run the following scripts to edit your own image.
```
cd src
python edit.py --source_prompt [describe the content of your image or leave it as null] \
--target_prompt [describe your editing requirements] \
--guidance 2 \
--source_img_dir [the path of your source image] \
--num_steps 30 \
--inject [typically set to a number between 2 to 8] \
--name 'flux-dev' --offload \
--output_dir [output path]
```
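Since the suggested range for `--inject` is 2 to 8, one way to compare settings is to generate one command per value. The sketch below only builds the command lines from the flags shown above and does not run them. The image path, prompts, and output folder are placeholders, and writing each run to its own `inject_<n>` subfolder is an assumption of this sketch.

```python
import shlex
from typing import Iterable, List


def build_edit_commands(
    source_img: str,
    source_prompt: str,
    target_prompt: str,
    output_root: str,
    inject_values: Iterable[int] = range(2, 9),
) -> List[List[str]]:
    """Build one `python edit.py ...` argument list per --inject value."""
    commands = []
    for inject in inject_values:
        commands.append([
            "python", "edit.py",
            "--source_prompt", source_prompt,
            "--target_prompt", target_prompt,
            "--guidance", "2",
            "--source_img_dir", source_img,
            "--num_steps", "30",
            "--inject", str(inject),
            "--name", "flux-dev", "--offload",
            # Hypothetical layout: one output folder per setting.
            "--output_dir", f"{output_root}/inject_{inject}",
        ])
    return commands


# Placeholder inputs; replace them with your own image and prompts.
commands = build_edit_commands(
    source_img="my_image.jpg",
    source_prompt="a photo of a horse",
    target_prompt="a photo of a camel",
    output_root="outputs",
)
for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be run from the `src` directory, or passed to `subprocess.run(cmd, check=True)` to execute the sweep from Python.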
As in the gradio demo, `--inject` sets the number of feature-sharing steps in RF-Edit, which strongly affects editing quality.

# 🖼️ Gallery
## Inversion and Reconstruction
## Image Stylization
## Image Editing
## Video Editing
# 🖋️ Citation
If you find our work helpful, please **star 🌟** this repo and **cite 📑** our paper. Thanks for your support!
```
@article{wang2024taming,
title={Taming Rectified Flow for Inversion and Editing},
author={Wang, Jiangshan and Pu, Junfu and Qi, Zhongang and Guo, Jiayi and Ma, Yue and Huang, Nisha and Chen, Yuxin and Li, Xiu and Shan, Ying},
journal={arXiv preprint arXiv:2411.04746},
year={2024}
}
```

# Acknowledgements
We thank [FLUX](https://github.com/black-forest-labs/flux/tree/main) for their clean codebase.

# Contact
The code in this repository is still being reorganized. Errors that may arise during the organizing process could lead to code malfunctions or discrepancies from the original research results. If you have any questions or concerns, please send emails to [email protected].