{"id":13487822,"url":"https://github.com/Yujun-Shi/DragDiffusion","last_synced_at":"2025-03-27T23:31:48.117Z","repository":{"id":180063969,"uuid":"663801546","full_name":"Yujun-Shi/DragDiffusion","owner":"Yujun-Shi","description":"[CVPR2024, Highlight] Official code for DragDiffusion","archived":false,"fork":false,"pushed_at":"2024-01-29T13:25:43.000Z","size":26534,"stargazers_count":1202,"open_issues_count":36,"forks_count":92,"subscribers_count":26,"default_branch":"main","last_synced_at":"2025-03-26T06:05:44.153Z","etag":null,"topics":["artificial-intelligence","cvpr2024","diffusion-models","dragdiffusion","draggan","image-editing"],"latest_commit_sha":null,"homepage":"https://yujun-shi.github.io/projects/dragdiffusion.html","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Yujun-Shi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-07-08T06:11:13.000Z","updated_at":"2025-03-25T02:16:16.000Z","dependencies_parsed_at":null,"dependency_job_id":"eb729924-9af2-4e41-a188-70f1bae8c124","html_url":"https://github.com/Yujun-Shi/DragDiffusion","commit_stats":null,"previous_names":["yujun-shi/dragdiffusion"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FDragDiffusion","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FDragDiffusion/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FDragDiffusion/releases","manifests_url":"https://repos.eco
syste.ms/api/v1/hosts/GitHub/repositories/Yujun-Shi%2FDragDiffusion/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Yujun-Shi","download_url":"https://codeload.github.com/Yujun-Shi/DragDiffusion/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245944020,"owners_count":20697945,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","cvpr2024","diffusion-models","dragdiffusion","draggan","image-editing"],"created_at":"2024-07-31T18:01:04.575Z","updated_at":"2025-03-27T23:31:43.102Z","avatar_url":"https://github.com/Yujun-Shi.png","language":"Python","funding_links":[],"categories":["Personalized Restoration","图像生成","⭐ Community Favorites \u0026 Top Repos"],"sub_categories":["资源传输下载","🎭 Specialized Extensions"],"readme":"\u003cp align=\"center\"\u003e\n  \u003ch1 align=\"center\"\u003eDragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing\u003c/h1\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://yujun-shi.github.io/\"\u003e\u003cstrong\u003eYujun Shi\u003c/strong\u003e\u003c/a\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003cstrong\u003eChuhui Xue\u003c/strong\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003cstrong\u003eJun Hao Liew\u003c/strong\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003cstrong\u003eJiachun Pan\u003c/strong\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003cbr\u003e\n    \u003cstrong\u003eHanshu Yan\u003c/strong\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003cstrong\u003eWenqing Zhang\u003c/strong\u003e\n    
\u0026nbsp;\u0026nbsp;\n    \u003ca href=\"https://vyftan.github.io/\"\u003e\u003cstrong\u003eVincent Y. F. Tan\u003c/strong\u003e\u003c/a\u003e\n    \u0026nbsp;\u0026nbsp;\n    \u003ca href=\"https://songbai.site/\"\u003e\u003cstrong\u003eSong Bai\u003c/strong\u003e\u003c/a\u003e\n  \u003c/p\u003e\n  \u003cbr\u003e\n  \u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./release-doc/asset/counterfeit-1.png\", width=\"700\"\u003e\n    \u003cimg src=\"./release-doc/asset/counterfeit-2.png\", width=\"700\"\u003e\n    \u003cimg src=\"./release-doc/asset/majix_realistic.png\", width=\"700\"\u003e\n  \u003c/div\u003e\n  \u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"./release-doc/asset/github_video.gif\", width=\"700\"\u003e\n  \u003c/div\u003e\n  \u003cp align=\"center\"\u003e\n    \u003ca href=\"https://arxiv.org/abs/2306.14435\"\u003e\u003cimg alt='arXiv' src=\"https://img.shields.io/badge/arXiv-2306.14435-b31b1b.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://yujun-shi.github.io/projects/dragdiffusion.html\"\u003e\u003cimg alt='page' src=\"https://img.shields.io/badge/Project-Website-orange\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://twitter.com/YujunPeiyangShi\"\u003e\u003cimg alt='Twitter' src=\"https://img.shields.io/twitter/follow/YujunPeiyangShi?label=%40YujunPeiyangShi\"\u003e\u003c/a\u003e\n  \u003c/p\u003e\n  \u003cbr\u003e\n\u003c/p\u003e\n\n## Disclaimer\nThis is a research project, NOT a commercial product. Users are granted the freedom to create images using this tool, but they are expected to comply with local laws and utilize it in a responsible manner. The developers do not assume any responsibility for potential misuse by users.\n\n## News and Update\n* [Jan 29th] Update to support diffusers==0.24.0!\n* [Oct 23rd] Code and data of DragBench are released! 
Please check the README under \"drag_bench_evaluation\" for details.\n* [Oct 16th] Integrate [FreeU](https://chenyangsi.top/FreeU/) when dragging generated images.\n* [Oct 3rd] Speeding up LoRA training when editing real images. (**Now only around 20s on A100!**)\n* [Sept 3rd] v0.1.0 Release.\n  * Enable **Dragging Diffusion-Generated Images.**\n  * Introducing a new guidance mechanism that **greatly improves the quality of dragging results.** (Inspired by [MasaCtrl](https://ljzycmd.github.io/projects/MasaCtrl/))\n  * Enable Dragging Images with arbitrary aspect ratio\n  * Adding support for DPM++Solver (Generated Images)\n* [July 18th] v0.0.1 Release.\n  * Integrate LoRA training into the User Interface. **No need to use a training script; everything can be conveniently done in the UI!**\n  * Optimize User Interface layout.\n  * Enable using a better VAE for eyes and faces (See [this](https://stable-diffusion-art.com/how-to-use-vae/))\n* [July 8th] v0.0.0 Release.\n  * Implement the basic functionality of DragDiffusion\n\n## Installation\n\nWe recommend running our code on an NVIDIA GPU under Linux. We have not yet tested other configurations. Currently, our method requires around 14 GB of GPU memory. 
We will continue to optimize memory efficiency.\n\nTo install the required libraries, run the following commands:\n```\nconda env create -f environment.yaml\nconda activate dragdiff\n```\n\n## Run DragDiffusion\nTo begin, run the following on the command line to launch the Gradio user interface:\n```\npython3 drag_ui.py\n```\n\nYou may check our [GIF above](https://github.com/Yujun-Shi/DragDiffusion/blob/main/release-doc/asset/github_video.gif), which demonstrates how to use the UI step by step.\n\nThe workflow consists of the following steps:\n\n### Case 1: Dragging Input Real Images\n#### 1) train a LoRA\n* Drop your input image into the left-most box.\n* Input a prompt describing the image in the \"prompt\" field.\n* Click the \"Train LoRA\" button to train a LoRA on the input image.\n\n#### 2) do \"drag\" editing\n* Draw a mask in the left-most box to specify the editable areas.\n* Click handle and target points in the middle box. You may reset all points by clicking \"Undo point\".\n* Click the \"Run\" button to run our algorithm. Edited results will be displayed in the right-most box.\n\n### Case 2: Dragging Diffusion-Generated Images\n#### 1) generate an image\n* Fill in the generation parameters (e.g., positive/negative prompt, parameters under Generation Config \u0026 FreeU Parameters).\n* Click \"Generate Image\".\n\n#### 2) do \"drag\" on the generated image\n* Draw a mask in the left-most box to specify the editable areas.\n* Click handle points and target points in the middle box.\n* Click the \"Run\" button to run our algorithm. 
Edited results will be displayed in the right-most box.\n\n\n\u003c!---\n## Explanation for parameters in the user interface:\n#### General Parameters\n|Parameter|Explanation|\n|-----|------|\n|prompt|The prompt describing the user input image (this is used to train the LoRA and to conduct \"drag\" editing).|\n|lora_path|The directory where the trained LoRA will be saved.|\n\n\n#### Algorithm Parameters\nThese parameters are collapsed by default because they normally do not need tuning. Here are the explanations:\n* Base Model Config\n\n|Parameter|Explanation|\n|-----|------|\n|Diffusion Model Path|The path to the diffusion model. By default we use \"runwayml/stable-diffusion-v1-5\". We will add support for more models in the future.|\n|VAE Choice|The choice of VAE. There are currently two options: \"default\", which uses the original VAE, and \"stabilityai/sd-vae-ft-mse\", which can improve results on images with human eyes and faces (see [explanation](https://stable-diffusion-art.com/how-to-use-vae/)).|\n\n* Drag Parameters\n\n|Parameter|Explanation|\n|-----|------|\n|n_pix_step|Maximum number of motion supervision steps. **Increase this if handle points have not been \"dragged\" to the desired position.**|\n|lam|The regularization coefficient that keeps the unmasked region unchanged. 
Increase this value if the unmasked region changes more than desired (usually does not need tuning).|\n|n_actual_inference_step|Number of DDIM inversion steps performed (usually does not need tuning).|\n\n* LoRA Parameters\n\n|Parameter|Explanation|\n|-----|------|\n|LoRA training steps|Number of LoRA training steps (usually does not need tuning).|\n|LoRA learning rate|Learning rate of the LoRA (usually does not need tuning).|\n|LoRA rank|Rank of the LoRA (usually does not need tuning).|\n\n---\u003e\n\n## License\nCode related to the DragDiffusion algorithm is released under the Apache 2.0 license.\n\n\n## BibTeX\nIf you find our repo helpful, please consider leaving a star or citing our paper :)\n```bibtex\n@article{shi2023dragdiffusion,\n  title={DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing},\n  author={Shi, Yujun and Xue, Chuhui and Pan, Jiachun and Zhang, Wenqing and Tan, Vincent YF and Bai, Song},\n  journal={arXiv preprint arXiv:2306.14435},\n  year={2023}\n}\n```\n\n## Contact\nFor any questions on this project, please contact [Yujun](https://yujun-shi.github.io/) (shi.yujun@u.nus.edu).\n\n## Acknowledgement\nThis work is inspired by the amazing [DragGAN](https://vcai.mpi-inf.mpg.de/projects/DragGAN/). The LoRA training code is modified from a diffusers [example](https://github.com/huggingface/diffusers/blob/v0.17.1/examples/dreambooth/train_dreambooth_lora.py). Image samples are collected from [Unsplash](https://unsplash.com/), [Pexels](https://www.pexels.com/zh-cn/), and [Pixabay](https://pixabay.com/). 
Finally, a huge shout-out to all the amazing open source diffusion models and libraries.\n\n## Related Links\n* [Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold](https://vcai.mpi-inf.mpg.de/projects/DragGAN/)\n* [MasaCtrl: Tuning-free Mutual Self-Attention Control for Consistent Image Synthesis and Editing](https://ljzycmd.github.io/projects/MasaCtrl/)\n* [Emergent Correspondence from Image Diffusion](https://diffusionfeatures.github.io/)\n* [DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models](https://mc-e.github.io/project/DragonDiffusion/)\n* [FreeDrag: Point Tracking is Not You Need for Interactive Point-based Image Editing](https://lin-chen.site/projects/freedrag/)\n\n\n## Common Issues and Solutions\n1) For users who have trouble loading models from Hugging Face due to network restrictions: (a) follow this [link](https://zhuanlan.zhihu.com/p/475260268) and download the model into the directory \"local\\_pretrained\\_models\"; (b) run \"drag\\_ui.py\" and select the directory of your pretrained model in \"Algorithm Parameters -\u003e Base Model Config -\u003e Diffusion Model Path\".\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FYujun-Shi%2FDragDiffusion","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FYujun-Shi%2FDragDiffusion","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FYujun-Shi%2FDragDiffusion/lists"}