{"id":29009901,"url":"https://github.com/tencentarc/brushedit","last_synced_at":"2025-06-25T15:33:39.930Z","repository":{"id":268414062,"uuid":"904060563","full_name":"TencentARC/BrushEdit","owner":"TencentARC","description":"[TPAMI under review] The official implementation of paper \"BrushEdit: All-In-One Image Inpainting and Editing\"","archived":false,"fork":false,"pushed_at":"2024-12-26T07:48:55.000Z","size":55917,"stargazers_count":547,"open_issues_count":10,"forks_count":26,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-11T16:50:41.768Z","etag":null,"topics":["diffusion-models","image-editing","image-inpainting"],"latest_commit_sha":null,"homepage":"https://liyaowei-stu.github.io/project/BrushEdit/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TencentARC.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-16T07:17:15.000Z","updated_at":"2025-04-11T13:26:08.000Z","dependencies_parsed_at":"2025-04-11T17:02:33.179Z","dependency_job_id":null,"html_url":"https://github.com/TencentARC/BrushEdit","commit_stats":null,"previous_names":["tencentarc/brushedit"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/TencentARC/BrushEdit","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FBrushEdit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FBrushEdit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Te
ncentARC%2FBrushEdit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FBrushEdit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TencentARC","download_url":"https://codeload.github.com/TencentARC/BrushEdit/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TencentARC%2FBrushEdit/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261901407,"owners_count":23227593,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["diffusion-models","image-editing","image-inpainting"],"created_at":"2025-06-25T15:33:38.775Z","updated_at":"2025-06-25T15:33:39.895Z","avatar_url":"https://github.com/TencentARC.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# BrushEdit\n\n😃 This repository contains the implementation of \"BrushEdit: All-In-One Image Inpainting and Editing\".\n\nKeywords: Image Inpainting, Image Generation, Image Editing, Diffusion Models, MLLM Agent, Instruction-based Editing\n\n\u003e TL;DR: BrushEdit is an advanced, unified AI agent for image inpainting and editing. 
\u003cbr\u003e\n\u003e Main Elements: 🛠️ Fully automated / 🤠 Interactive editing.\n\n\u003e[Yaowei Li](https://github.com/liyaowei-stu)\u003csup\u003e1*\u003c/sup\u003e, [Yuxuan Bian](https://yxbian23.github.io/)\u003csup\u003e3*\u003c/sup\u003e, [Xuan Ju](https://github.com/juxuan27)\u003csup\u003e3*\u003c/sup\u003e, [Zhaoyang Zhang](https://zzyfd.github.io/#/)\u003csup\u003e2‡\u003c/sup\u003e, [Junhao Zhuang](https://github.com/zhuang2002)\u003csup\u003e4\u003c/sup\u003e, [Ying Shan](https://www.linkedin.com/in/YingShanProfile/)\u003csup\u003e2✉\u003c/sup\u003e, [Yuexian Zou](https://www.ece.pku.edu.cn/info/1046/2146.htm)\u003csup\u003e1✉\u003c/sup\u003e\u003cbr\u003e, [Qiang Xu](https://cure-lab.github.io/)\u003csup\u003e3✉\u003c/sup\u003e\u003cbr\u003e\n\u003e\u003csup\u003e1\u003c/sup\u003ePeking University \u003csup\u003e2\u003c/sup\u003eARC Lab, Tencent PCG  \u003csup\u003e3\u003c/sup\u003eThe Chinese University of Hong Kong \u003csup\u003e4\u003c/sup\u003eTsinghua University \u003cbr\u003e \u003csup\u003e*\u003c/sup\u003eEqual Contribution \u003csup\u003e‡\u003c/sup\u003eProject Lead \u003csup\u003e✉\u003c/sup\u003eCorresponding Author\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://liyaowei-stu.github.io/project/BrushEdit/\"\u003e🌐Project Page\u003c/a\u003e |\n  \u003ca href=\"https://arxiv.org/abs/2412.10316\"\u003e📜Arxiv\u003c/a\u003e |\n  \u003ca href=\"https://www.youtube.com/watch?v=nDB7un9Rbdk\"\u003e📹Video\u003c/a\u003e |\n  \u003ca href=\"https://huggingface.co/spaces/TencentARC/BrushEdit\"\u003e🤗Hugging Face Demo\u003c/a\u003e |\n  \u003ca href=\"https://huggingface.co/TencentARC/BrushEdit\"\u003e🤗Hugging Model\u003c/a\u003e |\n\u003c/p\u003e\n\nhttps://github.com/user-attachments/assets/fde82f21-8b36-4584-8460-c109c195e614\n\n4K HD Introduction Video: [Youtube](https://www.youtube.com/watch?v=nDB7un9Rbdk).\n\n**📖 Table of Contents**\n\n- [BrushEdit](#brushedit)\n  - [TODO](#todo)\n  - [🛠️ Pipeline Overview](#️-pipeline-overview)\n  - 
[🚀 Getting Started](#-getting-started)\n    - [Environment Requirement 🌍](#environment-requirement-)\n    - [Download Checkpoints 💾](#download-checkpoints-)\n  - [🏃🏼 Running Scripts](#-running-scripts)\n    - [🤗 BrushEdit demo](#-brusheidt-demo)\n    - [👻 Demo Features](#-demo-features)\n  - [🤝🏼 Cite Us](#-cite-us)\n  - [💖 Acknowledgement](#-acknowledgement)\n  - [❓ Contact](#-contact)\n\n## TODO\n\n- [X] Release the code of BrushEdit. (MLLM-driven Agent for Image Editing and Inpainting)\n- [X] Release the paper and webpage. More info: [BrushEdit](https://liyaowei-stu.github.io/project/BrushEdit/)\n- [X] Release the BrushNetX checkpoint (a more powerful BrushNet).\n- [X] Release the Gradio demo.\n\n## 🛠️ Pipeline Overview\n\nBrushEdit consists of four main steps: (i) Editing category classification: determine the type of editing required. (ii) Primary editing object identification: identify the main object to be edited. (iii) Editing mask and target caption acquisition: generate the editing mask and the corresponding target caption. (iv) Image inpainting: perform the actual image editing. Steps (i) to (iii) use pre-trained MLLMs and detection models to determine the editing type, target object, editing mask, and target caption. Step (iv) performs the edit with an improved version of the dual-branch inpainting model BrushNet, which inpaints the target areas based on the target caption and editing mask, leveraging the generative potential and background preservation capabilities of inpainting models.\n\n![teaser](assets/brushedit_teaser.png)\n\n## 🚀 Getting Started\n\n### Environment Requirement 🌍\n\nBrushEdit has been implemented and tested on CUDA 11.8, PyTorch 2.0.1, and Python 3.10.6.\n\nClone the repo:\n\n```\ngit clone https://github.com/TencentARC/BrushEdit.git\n```\n\nWe recommend first creating a virtual environment with `conda` and then installing `pytorch` following the [official instructions](https://pytorch.org/). 
For example:\n\n```\nconda create -n brushedit python=3.10.6 -y\nconda activate brushedit\npython -m pip install --upgrade pip\npip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118\n```\n\nThen install diffusers (the version included in this repo) with:\n\n```\npip install -e .\n```\n\nAfter that, install the required packages with:\n\n```\npip install -r app/requirements.txt\n```\n\n### Download Checkpoints 💾\n\nCheckpoints of BrushEdit can be downloaded using the following command.\n\n```\nsh app/down_load_brushedit.sh\n```\n\n\n**The ckpt folder contains**\n\n- BrushNetX pretrained checkpoints for Stable Diffusion v1.5 (`brushnetX`)\n- Pretrained Stable Diffusion v1.5 checkpoint (e.g., realisticVisionV60B1_v51VAE from [Civitai](https://civitai.com/)). You can use `scripts/convert_original_stable_diffusion_to_diffusers.py` to process other models downloaded from Civitai.\n- Pretrained GroundingDINO checkpoint from the [official release](https://huggingface.co/ShilongLiu/GroundingDINO/resolve/main/groundingdino_swint_ogc.pth).\n- Pretrained SAM checkpoint from the [official release](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth).\n\n\n\nThe checkpoint structure should look like this:\n\n```\n|-- models\n    |-- base_model\n        |-- realisticVisionV60B1_v51VAE\n            |-- model_index.json\n            |-- vae\n            |-- ...\n        |-- dreamshaper_8\n            |-- ...\n        |-- epicrealism_naturalSinRC1VAE\n            |-- ...\n        |-- meinamix_meinaV11\n            |-- ...\n        |-- ...\n    |-- brushnetX\n        |-- config.json\n        |-- diffusion_pytorch_model.safetensors\n    |-- grounding_dino\n        |-- groundingdino_swint_ogc.pth\n    |-- sam\n        |-- sam_vit_h_4b8939.pth\n    |-- vlm\n        |-- llava-v1.6-mistral-7b-hf\n          |-- ...\n        |-- llava-v1.6-vicuna-13b-hf\n          |-- ...\n        |-- Qwen2-VL-7B-Instruct\n          |-- ...\n        |-- 
...\n      \n```\n\nWe provide five base diffusion models, including:\n\n- Dreamshaper_8 is a versatile model that can generate impressive portraits and landscape images.\n- Epicrealism_naturalSinRC1VAE is a realistic style model that excels at generating portraits.\n- HenmixReal_v5c is a model that specializes in generating realistic images of women.\n- Meinamix_meinaV11 is a model that excels at generating images in an animated style.\n- RealisticVisionV60B1_v51VAE is a highly generalized realistic style model. \n\nThe BrushNetX checkpoint is an enhanced version of BrushNet, trained on a more diverse dataset to improve its editing capabilities, such as deletion and replacement.\n\nWe provide two VLM models: Qwen2-VL-7B-Instruct and LLama3-LLaVA-next-8b-hf. **We strongly recommend using GPT-4o for reasoning.** After selecting gpt4-o as the VLM model, enter the API KEY and click the Submit and Verify button. If the output is success, you can use gpt4-o normally. As a second choice, we recommend the Qwen2-VL model.\n\nYou can also download more pretrained VLM models from [QwenVL](https://huggingface.co/collections/Qwen/qwen2-vl-66cee7455501d7126940800d) and [LLaVA-Next](https://huggingface.co/collections/llava-hf/llava-next-65f75c4afac77fd37dbbe6cf).\n\n\n## 🏃🏼 Running Scripts\n\n### 🤗 BrushEidt demo\n\nYou can run the demo using the script:\n\n```\nsh app/run_app.sh \n```\n\n### 👻 Demo Features\n\n\u003cimg src=\"assets/demo_vis.png\" alt=\"demo_vis\" width=\"auto\" height=\"500\"\u003e\n\n\n💡 \u003cb\u003eFundamental Features\u003c/b\u003e:\n\n\u003cul\u003e  \n    \u003cli\u003e 🎨 \u003cb\u003eAspect Ratio\u003c/b\u003e: Select the aspect ratio of the image. To prevent OOM, 1024px is the maximum resolution.\u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eVLM Model\u003c/b\u003e: Select the VLM model. We use preloaded models to save time. 
To use other VLM models, download them and uncomment the relevant lines in vlm_template.py from our GitHub repo. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eGenerate Mask\u003c/b\u003e: Generate a mask for the area that may need to be edited, based on the input instructions. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eSquare/Circle Mask\u003c/b\u003e: Generate square and circular masks from the existing mask. (A coarse-grained mask leaves more room for creative edits.) \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eInvert Mask\u003c/b\u003e: Invert the mask to generate a new mask. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eDilation/Erosion Mask\u003c/b\u003e: Expand or shrink the mask to include or exclude more areas. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eMove Mask\u003c/b\u003e: Move the mask to a new position. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eGenerate Target Prompt\u003c/b\u003e: Generate a target prompt based on the input instructions. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eTarget Prompt\u003c/b\u003e: Description of the masked area. It can be entered or modified manually when the content generated by the VLM does not meet expectations. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eBlending\u003c/b\u003e: Blend BrushNet's output with the original input to preserve the original image details in the unedited areas. (Turning it off works better when removing objects.) \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eControl length\u003c/b\u003e: The intensity of editing and inpainting. \u003c/li\u003e\n\u003c/ul\u003e\n\n💡 \u003cb\u003eAdvanced Features\u003c/b\u003e:\n\n\u003cul\u003e  \n    \u003cli\u003e 🎨 \u003cb\u003eBase Model\u003c/b\u003e: We use preloaded models to save time. To use other VLM models, download them and uncomment the relevant lines in vlm_template.py from our GitHub repo. 
\u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eBlending\u003c/b\u003e: Blend BrushNet's output with the original input to preserve the original image details in the unedited areas. (Turning it off works better when removing objects.) \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eControl length\u003c/b\u003e: The intensity of editing and inpainting. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eNum samples\u003c/b\u003e: The number of samples to generate. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eNegative prompt\u003c/b\u003e: The negative prompt for the classifier-free guidance. \u003c/li\u003e\n    \u003cli\u003e 🎨 \u003cb\u003eGuidance scale\u003c/b\u003e: The guidance scale for the classifier-free guidance. \u003c/li\u003e\n\u003c/ul\u003e\n\n## 🤝🏼 Cite Us\n\n```\n@misc{li2024brushedit,\n  title={BrushEdit: All-In-One Image Inpainting and Editing}, \n  author={Yaowei Li and Yuxuan Bian and Xuan Ju and Zhaoyang Zhang and Junhao Zhuang and Ying Shan and Yuexian Zou and Qiang Xu},\n  year={2024},\n  eprint={2412.10316},\n  archivePrefix={arXiv},\n  primaryClass={cs.CV}\n}\n\n\n```\n\n## 💖 Acknowledgement\nOur code is built on [diffusers](https://github.com/huggingface/diffusers) and [BrushNet](https://github.com/TencentARC/BrushNet). Thanks to all the contributors!\n\n\n## ❓ Contact\nFor any questions, feel free to email `liyaowei01@gmail.com`.\n\n## 🌟 Star History\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://star-history.com/#TencentARC/BrushEdit\" target=\"_blank\"\u003e\n        \u003cimg width=\"500\" src=\"https://api.star-history.com/svg?repos=TencentARC/BrushEdit\u0026type=Date\" alt=\"Star History Chart\"\u003e\n    
\u003c/a\u003e\n\u003c/p\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fbrushedit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftencentarc%2Fbrushedit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftencentarc%2Fbrushedit/lists"}