{"id":13488676,"url":"https://github.com/ShihaoZhaoZSH/Uni-ControlNet","last_synced_at":"2025-03-28T01:37:13.050Z","repository":{"id":168452235,"uuid":"644169044","full_name":"ShihaoZhaoZSH/Uni-ControlNet","owner":"ShihaoZhaoZSH","description":"[NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models","archived":false,"fork":false,"pushed_at":"2024-07-17T02:47:57.000Z","size":19400,"stargazers_count":562,"open_issues_count":16,"forks_count":41,"subscribers_count":13,"default_branch":"main","last_synced_at":"2024-08-01T18:39:14.149Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ShihaoZhaoZSH.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-23T01:02:36.000Z","updated_at":"2024-07-31T14:37:58.000Z","dependencies_parsed_at":"2023-09-23T02:06:22.892Z","dependency_job_id":"34976854-d325-4456-b7a7-21c3d126a409","html_url":"https://github.com/ShihaoZhaoZSH/Uni-ControlNet","commit_stats":null,"previous_names":["shihaozhaozsh/uni-controlnet"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShihaoZhaoZSH%2FUni-ControlNet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShihaoZhaoZSH%2FUni-ControlNet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShihaoZhaoZSH%2FUni-ControlNet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ShihaoZhaoZSH%2FUni-Con
trolNet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ShihaoZhaoZSH","download_url":"https://codeload.github.com/ShihaoZhaoZSH/Uni-ControlNet/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222333976,"owners_count":16968058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T18:01:19.921Z","updated_at":"2025-03-28T01:37:13.027Z","avatar_url":"https://github.com/ShihaoZhaoZSH.png","language":"Python","funding_links":[],"categories":["Additional conditions"],"sub_categories":[],"readme":"# [NeurIPS 2023] Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models\n\nOfficial implementation of Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models, which was accepted at NeurIPS 2023.\n\n### [Project Page](https://shihaozhaozsh.github.io/unicontrolnet/) | [Paper (ArXiv)](https://arxiv.org/abs/2305.16322) \n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/results.png\"\u003e\n\n\n## ⏳ : To Do\n- [x] Release training code\n- [x] Release test code\n- [x] Release pre-trained models\n\n## 💡 : Method\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/pipeline.png\"\u003e\n\u003c/div\u003e\n\nUni-ControlNet is a novel controllable diffusion model that allows for the simultaneous utilization of different local controls and global controls in a flexible and composable manner within one model. 
This is achieved with just two adapters, a local control adapter and a global control adapter, regardless of the number of local or global controls used. These two adapters can be trained separately without the need for joint training, while still supporting the composition of multiple control signals. \n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"600\" alt=\"image\" src=\"./figs/comparison.png\"\u003e\n\u003c/div\u003e\n\nThe figure above compares different controllable diffusion models. N is the number of conditions. Uni-ControlNet not only reduces the fine-tuning costs and model size as the number of control conditions grows, but also facilitates composability of different conditions.\n\n## ⚙ : Setup\nFirst, create a new conda environment:\n\n    conda env create -f environment.yaml\n    conda activate unicontrol\n\nThen download the [pretrained model](https://drive.google.com/file/d/1lagkiWUYFYbgeMTuJLxutpTW0HFuBchd/view?usp=sharing) ([or here](https://huggingface.co/shihaozhao/uni-controlnet/blob/main/uni.ckpt)) and put it in the `./ckpt/` folder. The model is built upon Stable Diffusion v1.5.\n\n## 💻 : Test\nYou can launch the gradio demo by:\n\n    python src/test/test.py\n    \nThis command will load the downloaded pretrained weights and start the App. We include seven example local conditions: Canny edge, MLSD edge, HED boundary, sketch, Openpose, Midas depth, segmentation mask, and one example global condition: content. \n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/demo_conditions.png\"\u003e\n\u003c/div\u003e\n\nFirst upload a source image, and the code automatically detects its sketch. Uni-ControlNet then generates samples that follow the sketch and the text prompt, which in this example is \"Robot spider, mars\". 
The results are shown at the bottom of the demo page, with generated images in the upper part and detected conditions in the lower part:\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/demo_results.png\"\u003e\n\u003c/div\u003e\n\nYou can further adjust the configuration in the panel:\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/demo_panel.png\"\u003e\n\u003c/div\u003e\n\nUni-ControlNet also handles multi-condition settings well. Here is an example of the combination of two local conditions: Canny edge of the Stormtrooper and the depth map of a forest. The prompt is set to \"Stormtrooper's lecture in the forest\" and here are the results:\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/demo_results2.png\"\u003e\n\u003c/div\u003e\n\nWith Uni-ControlNet, you can go even further and incorporate more conditions. For instance, you can provide the local conditions of a deer, a sofa, a forest, and the global condition of snow to create a scene that is unlikely to occur naturally. The prompt is set to \"A sofa and a deer in the forest\" and here are the results:\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg width=\"800\" alt=\"image\" src=\"./figs/demo_results3.png\"\u003e\n\u003c/div\u003e\n\n## ☕️ : Training\n\nYou should first download the pretrained weights of [Stable Diffusion](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.ckpt) and put them in the `./ckpt/` folder. 
Then, you can get the initial weights for training by:\n\n    python utils/prepare_weights.py init_local ckpt/v1-5-pruned.ckpt configs/local_v15.yaml ckpt/init_local.ckpt\n    \n    python utils/prepare_weights.py init_global ckpt/v1-5-pruned.ckpt configs/global_v15.yaml ckpt/init_global.ckpt\n\nThe four arguments are the mode, the pretrained SD weights, the model config, and the output path for the initial weights.\n\nTo prepare the training data, please ensure that they are placed in the `./data/` folder and organized in the following manner:\n\n```\ndata/\n├── anno.txt\n├── images/\n├── conditions/\n    ├── condition-1/\n    ├── condition-2/\n    ...\n...\n```\n\nSpecifically, you can utilize the condition detectors in `./annotator/` to extract the conditions. Then put the original images into the `./data/images/` folder and the extracted conditions into the `./data/conditions/condition-N/` folders. `./data/anno.txt` is the annotation file, where each line represents a training sample and is divided into two parts: 1) file ID and 2) annotation. Please ensure that the file IDs are consistent across `./data/anno.txt`, `./data/images/` and the `./data/conditions/condition-N/` directories.\n\nNow, you can train with your own data simply by:\n\n    python src/train/train.py\n\nKindly note that the local adapter and global adapter must be trained separately. Additionally, you can customize the training configurations in `./src/train/train.py` and `./configs/`. 
\n\nOnce both adapters have been trained separately, integrate them with:\n\n    python utils/prepare_weights.py integrate path1 path2 configs/uni_v15.yaml path3\n\nPath1 and path2 refer to the trained weights of SD with local and global adapters, respectively, while path3 denotes the output path for Uni-ControlNet.\n\n## 🎉 : Acknowledgments:\n\nThis repo is built upon [ControlNet](https://github.com/lllyasviel/ControlNet/tree/main). Many thanks to its authors for their great work!\n\n## 📖 : Citation\n\n```bibtex\n@article{zhao2023uni,\n  title={Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models},\n  author={Zhao, Shihao and Chen, Dongdong and Chen, Yen-Chun and Bao, Jianmin and Hao, Shaozhe and Yuan, Lu and Wong, Kwan-Yee~K.},\n  journal={Advances in Neural Information Processing Systems},\n  year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FShihaoZhaoZSH%2FUni-ControlNet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FShihaoZhaoZSH%2FUni-ControlNet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FShihaoZhaoZSH%2FUni-ControlNet/lists"}