{"id":13488454,"url":"https://github.com/microsoft/ReCo","last_synced_at":"2025-03-28T01:35:30.787Z","repository":{"id":175444891,"uuid":"626159932","full_name":"microsoft/ReCo","owner":"microsoft","description":"ReCo: Region-Controlled Text-to-Image Generation, CVPR 2023","archived":false,"fork":false,"pushed_at":"2023-11-08T00:48:30.000Z","size":3680,"stargazers_count":112,"open_issues_count":11,"forks_count":8,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-08-01T18:37:38.042Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/microsoft.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":null,"authors":null}},"created_at":"2023-04-10T23:25:54.000Z","updated_at":"2024-07-23T15:50:41.000Z","dependencies_parsed_at":"2024-01-16T09:02:58.883Z","dependency_job_id":"0d782b9c-c09b-4045-8a7a-82808300a31f","html_url":"https://github.com/microsoft/ReCo","commit_stats":null,"previous_names":["microsoft/reco"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FReCo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FReCo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FReCo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FReCo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/microsoft","download_url":"https://codeload.github.com/microso
ft/ReCo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222333975,"owners_count":16968058,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T18:01:16.093Z","updated_at":"2025-03-28T01:35:30.767Z","avatar_url":"https://github.com/microsoft.png","language":"Jupyter Notebook","funding_links":[],"categories":["Spatial Control"],"sub_categories":[],"readme":"# ReCo: Region-Controlled Text-to-Image Generation\n[ReCo: Region-Controlled Text-to-Image Generation](https://arxiv.org/pdf/2211.15518.pdf)\n\nby [Zhengyuan Yang](https://zhengyuan.info), [Jianfeng Wang](http://jianfengwang.me/), [Zhe Gan](https://zhegan27.github.io/), [Linjie Li](https://www.microsoft.com/en-us/research/people/linjli/), [Kevin Lin](https://sites.google.com/site/kevinlin311tw/), [Chenfei Wu](https://chenfei-wu.github.io/), [Nan Duan](https://nanduan.github.io/), [Zicheng Liu](https://zicliu.wixsite.com/mysite), [Ce Liu](http://people.csail.mit.edu/celiu/), [Michael Zeng](https://www.microsoft.com/en-us/research/people/nzeng/), [Lijuan Wang](https://www.microsoft.com/en-us/research/people/lijuanw/)\n\nIEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023.\n\n**\\*\\*\\*\\*\\* Update: ReCo is now available at Huggingface Diffusers! [COCO Model](https://huggingface.co/j-min/reco_sd14_coco) [LAION Model](https://huggingface.co/j-min/reco_sd14_laion). \n\nCredits to [Jaemin Cho](https://j-min.io/). Thank you! \\*\\*\\*\\*\\***\n\n\n### Introduction\nReCo extends T2I models to understand coordinate inputs. 
Thanks to the introduced position tokens in the region-controlled input query, users can easily specify free-form regional descriptions in arbitrary image regions.\nFor more details, please refer to our\n[paper](https://arxiv.org/pdf/2211.15518.pdf).\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://zyang-ur.github.io//reco/reco.png\" width=\"100%\"/\u003e\n\u003c/p\u003e\n\n### Citation\n\n    @inproceedings{yang2023reco,\n      title={ReCo: Region-Controlled Text-to-Image Generation},\n      author={Yang, Zhengyuan and Wang, Jianfeng and Gan, Zhe and Li, Linjie and Lin, Kevin and Wu, Chenfei and Duan, Nan and Liu, Zicheng and Liu, Ce and Zeng, Michael and Wang, Lijuan},\n      booktitle={CVPR},\n      year={2023}\n    }\n\n\n## Installation\nClone the repository:\n```\ngit clone https://github.com/microsoft/ReCo.git\ncd ReCo\n```\n\nA [conda](https://conda.io/) environment named `reco_env` can be created\nand activated with:\n\n```\nconda env create -f environment.yaml\nconda activate reco_env\n```\n\nOr install the packages listed in ``requirements.txt``:\n\n```\npip install -r requirements.txt\n```\n\n### AzCopy\nWe recommend using the following AzCopy command to download the data.\nThe AzCopy executable can be [downloaded here](https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10#download-azcopy).\n\nExample command:\n```\npath/to/azcopy copy \u003cfolder-link\u003e \u003ctarget-address\u003e --recursive\n\n# For example:\npath/to/azcopy copy https://unitab.blob.core.windows.net/data/reco/dataset \u003clocal_path\u003e --recursive\n```\n\n## Data\nDownload the processed dataset annotations (the ```dataset``` folder, ~59G) from the following path with the [azcopy tool](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10#download-azcopy).\n```\npath/to/azcopy copy https://unitab.blob.core.windows.net/data/reco/dataset \u003clocal_path\u003e --recursive\n```\n\n\n## Inference and Checkpoints\nReCo checkpoints trained on 
COCO and a small LAION subset can be downloaded via wget or [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10#download-azcopy): [ReCo_COCO](https://unitab.blob.core.windows.net/data/reco/reco_coco_616.ckpt) and [ReCo_LAION](https://unitab.blob.core.windows.net/data/reco/reco_laion_1232.ckpt). Save the downloaded weights to ```logs```.\n\n```inference.sh``` contains examples of inference calls.\n\n```eval.sh``` contains examples of COCO evaluation.\n\n## Fine-tuning\nFor ReCo fine-tuning, we start from the Stable Diffusion model; see the [instructions here](https://github.com/CompVis/stable-diffusion#stable-diffusion-v1). Weights can be downloaded from [HuggingFace](https://huggingface.co/CompVis). The experiments mainly use ```sd-v1-4-full-ema.ckpt```.\n\n```train.sh``` contains examples of fine-tuning.\n\n\n## Acknowledgement\nThe project is built on the following repositories:\n* [CompVis/stable-diffusion](https://github.com/CompVis/stable-diffusion),\n* [XavierXiao/Dreambooth-Stable-Diffusion](https://github.com/XavierXiao/Dreambooth-Stable-Diffusion),\n* [harubaru/waifu-diffusion: stable diffusion finetuned on danbooru](https://github.com/harubaru/waifu-diffusion).\n\n### Contributing\n\nThis project welcomes contributions and suggestions.  Most contributions require you to agree to a\nContributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us\nthe rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.\n\nWhen you submit a pull request, a CLA bot will automatically determine whether you need to provide\na CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions\nprovided by the bot. 
You will only need to do this once across all repos using our CLA.\n\nThis project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).\nFor more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or\ncontact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.\n\n### Trademarks\n\nThis project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft \ntrademarks or logos is subject to and must follow \n[Microsoft's Trademark \u0026 Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).\nUse of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.\nAny use of third-party trademarks or logos are subject to those third-party's policies.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2FReCo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicrosoft%2FReCo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2FReCo/lists"}