{"id":13487786,"url":"https://github.com/yisol/IDM-VTON","last_synced_at":"2025-03-27T23:31:39.625Z","repository":{"id":230130880,"uuid":"774682866","full_name":"yisol/IDM-VTON","owner":"yisol","description":"[ECCV2024] IDM-VTON : Improving Diffusion Models for Authentic Virtual Try-on in the Wild","archived":false,"fork":false,"pushed_at":"2024-07-30T04:06:47.000Z","size":22468,"stargazers_count":3840,"open_issues_count":116,"forks_count":600,"subscribers_count":54,"default_branch":"main","last_synced_at":"2024-10-29T15:34:12.759Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://idm-vton.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yisol.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-03-20T01:29:00.000Z","updated_at":"2024-10-29T14:12:50.000Z","dependencies_parsed_at":"2024-11-07T04:34:57.277Z","dependency_job_id":null,"html_url":"https://github.com/yisol/IDM-VTON","commit_stats":null,"previous_names":["yisol/idm-vton"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yisol%2FIDM-VTON","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yisol%2FIDM-VTON/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yisol%2FIDM-VTON/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yisol%2FIDM-VTON/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yisol","download_url":"https://codeload.github.
com/yisol/IDM-VTON/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245767336,"owners_count":20668823,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T18:01:03.819Z","updated_at":"2025-03-27T23:31:39.600Z","avatar_url":"https://github.com/yisol.png","language":"Python","readme":"\n\u003cdiv align=\"center\"\u003e\n\u003ch1\u003eIDM-VTON: Improving Diffusion Models for Authentic Virtual Try-on in the Wild\u003c/h1\u003e\n\n\u003ca href='https://idm-vton.github.io'\u003e\u003cimg src='https://img.shields.io/badge/Project-Page-green'\u003e\u003c/a\u003e\n\u003ca href='https://arxiv.org/abs/2403.05139'\u003e\u003cimg src='https://img.shields.io/badge/Paper-Arxiv-red'\u003e\u003c/a\u003e\n\u003ca href='https://huggingface.co/spaces/yisol/IDM-VTON'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Demo-yellow'\u003e\u003c/a\u003e\n\u003ca href='https://huggingface.co/yisol/IDM-VTON'\u003e\u003cimg src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue'\u003e\u003c/a\u003e\n\n\n\u003c/div\u003e\n\nThis is the official implementation of the paper [\"Improving Diffusion Models for Authentic Virtual Try-on in the Wild\"](https://arxiv.org/abs/2403.05139).\n\nStar ⭐ us if you like it!\n\n---\n\n\n![teaser2](assets/teaser2.png)\u0026nbsp;\n![teaser](assets/teaser.png)\u0026nbsp;\n\n\n\n## Requirements\n\n```\ngit clone https://github.com/yisol/IDM-VTON.git\ncd IDM-VTON\n\nconda env create -f environment.yaml\nconda activate idm\n```\n\n## Data 
preparation\n\n### VITON-HD\nYou can download the VITON-HD dataset from [VITON-HD](https://github.com/shadow2496/VITON-HD).\n\nAfter downloading the VITON-HD dataset, move vitonhd_test_tagged.json into the test folder, and move vitonhd_train_tagged.json into the train folder.\n\nThe structure of the dataset directory should be as follows.\n\n```\n\ntrain\n|-- image\n|-- image-densepose\n|-- agnostic-mask\n|-- cloth\n|-- vitonhd_train_tagged.json\n\ntest\n|-- image\n|-- image-densepose\n|-- agnostic-mask\n|-- cloth\n|-- vitonhd_test_tagged.json\n\n```\n\n### DressCode\nYou can download the DressCode dataset from [DressCode](https://github.com/aimagelab/dress-code).\n\nWe provide pre-computed densepose images and captions for garments [here](https://kaistackr-my.sharepoint.com/:u:/g/personal/cpis7_kaist_ac_kr/EaIPRG-aiRRIopz9i002FOwBDa-0-BHUKVZ7Ia5yAVVG3A?e=YxkAip).\n\nWe used [detectron2](https://github.com/facebookresearch/detectron2) to obtain the densepose images; see [here](https://github.com/sangyun884/HR-VITON/issues/45) for more details.\n\nAfter downloading the DressCode dataset, place the image-densepose directories and caption text files as follows.\n\n```\nDressCode\n|-- dresses\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n|-- lower_body\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n|-- upper_body\n    |-- images\n    |-- image-densepose\n    |-- dc_caption.txt\n    |-- ...\n```\n\n\n## Training\n\n\n### Preparation\n\nDownload the pre-trained IP-Adapter for SDXL (IP-Adapter/sdxl_models/ip-adapter-plus_sdxl_vit-h.bin) and the image encoder (IP-Adapter/models/image_encoder) [here](https://github.com/tencent-ailab/IP-Adapter).\n\n```\ngit clone https://huggingface.co/h94/IP-Adapter\n```\n\nMove the IP-Adapter to ckpt/ip_adapter and the image encoder to ckpt/image_encoder.\n\nStart training using the Python script with arguments:\n\n```\naccelerate launch train_xl.py \\\n    --gradient_checkpointing --use_8bit_adam \\\n    
--output_dir=result --train_batch_size=6 \\\n    --data_dir=DATA_DIR\n```\n\nOr you can simply run the script file:\n\n```\nsh train_xl.sh\n```\n\n\n## Inference\n\n\n### VITON-HD\n\nRun inference using the Python script with arguments:\n\n```\naccelerate launch inference.py \\\n    --width 768 --height 1024 --num_inference_steps 30 \\\n    --output_dir \"result\" \\\n    --unpaired \\\n    --data_dir \"DATA_DIR\" \\\n    --seed 42 \\\n    --test_batch_size 2 \\\n    --guidance_scale 2.0\n```\n\nOr you can simply run the script file:\n\n```\nsh inference.sh\n```\n\n### DressCode\n\nFor the DressCode dataset, pass the category you want to generate images for via the --category argument:\n```\naccelerate launch inference_dc.py \\\n    --width 768 --height 1024 --num_inference_steps 30 \\\n    --output_dir \"result\" \\\n    --unpaired \\\n    --data_dir \"DATA_DIR\" \\\n    --seed 42 \\\n    --test_batch_size 2 \\\n    --guidance_scale 2.0 \\\n    --category \"upper_body\"\n```\n\nOr you can simply run the script file:\n```\nsh inference.sh\n```\n\n## Start a local gradio demo \u003ca href='https://github.com/gradio-app/gradio'\u003e\u003cimg src='https://img.shields.io/github/stars/gradio-app/gradio'\u003e\u003c/a\u003e\n\nDownload the checkpoints for human parsing [here](https://huggingface.co/spaces/yisol/IDM-VTON/tree/main/ckpt).\n\nPlace the checkpoints under the ckpt folder.\n```\nckpt\n|-- densepose\n    |-- model_final_162be9.pkl\n|-- humanparsing\n    |-- parsing_atr.onnx\n    |-- parsing_lip.onnx\n|-- openpose\n    |-- ckpts\n        |-- body_pose_model.pth\n```\n\n\n\n\nRun the following command:\n\n```\npython gradio_demo/app.py\n```\n\n\n\n\n\n\n## Acknowledgements\n\n\nThanks to [ZeroGPU](https://huggingface.co/zero-gpu-explorers) for providing free GPUs.\n\nThanks to [IP-Adapter](https://github.com/tencent-ailab/IP-Adapter) for the base code.\n\nThanks to [OOTDiffusion](https://github.com/levihsu/OOTDiffusion) and 
[DCI-VTON](https://github.com/bcmi/DCI-VTON-Virtual-Try-On) for mask generation.\n\nThanks to [SCHP](https://github.com/GoGoDuck912/Self-Correction-Human-Parsing) for human segmentation.\n\nThanks to [Densepose](https://github.com/facebookresearch/DensePose) for human densepose estimation.\n\n\n\n## Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=yisol/IDM-VTON\u0026type=Date)](https://star-history.com/#yisol/IDM-VTON\u0026Date)\n\n\n\n## Citation\n```\n@article{choi2024improving,\n  title={Improving Diffusion Models for Authentic Virtual Try-on in the Wild},\n  author={Choi, Yisol and Kwak, Sangkyung and Lee, Kyungmin and Choi, Hyungwon and Shin, Jinwoo},\n  journal={arXiv preprint arXiv:2403.05139},\n  year={2024}\n}\n```\n\n\n\n## License\nThe code and checkpoints in this repository are released under the [CC BY-NC-SA 4.0 license](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).\n\n\n","funding_links":[],"categories":["Python","Repos","Tools \u0026 Frameworks","Personalized Restoration","人像_姿势_3D人脸"],"sub_categories":["Clothing(Visual Try on)","资源传输下载"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyisol%2FIDM-VTON","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyisol%2FIDM-VTON","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyisol%2FIDM-VTON/lists"}