# Dress Code Dataset

This repository presents the virtual try-on dataset proposed in:

*D. Morelli, M. Fincato, M. Cornia, F. Landi, F. Cesari, R. Cucchiara*
**Dress Code: High-Resolution Multi-Category Virtual Try-On**

**[[Paper](https://arxiv.org/abs/2204.08532)]** **[[Dataset Request Form](https://forms.gle/72Bpeh48P7zQimin7)]** **[[Try-On Demo](https://ailb-web.ing.unimore.it/dress-code)]**

**IMPORTANT!**
- By making any use of the Dress Code Dataset, you accept and agree to comply with the terms and conditions reported [here](https://github.com/aimagelab/dress-code/blob/main/LICENCE).
- The dataset will not be released to private companies.
- When filling out the dataset request form, non-institutional emails (e.g. gmail.com) are not allowed.
- The signed release agreement form is mandatory (see the dataset request form for more details). Incomplete or unsigned release agreement forms are not accepted and will not receive a response. Typed signatures are not allowed.

Please cite with the following BibTeX:

```bibtex
@inproceedings{morelli2022dresscode,
title={{Dress Code: High-Resolution Multi-Category Virtual Try-On}},
author={Morelli, Davide and Fincato, Matteo and Cornia, Marcella and Landi, Federico and Cesari, Fabio and Cucchiara, Rita},
booktitle={Proceedings of the European Conference on Computer Vision},
year={2022}
}
```



## Dataset

We collected a new dataset for image-based virtual try-on composed of image pairs coming from different catalogs of YOOX NET-A-PORTER.
The dataset contains more than 50k high-resolution model-garment image pairs divided into three categories (*i.e.* dresses, upper-body clothes, and lower-body clothes).



### Summary
- 53,792 garments
- 107,584 images
- 3 categories
  - upper body
  - lower body
  - dresses
- 1024 × 768 image resolution
- additional info
  - keypoints
  - skeletons
  - human label maps
  - human dense poses

### Additional Info
Along with each model-garment image pair, we also provide the keypoints, skeleton, human label map, and dense pose.




### Keypoints
For all image pairs of the dataset, we stored the joint coordinates of human poses.
In particular, we used [OpenPose](https://github.com/Hzzone/pytorch-openpose) [1] to extract 18 keypoints for each human body.

For each image, we provide a JSON file containing a dictionary with a `keypoints` key.
Its value is a list of 18 elements, one per body joint. Each element is a list of 4 values, of which the first two are the x and y coordinates, respectively.
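For illustration, here is a minimal sketch of parsing one of these files. The filename is hypothetical, and the meaning of the last two values per joint is not documented above, so the code only uses the first two:

```python
import json

# Hypothetical path: one JSON annotation per model image.
with open("dresses/keypoints/048393_2.json") as f:
    pose = json.load(f)

keypoints = pose["keypoints"]   # 18 joints, one list per joint
assert len(keypoints) == 18

# Each joint is a list of 4 values; the first two are the x/y
# image coordinates (the remaining two are not documented here).
xy = [(joint[0], joint[1]) for joint in keypoints]
```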

### Skeletons
Skeletons are RGB images obtained by connecting the keypoints with lines.
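
As a rough illustration, the sketch below rasterizes such a skeleton from the keypoints of the previous section. The limb connectivity list is an assumption based on the standard OpenPose 18-keypoint body model, not necessarily the exact rendering used for the dataset images:

```python
import cv2
import numpy as np

# Illustrative subset of limb connections for the 18-keypoint
# OpenPose body model (an assumption; the dataset's own rendering
# may use different pairs, colors, or line widths).
LIMBS = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5), (5, 6), (6, 7),
         (1, 8), (8, 9), (9, 10), (1, 11), (11, 12), (12, 13)]

def draw_skeleton(xy, height=1024, width=768):
    """Rasterize keypoints (list of (x, y) tuples) into an RGB image."""
    canvas = np.zeros((height, width, 3), dtype=np.uint8)
    for a, b in LIMBS:
        (xa, ya), (xb, yb) = xy[a], xy[b]
        if min(xa, ya, xb, yb) < 0:  # skip joints OpenPose did not detect
            continue
        cv2.line(canvas, (int(xa), int(ya)), (int(xb), int(yb)),
                 color=(255, 255, 255), thickness=4)
    return canvas
```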

### Human Label Map

We employed a human parser to assign each pixel of the image to a specific category, thus obtaining a segmentation mask for each target model.
Specifically, we used the [SCHP model](https://github.com/PeikeLi/Self-Correction-Human-Parsing) [2] trained on ATR, a large single-person human parsing dataset focused on fashion images with 18 classes.

The resulting images have a single channel filled with the category label value.
Categories are mapped as follows:

```ruby
0 background
1 hat
2 hair
3 sunglasses
4 upper_clothes
5 skirt
6 pants
7 dress
8 belt
9 left_shoe
10 right_shoe
11 head
12 left_leg
13 right_leg
14 left_arm
15 right_arm
16 bag
17 scarf
```
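
As a minimal sketch of using this mapping, the snippet below loads a label map and builds a binary mask for one category. The filenames are hypothetical; adjust them to your local copy of the dataset:

```python
import numpy as np
from PIL import Image

UPPER_CLOTHES = 4  # label value from the table above

# Hypothetical filenames; adjust to your local copy of the dataset.
label_map = np.array(Image.open("upper_body/label_maps/000001_4.png"))
model_img = np.array(Image.open("upper_body/images/000001_0.jpg"))

# Binary garment mask and a garment-only view of the model image.
mask = (label_map == UPPER_CLOTHES)
garment_only = model_img * mask[..., None].astype(np.uint8)
```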

### Human Dense Pose

We also extracted dense labels and UV mappings from all the model images using [DensePose](https://github.com/facebookresearch/detectron2/tree/main/projects/DensePose) [3].
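
A sketch under a stated assumption: if the dense pose is stored as a 3-channel IUV image (a common DensePose export format; the filename is hypothetical), the channels can be split as follows:

```python
import numpy as np
from PIL import Image

# Assumption: dense pose exported as a 3-channel IUV image
# (I = body-part index, U/V = surface coordinates).
iuv = np.array(Image.open("dresses/dense/048393_5.png"))
part_index = iuv[..., 0]           # 0 = background, 1..24 = body parts
u, v = iuv[..., 1], iuv[..., 2]    # UV coordinates quantized to 0-255
person_mask = part_index > 0
```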

## Experimental Results

### Low Resolution 256 x 192


| Name | SSIM ↑ | FID ↓ | KID ↓ |
| --- | --- | --- | --- |
| CP-VTON [4] | 0.803 | 35.16 | 2.245 |
| CP-VTON+ [5] | 0.902 | 25.19 | 1.586 |
| CP-VTON* [4] | 0.874 | 18.99 | 1.117 |
| PFAFN [6] | 0.902 | 14.38 | 0.743 |
| VITON-GT [7] | 0.899 | 13.80 | 0.711 |
| WUTON [8] | 0.902 | 13.28 | 0.771 |
| ACGPN [9] | 0.868 | 13.79 | 0.818 |
| OURS | 0.906 | 11.40 | 0.570 |

## Code
Due to our collaboration with the company, we cannot release the code of our model. However, we provide a bare-bones PyTorch project to load the data.
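
As a hedged starting point, here is a minimal paired-image loader. The directory layout and the pair-list format are assumptions about how a local copy might be organized, not the official project:

```python
import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class DressCodePairs(Dataset):
    """Minimal sketch of a paired model/garment loader. The directory
    layout and the pair-list format ('<model> <garment>' per line in
    e.g. train_pairs.txt) are assumptions, not the official loader."""

    def __init__(self, root, category="upper_body", pairs="train_pairs.txt"):
        self.base = os.path.join(root, category)
        with open(os.path.join(self.base, pairs)) as f:
            self.pairs = [line.split() for line in f if line.strip()]
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        model_name, garment_name = self.pairs[idx]
        model = Image.open(os.path.join(self.base, "images", model_name))
        garment = Image.open(os.path.join(self.base, "images", garment_name))
        return self.to_tensor(model), self.to_tensor(garment)
```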
## References

[1] Cao, et al. "OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields." IEEE TPAMI, 2019.

[2] Li, et al. "Self-Correction for Human Parsing." arXiv, 2019.

[3] Güler, et al. "DensePose: Dense Human Pose Estimation in the Wild." CVPR, 2018.

[4] Wang, et al. "Toward Characteristic-Preserving Image-based Virtual Try-On Network." ECCV, 2018.

[5] Minar, et al. "CP-VTON+: Clothing Shape and Texture Preserving Image-Based Virtual Try-On." CVPR Workshops, 2020.

[6] Ge, et al. "Parser-Free Virtual Try-On via Distilling Appearance Flows." CVPR, 2021.

[7] Fincato, et al. "VITON-GT: An Image-based Virtual Try-On Model with Geometric Transformations." ICPR, 2020.

[8] Issenhuth, et al. "Do Not Mask What You Do Not Need to Mask: A Parser-Free Virtual Try-On." ECCV, 2020.

[9] Yang, et al. "Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content." CVPR, 2020.

## Contact

If you have any general questions about our dataset, please use the [public issues section](https://github.com/aimagelab/dress-code/issues) of this GitHub repo. Alternatively, drop us an e-mail at davide.morelli [at] unimore.it or marcella.cornia [at] unimore.it.