
A large-scale visual-language face dataset with fine-grained annotations (ICCV 2021)
https://github.com/ziqihuangg/CelebA-Dialog


# CelebA-Dialog Dataset

**Talk-to-Edit: Fine-Grained Facial Editing via Dialog**
[Yuming Jiang*](https://yumingj.github.io/),
[Ziqi Huang*](https://ziqihuangg.github.io/),
[Xingang Pan](https://xingangpan.github.io/),
[Chen Change Loy](https://www.mmlab-ntu.com/person/ccloy/)
and
[Ziwei Liu](https://liuziwei7.github.io/)
In IEEE International Conference on Computer Vision (**ICCV**), 2021.

From [MMLab@NTU](https://www.mmlab-ntu.com/index.html), affiliated with S-Lab, Nanyang Technological University.

[**[Project Page]**](https://www.mmlab-ntu.com/project/talkedit/) | [**[Paper]**](https://arxiv.org/abs/2109.04425) | [**[Code]**](https://github.com/yumingj/Talk-to-Edit) | [**[Video]**](https://www.youtube.com/watch?v=ZKMkQhkMXPI) | [**[Web Page]**](https://mmlab.ie.cuhk.edu.hk/projects/CelebA/CelebA_Dialog.html)

**CelebA-Dialog** is a large-scale visual-language face dataset with the following features:
- Facial images are annotated with rich **fine-grained labels**, which classify one attribute into multiple degrees according to its semantic meaning.
- Each image is accompanied by **captions** describing its attributes and a sample **user request**.

The dataset can be employed as the training and test sets for the following computer vision tasks: fine-grained facial attribute recognition, fine-grained facial manipulation, text-based facial generation and manipulation, face image captioning, natural language based facial recognition and manipulation, and broader multi-modality learning tasks.
The dataset was proposed in [Talk-to-Edit](https://github.com/yumingj/Talk-to-Edit).

## Download Links
You can download the dataset using the following links:
* "**HQ**" refers to images and corresponding annotations for the 30,000 high-resolution images following CelebA-HQ.
* "**standard**" refers to images and corresponding annotations for the original 202,599 CelebA images.

| Link (HQ) | Size | Files | Format | Description
| :--- | :--- | :---: | :----: | :----------
| [CelebA-Dialog (HQ)](https://drive.google.com/drive/folders/1YRRaC3LWLHorVhFNJPzVqLrUlA10eLEJ?usp=sharing) | ~4.4 GB | | | 30,000 high-resolution images and corresponding annotations
| ├  [image (HQ)](https://drive.google.com/file/d/1A2dNWabg6_um-V3lhw1tyead5hCpjaW8/view?usp=sharing) | ~2.7 GB | 30,000 | JPG | images from CelebA-HQ
| ├  [fine-grained label (HQ)](https://drive.google.com/file/d/1oscEGdTfvBqohlagtp9dfgduGOfgXZxx/view?usp=sharing) | ~600 KB | 1 | TXT | fine-grained labels for 5 attributes
| ├  [binary label (HQ)](https://drive.google.com/file/d/1QvfDVRW7W3-MOCro1EPdnhuXnQ1STOGG/view?usp=sharing) | ~3.5 MB | 1 | TXT | binary labels for 40 attributes
| ├  [text (HQ)](https://drive.google.com/drive/folders/1CzTZm8suzDWdoN6DQmv11tsZotYo1Yfu?usp=sharing) | ~27 MB | 4 | TXT and JSON | natural language captions and editing requests
| ├  [mask (HQ)](https://drive.google.com/drive/folders/1bRZmrUBz8y0ObTr8AlkbVfyUco5R2I0z?usp=sharing) | ~1.8 GB | | PNG | segmentation masks (1) [binary](https://drive.google.com/file/d/1MUYHw-IGP5FHy0yJzgvXNZojlcnCq7IE/view?usp=sharing) (2) [colorized](https://drive.google.com/file/d/1q2DWtGA1h4NcS1Az4OX-5sbLXsGWJZWq/view?usp=sharing)
| ├  [identity (HQ)](https://drive.google.com/file/d/1yd0bfYkcQ9_BqIxhjyjZ6UHgNVg7n63V/view?usp=sharing) | ~400 KB | 1 | TXT | identity label of each image

| Link (standard) | Size | Files | Format | Description
| :--- | :--- | :---: | :----: | :----------
| [CelebA-Dialog (standard)](https://drive.google.com/drive/folders/18nejI_hrwNzWyoF6SW8bL27EYnM4STAs?usp=sharing) | | | | 202,599 original CelebA images and corresponding annotations
| ├  [image (standard)](https://drive.google.com/drive/folders/0B7EVK8r0v71pTUZsaXdaSnZBZzg?resourcekey=0-rJlzl934LzC-Xp28GeIBzQ&usp=sharing) | | | | images from CelebA
| ├  [fine-grained label (standard)](https://drive.google.com/file/d/1wZcVEjJ5LwP1Ciuc3j_RFw9Vcusj4UEU/view?usp=sharing) | ~4 MB | 1 | TXT | fine-grained labels for 5 attributes
| ├  [binary label (standard)](https://drive.google.com/file/d/0B7EVK8r0v71pblRyaVFSWGxPY0U/view?usp=sharing&resourcekey=0-YW2qIuRcWHy_1C2VaRGL3Q) | ~25 MB | 1 | TXT | binary labels for 40 attributes
| ├  [text (standard)](https://drive.google.com/drive/folders/18nejI_hrwNzWyoF6SW8bL27EYnM4STAs?usp=sharing) | ~14 MB | | TXT and JSON | natural language captions and editing requests
| ├  [identity (standard)](https://drive.google.com/file/d/1_ee_0u7vcNLOfNLegJRHmolfH5ICW-XS/view?usp=sharing) | ~3.3 MB | 1 | TXT | identity label of each image

| Link (mapping) | Size | Files | Format | Description
| :--- | :--- | :---: | :----: | :----------
| [HQ-to-standard mapping](https://drive.google.com/file/d/10msPsx1Fouh5h8m8LoSaPTh9R4dv_mLG/view?usp=sharing) | ~1 MB | 1 | TXT | The mapping between 30,000 CelebA-HQ images and the 202,599 CelebA images

## Details

### Image

* **HQ**:
  * 30,000 face images selected from the CelebA dataset by following CelebA-HQ
  * High resolution of 1024 x 1024
* **standard**:
  * 202,599 face images from the CelebA dataset

### Fine-Grained Label

* 5 fine-grained attribute annotations per image: Bangs, Eyeglasses, Beard, Smiling, and Age

### Binary Label

* 40 binary attribute annotations per image
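
As a minimal sketch of reading the binary labels, the snippet below assumes the file follows the layout of the original CelebA attribute list: the first line holds the image count, the second line holds the attribute names, and each following line holds an image name followed by one `1` / `-1` value per attribute. That layout is an assumption here, not something this README specifies, and `parse_binary_labels` is a hypothetical helper name. A small synthetic sample with 3 attributes stands in for the real 40-attribute file.

```python
from typing import Dict, Iterable, List, Tuple


def parse_binary_labels(lines: Iterable[str]) -> Tuple[List[str], Dict[str, List[int]]]:
    """Parse a CelebA-style attribute file (assumed layout, see the text above)."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    # Assumed layout: rows[0] is the image count, rows[1] the attribute names.
    names = rows[1].split()
    labels: Dict[str, List[int]] = {}
    for row in rows[2:]:
        parts = row.split()
        image, values = parts[0], [int(v) for v in parts[1:]]
        if len(values) != len(names):
            raise ValueError(f"{image}: expected {len(names)} values, got {len(values)}")
        labels[image] = values
    return names, labels


# Synthetic sample (3 attributes instead of 40), for illustration only.
sample = """2
Bangs Eyeglasses Smiling
000001.jpg -1  1  1
000002.jpg  1 -1 -1
"""
names, labels = parse_binary_labels(sample.splitlines())

# Select the images whose "Eyeglasses" attribute is positive.
col = names.index("Eyeglasses")
has_glasses = [img for img, values in labels.items() if values[col] == 1]
print(has_glasses)
```

For the real file, pass an open file handle instead of `sample.splitlines()`, and check the first few lines of the download before relying on the assumed layout.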

### Text

* Textual captions for each image
* A user editing request per image

### Mask

We preprocess the facial segmentation masks of [CelebAMask-HQ](https://mmlab.ie.cuhk.edu.hk/projects/CelebA/CelebAMask_HQ.html) to ease future research.
* You can directly download the ***binary masks for individual labels*** for each image. These are the same as the ones provided in CelebAMask-HQ. ([Download link](https://drive.google.com/file/d/1MUYHw-IGP5FHy0yJzgvXNZojlcnCq7IE/view?usp=sharing))
* We produce the ***combined colorized mask*** for each image following the parsing of CelebAMask-HQ. ([Download link](https://drive.google.com/file/d/1q2DWtGA1h4NcS1Az4OX-5sbLXsGWJZWq/view?usp=sharing))

Below is the color-to-label parsing information:

| Label list | | | | |
| ------------ | ------------- | ------------ | ------------- | ------------ |
| 0: 'background' | 1: 'skin' | 2: 'nose' | 3: 'eye_g' | 4: 'l_eye' |
| 5: 'r_eye' | 6: 'l_brow' | 7: 'r_brow' | 8: 'l_ear' | 9: 'r_ear' |
| 10: 'mouth' | 11: 'u_lip' | 12: 'l_lip' | 13: 'hair' | 14: 'hat' |
| 15: 'ear_r' | 16: 'neck_l' | 17: 'neck' | 18: 'cloth' | |

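```python
from PIL import Image
import numpy as np

# Placeholder path: point this at one of the downloaded colorized mask files.
f = "path/to/mask.png"

segm = Image.open(f)
segm = np.array(segm)  # shape: [512, 512]
```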
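
Once a mask is loaded as an array, the label table can be applied to it. The sketch below assumes that each pixel of the loaded `[512, 512]` array holds one of the label indices 0 to 18 from the table; verify this against a downloaded file, since the README does not state it explicitly. `label_pixel_counts` and `binary_mask` are hypothetical helper names, and a tiny synthetic array stands in for a real mask.

```python
import numpy as np

# Label names copied from the table above; the list position is the label index.
LABELS = [
    "background", "skin", "nose", "eye_g", "l_eye", "r_eye", "l_brow",
    "r_brow", "l_ear", "r_ear", "mouth", "u_lip", "l_lip", "hair", "hat",
    "ear_r", "neck_l", "neck", "cloth",
]


def label_pixel_counts(segm: np.ndarray) -> dict:
    """Return {label name: pixel count} for every label present in the mask."""
    ids, counts = np.unique(segm, return_counts=True)
    return {LABELS[int(i)]: int(c) for i, c in zip(ids, counts)}


def binary_mask(segm: np.ndarray, name: str) -> np.ndarray:
    """Return a boolean mask that is True where the pixel carries the given label."""
    return segm == LABELS.index(name)


# Synthetic 4 x 4 mask used in place of a real 512 x 512 file.
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1:3, 1:3] = LABELS.index("skin")
demo[3, :] = LABELS.index("hair")

print(label_pixel_counts(demo))
print(int(binary_mask(demo, "hair").sum()))
```

With a real mask, pass the `segm` array loaded above in place of `demo`.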

### Identity
Some images show the same person. There are 10,177 identities in total in the dataset. On average, there are:
* around 20 images per identity in CelebA (standard)
* around 3 images per identity in CelebA-HQ
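
The average for the standard set follows from the totals above: 202,599 images over 10,177 identities is roughly 19.9 images per identity. To compute the distribution from an identity file, a minimal sketch could look like the following. It assumes each line has the form `<image name> <identity id>`, which is how the original CelebA identity list is laid out; the files linked here may differ, and `images_per_identity` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def images_per_identity(lines: Iterable[str]) -> Counter:
    """Count images per identity from lines shaped like '<image name> <identity id>'."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        _image, identity = parts
        counts[identity] += 1
    return counts


# Synthetic sample; the names and ids are for illustration only.
sample = ["000001.jpg 2880", "000002.jpg 2937", "000003.jpg 2880"]
counts = images_per_identity(sample)
average = sum(counts.values()) / len(counts)
print(counts["2880"], average)
```

For a downloaded identity file, pass the open file handle to `images_per_identity` instead of `sample`.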

## Agreement
* The CelebA-Dialog dataset is available for non-commercial research purposes only.
* You agree not to reproduce, duplicate, copy, sell, trade, resell or exploit for any commercial purposes, any portion of the images and any portion of derived data.
* You agree not to further copy, publish or distribute any portion of the CelebA-Dialog dataset. The only exception is that copies of the dataset may be made for internal use at a single site within the same organization.

## Citation

If you find this dataset useful for your research and use it in your work, please consider citing the following papers:

```bibtex
@InProceedings{CelebA-Dialog,
title = {Talk-to-Edit: Fine-Grained Facial Editing via Dialog},
author = {Jiang, Yuming and Huang, Ziqi and Pan, Xingang and Loy, Chen Change and Liu, Ziwei},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
year={2021}
}

@inproceedings{CelebAMask-HQ,
title = {MaskGAN: Towards Diverse and Interactive Facial Image Manipulation},
author = {Lee, Cheng-Han and Liu, Ziwei and Wu, Lingyun and Luo, Ping},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}

@inproceedings{CelebA-HQ,
title={Progressive Growing of {GAN}s for Improved Quality, Stability, and Variation},
author={Tero Karras and Timo Aila and Samuli Laine and Jaakko Lehtinen},
booktitle={International Conference on Learning Representations},
year={2018},
}

@inproceedings{CelebA,
title = {Deep Learning Face Attributes in the Wild},
author = {Liu, Ziwei and Luo, Ping and Wang, Xiaogang and Tang, Xiaoou},
booktitle = {Proceedings of International Conference on Computer Vision (ICCV)},
month = {December},
year = {2015}
}
```