https://github.com/google-research-datasets/T2-Guiding

T2 Guiding contains 1000 images. Each image is annotated with three Visual Genome objects obtained from a FRCNN and three image labels obtained from the Google Cloud Vision API. More information about this dataset can be found in the following paper: Edwin G. Ng, Bo Pang, Piyush Sharma and Radu Soricut. 2020. Understanding Guided Image Captioning Performance Across Domains. arXiv preprint arXiv:2012.02339

# T2-Guiding

T2 Guiding is a dataset of 1,000 images, each annotated with six machine-generated labels. The images come from the Open Images Dataset (OID), and we provide two sets of labels for each image:

1) Object labels: Three random object labels generated by an FRCNN model trained on Visual Genome.
2) Image labels: Three random image labels obtained from the Google Cloud Vision API.

This dataset is used as the test set in the paper: "Understanding Guided Image Captioning Performance across Domains".

More details are available in this paper (please cite the paper if you use or discuss this dataset in your work):


@article{ng2020understanding,
title={Understanding Guided Image Captioning Performance across Domains},
author={Edwin G. Ng and Bo Pang and Piyush Sharma and Radu Soricut},
journal={arXiv preprint arXiv:2012.02339},
year={2020}
}

# Data Format

The released data is provided as a TSV (tab-separated values) text file with the following columns:

Table 1: Columns in the TSV file.

| Column | Description |
| -------- | ------------------------------------------------------------------------------------------------------------------------- |
| 1 | Image key. The unique identifier of the image in the Open Images Dataset (a hexadecimal number, e.g., 0000d67245642c5f). |
| 2 | Visual Genome objects. Comma-separated list of object labels generated by an FRCNN trained on Visual Genome. |
| 3 | Image labels. Comma-separated list of image labels obtained from the Google Cloud Vision API. |
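
The column layout in Table 1 can be read with a few lines of Python. The following is a minimal sketch, not part of the release. It assumes the file has no header row and exactly three tab-separated fields per line. The names `GuidingRecord`, `split_labels`, and `read_records` are hypothetical, and the labels in the sample row are made up for illustration.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class GuidingRecord(NamedTuple):
    image_key: str            # column 1: Open Images key (hexadecimal string)
    vg_objects: List[str]     # column 2: Visual Genome object labels
    image_labels: List[str]   # column 3: Cloud Vision image labels


def split_labels(field: str) -> List[str]:
    """Split a comma-separated label field and trim surrounding whitespace."""
    return [label.strip() for label in field.split(",") if label.strip()]


def read_records(lines: Iterable[str]) -> List[GuidingRecord]:
    """Parse TSV lines into records; blank lines are skipped."""
    # Assumption: fields are never quoted, so quoting is disabled.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if not row:
            continue
        if len(row) != 3:
            raise ValueError(f"expected 3 columns, got {len(row)}: {row!r}")
        key, objects, labels = row
        records.append(
            GuidingRecord(key, split_labels(objects), split_labels(labels))
        )
    return records


# Made-up sample row for illustration only; these labels are not from the dataset.
sample = io.StringIO("0000d67245642c5f\tdog,grass,tree\tpet,lawn,outdoor\n")
records = read_records(sample)
print(records[0].image_key, records[0].vg_objects, records[0].image_labels)
```

To read the downloaded file instead of the in-memory sample, pass an open file handle, for example `open(path, newline="", encoding="utf-8")`, to `read_records`. The image key is kept as a string so that leading zeros are preserved.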

# Downloads

The dataset is available for download here. The mapping from image key to image URL is in the cvpr2019.tsv.meta file provided with the original T2 dataset download.