Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/google-research-datasets/T2-Guiding
T2 Guiding contains 1000 images. Each image is annotated with three Visual Genome objects obtained from an FRCNN and three image labels obtained from the Google Cloud Vision API. More information about this dataset can be found in the following paper: Edwin G. Ng, Bo Pang, Piyush Sharma and Radu Soricut. 2020. Understanding Guided Image Captioning Performance Across Domains. arXiv preprint arXiv:2012.02339
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/google-research-datasets/T2-Guiding
- Owner: google-research-datasets
- License: other
- Created: 2020-12-14T15:36:13.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2020-12-28T21:35:10.000Z (almost 4 years ago)
- Last Synced: 2024-07-22T14:39:57.549Z (5 months ago)
- Size: 5.86 KB
- Stars: 5
- Watchers: 8
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-diverse-captioning
README
# T2-Guiding
T2 Guiding is a dataset of 1000 images, each with six image labels. The images are from the Open Images Dataset (OID), and we provide two sets of machine-generated labels for these images.
1) Object labels: Three random object labels generated by an FRCNN model trained on Visual Genome.
2) Image labels: Three random image labels obtained from the Google Cloud Vision API.

This dataset is used as the test set in the paper: "Understanding Guided Image Captioning Performance across Domains".
More details are available in this paper (please cite the paper if you use or discuss this dataset in your work):
```
@article{ng2020understanding,
  title={Understanding Guided Image Captioning Performance across Domains},
  author={Edwin G. Ng and Bo Pang and Piyush Sharma and Radu Soricut},
  journal={arXiv preprint arXiv:2012.02339},
  year={2020}
}
```

# Data Format
The released data is provided as a TSV (tab-separated values) text file with the following columns:
Table 1: Columns in TSV files.
| Column | Description |
| -------- | ------------------------------------------------------------------------------------------------------------------------- |
| 1 | Image key. The unique identifier of the image in the Open Images Dataset (a hexadecimal number, e.g., 0000d67245642c5f). |
| 2 | Visual Genome objects. Comma-separated list of object labels generated by an FRCNN trained on Visual Genome. |
| 3 | Image labels. Comma-separated list of image labels obtained from the Google Cloud Vision API. |

# Downloads
The dataset is available for download here. The mapping from image key to image URL can be found in the cvpr2019.tsv.meta file included in the original T2 dataset download.
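As a minimal sketch, the three-column TSV described in Table 1 can be read with Python's standard `csv` module. The row values below are invented for illustration; only the column layout (image key, comma-separated Visual Genome objects, comma-separated image labels) comes from the format described above.

```python
import csv
import io

def parse_t2_rows(tsv_text):
    """Parse T2-Guiding TSV rows into dicts with the image key,
    Visual Genome object labels, and Cloud Vision image labels."""
    rows = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    for key, vg_objects, image_labels in reader:
        rows.append({
            "image_key": key,
            "vg_objects": vg_objects.split(","),     # column 2
            "image_labels": image_labels.split(","),  # column 3
        })
    return rows

# Hypothetical row in the documented format (labels are made up):
sample = "0000d67245642c5f\ttree,sky,person\tLandscape,Nature,Cloud\n"
rows = parse_t2_rows(sample)
```

To read the released file itself, pass the contents of the downloaded TSV (e.g. via `open(path).read()`) instead of the sample string.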