https://github.com/li-xirong/flickr8kcn

A bilingual dataset for image captioning

# Flickr8K-CN

Flickr8K-CN is a bilingual (English-to-Chinese) extension of the popular [Flickr8K](http://nlp.cs.illinois.edu/HockenmaierGroup/Framing_Image_Description/KCCA.html) dataset, used for evaluating image captioning in a cross-lingual setting.

| Chinese sentences | Flickr8k-train | Flickr8k-val | Flickr8k-test |
| -----:| -----:| -----:| -----:|
| human written | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| human translation | :x: | :x: | :white_check_mark: |
| machine translation (Baidu) | :white_check_mark: | :white_check_mark: | :white_check_mark: |
| machine translation (Google) | :white_check_mark: | :white_check_mark: | :white_check_mark: |

## Data

### Sentences

1. [Original English sentences](data/flickr8kenc.caption.txt)
2. [Chinese sentences written by native Chinese speakers](data/flickr8kzhc.caption.txt)
3. Chinese sentences generated by Baidu translation ([icmr2016 version](data/flickr8kzhb.caption.txt), [version 20160815](data/flickr8kzhb.caption.txt.v20160815))
4. Chinese sentences generated by Google translation ([icmr2016 version](data/flickr8kzhg.caption.txt), [version 20160816](data/flickr8kzhg.caption.txt.v20160816))
5. [Chinese sentences generated by human translation](data/flickr8kzhmtest.captions.txt) (only the test set is covered)
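A minimal sketch of loading one of the sentence files above into a per-image dictionary. The line layout is an assumption, not something this README states: each line is taken to be `<image_id>#<tag>` followed by whitespace and the caption text. The function names `parse_caption_lines` and `load_captions` are hypothetical helpers, so check the layout against the actual files before relying on them.

```python
from collections import defaultdict


def parse_caption_lines(lines):
    """Group captions by image ID.

    ASSUMPTION: each line looks like '<image_id>#<tag> <caption text>'.
    Everything after the first '#' in the key is discarded.
    """
    captions = defaultdict(list)
    for raw in lines:
        parts = raw.strip().split(None, 1)  # split key from caption on first whitespace
        if len(parts) < 2:
            continue  # skip blank lines and keys without a caption
        key, text = parts
        image_id = key.split("#", 1)[0]
        captions[image_id].append(text.strip())
    return dict(captions)


def load_captions(path, encoding="utf-8"):
    """Read a caption file from disk; the UTF-8 encoding is also an assumption."""
    with open(path, encoding=encoding) as handle:
        return parse_caption_lines(handle)
```

For example, `load_captions("data/flickr8kzhc.caption.txt")` would return a mapping from image ID to its list of Chinese sentences, provided the file follows the assumed layout.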

### Dataset split

* Image IDs of the [6K training images](data/flickr8ktrain.txt), [1K validation images](data/flickr8kval.txt), and [1K test images](data/flickr8ktest.txt)
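The split files can be used to restrict a set of captions to one subset. The sketch below assumes one image ID per line and assumes that IDs may or may not carry a `.jpg` suffix, so it normalizes both sides before comparing. `normalize_id`, `read_image_ids`, and `select_split` are hypothetical helper names, and `captions` is any dictionary keyed by image ID.

```python
def normalize_id(image_id):
    """Strip surrounding whitespace and a trailing '.jpg' (ASSUMED optional)."""
    image_id = image_id.strip()
    if image_id.lower().endswith(".jpg"):
        image_id = image_id[:-4]
    return image_id


def read_image_ids(lines):
    """Return the set of image IDs, ASSUMING one ID per non-empty line."""
    return {normalize_id(line) for line in lines if line.strip()}


def select_split(captions, split_ids):
    """Keep only the caption entries whose image ID is in `split_ids`."""
    return {
        image_id: sentences
        for image_id, sentences in captions.items()
        if normalize_id(image_id) in split_ids
    }
```

A typical call would open `data/flickr8ktest.txt`, pass the file object to `read_image_ids`, and then filter a caption dictionary with `select_split`.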

### Image features

1. [1,024-dim GoogLeNet pool5 features](http://lixirong.net/data/icmr2016/flickr8k-pygooglenet-pool5_7x7_s1.tar.gz), readable with [bigfile.py](https://github.com/li-xirong/jingwei/blob/master/util/simpleknn/bigfile.py)
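For readers who prefer not to depend on `bigfile.py`, the sketch below reads a single feature vector using only the standard library. It rests on an assumed directory layout that this README does not spell out: `shape.txt` holding the image count and dimensionality, `id.txt` holding whitespace-separated image IDs, and `feature.bin` holding row-major 32-bit floats in native byte order. `read_feature` is a hypothetical helper, not part of `bigfile.py`; verify the layout against the extracted archive first.

```python
import os
from array import array


def read_feature(datadir, image_id):
    """Return the feature vector of `image_id` as a list of floats.

    ASSUMED layout: shape.txt = '<count> <dims>', id.txt = whitespace-separated
    IDs, feature.bin = count * dims float32 values stored row by row.
    """
    with open(os.path.join(datadir, "shape.txt")) as handle:
        count, dims = map(int, handle.readline().split())
    with open(os.path.join(datadir, "id.txt")) as handle:
        names = handle.read().split()
    if len(names) != count:
        raise ValueError("id.txt and shape.txt disagree on the number of images")

    index = names.index(image_id)  # raises ValueError for an unknown ID
    vector = array("f")
    with open(os.path.join(datadir, "feature.bin"), "rb") as handle:
        handle.seek(index * dims * vector.itemsize)  # jump to the start of the row
        vector.fromfile(handle, dims)  # raises EOFError if the file is too short
    return list(vector)
```

Seeking to the row offset avoids loading the whole matrix into memory, which matters little for 8K images but keeps the helper usable on larger feature sets.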

## Citations

1. Xirong Li, Weiyu Lan, Jianfeng Dong, Hailong Liu, [Adding Chinese Captions to Images](icmr2016_chisent.pdf), ACM ICMR 2016