https://github.com/shunk031/huggingface-datasets_stair-captions

STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset for huggingface datasets

---
annotations_creators:
- crowdsourced
language:
- ja
language_creators:
- found
license:
- cc-by-4.0
multilinguality:
- monolingual
pretty_name: STAIR Captions is a large-scale dataset containing 820,310 Japanese captions.
size_categories:
- 100K<n<1M
---

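The `size_categories` entry in the metadata above can be cross-checked against the caption count of 820,310 given in `pretty_name`. The following is a minimal sketch, not part of the original repository: the function name `size_category` is hypothetical, the bucket labels follow the ones commonly used in Hugging Face dataset card metadata, and treating a count that lands exactly on a boundary as belonging to the higher bucket is an assumption.

```python
# Upper bounds (exclusive) and labels for dataset size buckets, as commonly
# used in the `size_categories` field of Hugging Face dataset cards.
# Assumption: a count exactly on a boundary falls into the higher bucket.
_BUCKETS = [
    (1_000, "n<1K"),
    (10_000, "1K<n<10K"),
    (100_000, "10K<n<100K"),
    (1_000_000, "100K<n<1M"),
    (10_000_000, "1M<n<10M"),
    (100_000_000, "10M<n<100M"),
    (1_000_000_000, "100M<n<1B"),
    (10_000_000_000, "1B<n<10B"),
    (100_000_000_000, "10B<n<100B"),
    (1_000_000_000_000, "100B<n<1T"),
]


def size_category(n: int) -> str:
    """Return the size bucket label for a dataset with n examples."""
    if n < 0:
        raise ValueError("example count must be non-negative")
    for upper, label in _BUCKETS:
        if n < upper:
            return label
    return "n>1T"


if __name__ == "__main__":
    # Caption count taken from the card's `pretty_name` field.
    print(size_category(820_310))
```

Running the script prints the bucket for the 820,310 captions, which can then be compared with the value declared in the card.
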
### Languages

The language data in STAIR Captions is in Japanese ([BCP-47 ja-JP](https://www.rfc-editor.org/info/bcp47)).

## Dataset Structure

### Data Instances

[More Information Needed]

### Data Fields

[More Information Needed]

### Data Splits

[More Information Needed]

## Dataset Creation

### Curation Rationale

[More Information Needed]

### Source Data

[More Information Needed]

#### Initial Data Collection and Normalization

[More Information Needed]

#### Who are the source language producers?

[More Information Needed]

### Annotations

[More Information Needed]

#### Annotation process

[More Information Needed]

#### Who are the annotators?

[More Information Needed]

### Personal and Sensitive Information

[More Information Needed]

## Considerations for Using the Data

### Social Impact of Dataset

[More Information Needed]

### Discussion of Biases

[More Information Needed]

### Other Known Limitations

[More Information Needed]

## Additional Information

### Dataset Curators

[More Information Needed]

### Licensing Information

[Creative Commons Attribution 4.0 License](https://creativecommons.org/licenses/by/4.0/legalcode)

### Citation Information

```bibtex
@inproceedings{yoshikawa2017stair,
title={STAIR Captions: Constructing a Large-Scale Japanese Image Caption Dataset},
author={Yoshikawa, Yuya and Shigeto, Yutaro and Takeuchi, Akikazu},
booktitle={Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
pages={417--421},
year={2017}
}
```

### Contributions

Thanks to [@yuyay](https://github.com/yuyay) for creating this dataset.