https://github.com/vzhou842/easy-vqa

The Easy Visual Question Answering dataset.

# easy-vqa

[![Build Status](https://travis-ci.com/vzhou842/easy-vqa.svg?branch=master)](https://travis-ci.com/vzhou842/easy-vqa)
![PyPI](https://img.shields.io/pypi/v/easy-vqa)

This is the official repository for the Easy Visual Question Answering (easy-VQA) dataset. It contains:
- the official [Python package](https://pypi.org/project/easy-vqa/) for the dataset
- the source code for generating the dataset

Read [the easy-VQA blog post](https://victorzhou.com/blog/easy-vqa/) for more.

## About the Dataset

easy-VQA contains

- 4,000 train images and 38,575 train questions.
- 1,000 test images and 9,673 test questions.
- 13 total possible answers.
- 28,407 training questions that are yes/no.
- 7,136 testing questions that are yes/no.

All images are 64x64 color images. See a [live demo](https://easy-vqa-demo.victorzhou.com/) of a model trained on the dataset.
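
The question counts listed above imply that roughly three quarters of the questions in each split are yes/no, and that each image has a little under ten questions on average. The arithmetic, as a quick sketch using only the numbers from the list (the `stats` and `summary` names are just for illustration):

```python
# Counts copied from the dataset summary above.
stats = {
    "train": {"images": 4000, "questions": 38575, "yes_no": 28407},
    "test": {"images": 1000, "questions": 9673, "yes_no": 7136},
}

summary = {}
for split, counts in stats.items():
    summary[split] = {
        # Fraction of questions in this split whose answer is yes or no.
        "yes_no_share": counts["yes_no"] / counts["questions"],
        # Average number of questions asked per image.
        "questions_per_image": counts["questions"] / counts["images"],
    }
    print(
        f"{split}: {summary[split]['yes_no_share']:.1%} yes/no, "
        f"{summary[split]['questions_per_image']:.2f} questions per image"
    )
```

A model that only ever answers yes or no can therefore attempt about 74% of the questions, which is worth keeping in mind when reading accuracy numbers.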

### Example Images

![](./easy_vqa/data/train/images/0.png)
![](./easy_vqa/data/train/images/1.png)
![](./easy_vqa/data/train/images/2.png)
![](./easy_vqa/data/train/images/3.png)
![](./easy_vqa/data/train/images/5.png)
![](./easy_vqa/data/train/images/6.png)
![](./easy_vqa/data/train/images/7.png)
![](./easy_vqa/data/train/images/8.png)

_(the image links above only work on [GitHub](https://github.com/vzhou842/easy-VQA))_

### Example Questions

- _What color is the rectangle?_
- _Does the image contain a triangle?_
- _Is no blue shape present?_
- _What shape does the image contain?_

## Installing the Package

`pip install easy-vqa`

## Using the Package

### Questions

Each question has 3 parts:
- the **question text**
- the **answer**
- the **image ID**

The question getters return corresponding **arrays** for each of the 3 parts:

```python
from easy_vqa import get_train_questions, get_test_questions

train_questions, train_answers, train_image_ids = get_train_questions()
test_questions, test_answers, test_image_ids = get_test_questions()

# Question 0 is at index 0 for all 3 arrays:
print(train_questions[0]) # what shape does the image contain?
print(train_answers[0]) # circle
print(train_image_ids[0]) # 0
```
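
Because the three arrays are index-aligned, they can be zipped together to collect every question asked about a given image. The following is a minimal sketch, not part of the package: `group_questions_by_image` is a hypothetical helper, and the sample data is hand-written to mimic the shape of the getter output rather than copied from the dataset.

```python
from collections import defaultdict


def group_questions_by_image(questions, answers, image_ids):
    """Map each image ID to the (question, answer) pairs asked about it."""
    grouped = defaultdict(list)
    for question, answer, image_id in zip(questions, answers, image_ids):
        grouped[image_id].append((question, answer))
    return dict(grouped)


# Hand-written sample in the same shape as the getter output (not real rows).
sample_questions = [
    "what shape does the image contain?",
    "is there a circle?",
    "what color is the rectangle?",
]
sample_answers = ["circle", "yes", "blue"]
sample_image_ids = [0, 0, 1]

grouped = group_questions_by_image(sample_questions, sample_answers, sample_image_ids)
print(grouped[0])  # both questions that refer to image 0
```

With the package installed, the real data can be passed in directly with `group_questions_by_image(*get_train_questions())`.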

### Images

The image path getters return dicts that map **image ID** to **absolute paths** that can be used to load the image.

```python
from easy_vqa import get_train_image_paths, get_test_image_paths

train_image_paths = get_train_image_paths()
test_image_paths = get_test_image_paths()

print(train_image_paths[0]) # ends in easy_vqa/data/train/images/0.png
```
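
Since each question carries an image ID, the path dict can be used to look up the image file for any question. This is a minimal sketch, assuming the IDs in the question arrays are the same keys used by the path dicts (the examples in this README use ID `0` in both places). `attach_image_paths` is a hypothetical helper, and the paths below are placeholders, not real package paths.

```python
def attach_image_paths(questions, answers, image_ids, image_paths):
    """Return (question, answer, image_path) triples, looked up by image ID."""
    return [
        (question, answer, image_paths[image_id])
        for question, answer, image_id in zip(questions, answers, image_ids)
    ]


# Placeholder data. With the package installed, use get_train_questions()
# and get_train_image_paths() instead.
sample_paths = {0: "/tmp/images/0.png", 1: "/tmp/images/1.png"}
triples = attach_image_paths(
    ["is there a circle?", "what color is the rectangle?"],
    ["yes", "blue"],
    [0, 1],
    sample_paths,
)

for question, answer, path in triples:
    print(path, question, answer)
```

Each resulting path can then be handed to whatever image loader you prefer.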

### Answers

The answers getter returns an **array** of all possible answers.

```python
from easy_vqa import get_answers

answers = get_answers()

print(answers) # ['teal', 'brown', 'black', 'gray', 'yes', 'blue', 'rectangle', 'yellow', 'triangle', 'red', 'circle', 'no', 'green']
```
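
For training a classifier, answers are usually converted to integer class indices. One way to do that is sketched below. `build_answer_index` and `encode_answers` are hypothetical helpers, not part of the package, and the answer list is copied from the printed output above. The order returned by `get_answers()` is assumed, not guaranteed, to be stable, so build the mapping once and reuse it for both training and inference.

```python
def build_answer_index(all_answers):
    """Map each answer string to its position in the list of all answers."""
    return {answer: index for index, answer in enumerate(all_answers)}


def encode_answers(answers, answer_index):
    """Convert answer strings to integer class indices."""
    return [answer_index[answer] for answer in answers]


# Copied from the printed output above; with the package installed,
# use get_answers() instead.
all_answers = ['teal', 'brown', 'black', 'gray', 'yes', 'blue', 'rectangle',
               'yellow', 'triangle', 'red', 'circle', 'no', 'green']

answer_index = build_answer_index(all_answers)
encoded = encode_answers(["circle", "yes", "no"], answer_index)
print(encoded)  # [10, 4, 11]
```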

## Generating the Dataset

The easy-VQA dataset was generated by running

```shell
python gen_data/generate_data.py
```

which writes to the `easy_vqa/data/` directory. Be sure to install the dependencies for dataset generation before running `generate_data.py`:

```shell
pip install -r gen_data/requirements.txt
```

If you want to generate a larger easy-VQA dataset, simply modify the `NUM_TRAIN` and `NUM_TEST` constants in `generate_data.py`. Otherwise, if you want to modify the dataset itself, the files and code in the `gen_data/` directory should be pretty self-explanatory.