https://github.com/princeton-vl/think_visually
Code for ACL 2018 paper 'Think Visually: Question Answering through Virtual Imagery'
- Host: GitHub
- URL: https://github.com/princeton-vl/think_visually
- Owner: princeton-vl
- License: bsd-3-clause
- Created: 2018-05-11T22:00:53.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2023-03-24T22:22:57.000Z (about 2 years ago)
- Last Synced: 2025-04-20T05:32:07.126Z (about 1 month ago)
- Topics: acl, dataset, memory-network, pretrained-models, question-answering
- Language: Python
- Homepage:
- Size: 22.5 KB
- Stars: 13
- Watchers: 5
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Training and Evaluation Code
[**Think Visually: Question Answering through Virtual Imagery**](http://bit.ly/think_visually_paper)
[Ankit Goyal](http://imankgoyal.github.io), [Jian Wang](http://jianwang.me/), [Jia Deng](https://www.cs.princeton.edu/~jiadeng/)
*Annual Meeting of the Association for Computational Linguistics (ACL), 2018*

## Getting Started
First, download or clone the repository. We will refer to the directory containing the code as ``.
```
git clone git@github.com:princeton-vl/think_visually.git
```

#### Requirements
Our current implementation supports only GPU execution, so you need a GPU with CUDA installed on your machine. We used Python version **3.5.3**, CUDA version **8.0.44** and cuDNN version **8.0-v5**.
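You can sanity-check the GPU setup before installing anything. A minimal sketch using standard NVIDIA command-line tools (the cuDNN header path is an assumption based on the default CUDA install location):
```
# Check that a GPU and its driver are visible
nvidia-smi

# Check the installed CUDA toolkit version
nvcc --version

# Check the cuDNN version (assumes the default CUDA install path)
grep -A 2 CUDNN_MAJOR /usr/local/cuda/include/cudnn.h
```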
#### Install Libraries
We recommend first installing [Anaconda](https://anaconda.org/) and creating a virtual environment.
```
conda create --name think_visually python=3.5
```

Activate the virtual environment and install the libraries. Make sure you are in ``.
```
source activate think_visually
pip install -r requirements.txt
```

#### Download Datasets and Pre-trained Models
Download all the folders [here](http://bit.ly/think_visually_acl_2018). Unzip them and put them in ``.
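For example, if the downloads arrive as zip archives, unpacking might look like the following (the archive names here are assumptions; use whatever the download page provides):
```
# Hypothetical archive names -- adjust to match the actual downloads
unzip data_FloorPlanQA.zip -d .
unzip data_ShapeIntersection.zip -d .
unzip results.zip -d .
```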
## Code Organization
- `/model.py`: The main Python script for creating the model graph, training and testing.
- `/configs`: It contains various sample config files. `model.py` uses a config file to decide the model (`DSMN`/`DMN+`), the dataset (`FloorPlanQA`/`ShapeIntersection`), various model parameters (like the learning rate), etc. A hypothetical config sketch is shown after this list; more information about the configuration files is in `/configs/README.md`.
- `/results`: It contains all the pre-trained models as well as their training curves.
- `/utils`: It contains various utility files for data loading, preprocessing, and common neural-net layers.
- `/data_FloorPlanQA`: It contains the FloorPlanQA dataset. More information about the various files in that folder is in `/data_FloorPlanQA/README.md`.
- `/data_ShapeIntersection`: It contains the ShapeIntersection dataset. More information about the various files in that folder is in `/data_ShapeIntersection/README.md`.
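For illustration, here is a hedged sketch of creating a config; every field name below except `pretrained` and `run` (both mentioned under Running Experiments) is an assumption, so consult `/configs/README.md` and the sample configs for the real schema:
```
# Hypothetical config sketch -- field names are assumptions, not the actual schema
cat > configs/my_experiment.yml << 'EOF'
model: DSMN            # DSMN or DMN+
dataset: FloorPlanQA   # FloorPlanQA or ShapeIntersection
learning_rate: 0.001   # hypothetical name for the learning-rate parameter
pretrained: 0          # 0 = train from scratch (see note below)
run: 1                 # run index for repeated runs (see ADVICE below)
EOF
```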
## Running Experiments
To train and evaluate a model use the `model.py` script with a config file.
```
python model.py <path-to-config-file>
```

For example, to load the pretrained `DSMN` model on the `FloorPlanQA` dataset and evaluate it, use the following command.
```
python model.py configs/DSMN_FloorPlanQA.yml
```

Similarly, to load the pretrained `DSMN` model on the `FloorPlanQA` dataset with 0.78125% partial supervision, use the following command.
```
python model.py configs/DSMN_FloorPlanQA_sup_0.0078125.yml
```

Note that to train from scratch, you need to set the `pretrained` flag in the config file to 0. More information about how to set up a config file is in `/configs/README.md`.
**ADVICE**: As mentioned in the paper, we found the `DMN+`/`DSMN` models to be unstable across runs. For consistent results, we recommend running the same model (with random initialization) at least 10-20 times (you can use the run flag in the config file); a sketch of such a loop follows. The `DSMN#` model (i.e. `DSMN` with intermediate supervision) is relatively stable and requires fewer runs.
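A minimal sketch of such a repeated-runs loop, assuming the run flag appears as a top-level `run:` line in the config (verify the format against `/configs/README.md`):
```
# Run the same model 10 times with different run indices.
# Assumes a top-level `run:` line in the config -- verify against configs/README.md.
for i in $(seq 1 10); do
  sed "s/^run:.*/run: $i/" configs/DSMN_FloorPlanQA.yml > /tmp/DSMN_run_$i.yml
  python model.py /tmp/DSMN_run_$i.yml
done
```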
**UPDATE**: We reran all models on ShapeIntersection, so the results of the pretrained models are within ±2% of those reported in the paper.