https://github.com/ap229997/Conditional-Batch-Norm
PyTorch implementation of the NIPS 2017 paper "Modulating early visual processing by language"
- Host: GitHub
- URL: https://github.com/ap229997/Conditional-Batch-Norm
- Owner: ap229997
- License: mit
- Created: 2018-01-31T13:26:08.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-02-23T09:07:51.000Z (over 5 years ago)
- Last Synced: 2024-07-22T23:47:33.920Z (3 months ago)
- Topics: cbn, modulated-resnet, pytorch, vqa
- Language: Python
- Size: 40 KB
- Stars: 61
- Watchers: 6
- Forks: 11
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-normalization-techniques - [Python Reference
README
## Conditional Batch Normalization
PyTorch implementation of the NIPS 2017 paper "Modulating early visual processing by language"
[[Link]](https://papers.nips.cc/paper/7237-modulating-early-visual-processing-by-language.pdf)

### Introduction
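In short, CBN keeps the pre-trained batch-norm scale and shift and adds offsets to them that are predicted from the question embedding. The following is a minimal, dependency-free sketch of that computation in plain Python, not the repository's PyTorch code: the function names, the `[batch][channel]` list layout (spatial dimensions omitted) and the single linear layer standing in for the paper's small MLP are all assumptions made for illustration.

```python
import math


def predict_offsets(embedding, weight, bias):
    """Map one language embedding to per-channel offsets.

    Assumption: a single linear layer is used here for brevity;
    the paper predicts the offsets with a small MLP.
    """
    return [sum(w * e for w, e in zip(row, embedding)) + b
            for row, b in zip(weight, bias)]


def conditional_batch_norm(x, gamma, beta, delta_gamma, delta_beta, eps=1e-5):
    """Apply CBN to x, a batch of feature vectors shaped [batch][channel].

    gamma, beta: per-channel scale and shift of the pre-trained BN layer.
    delta_gamma, delta_beta: per-sample, per-channel offsets [batch][channel].
    """
    n, channels = len(x), len(x[0])
    out = [[0.0] * channels for _ in range(n)]
    for c in range(channels):
        column = [x[i][c] for i in range(n)]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n  # biased batch variance
        for i in range(n):
            normalized = (x[i][c] - mean) / math.sqrt(var + eps)
            scale = gamma[c] + delta_gamma[i][c]  # language-conditioned scale
            shift = beta[c] + delta_beta[i][c]    # language-conditioned shift
            out[i][c] = scale * normalized + shift
    return out


# Usage: with all offsets set to zero, CBN reduces to ordinary batch norm.
features = [[1.0, 2.0], [3.0, 6.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
plain = conditional_batch_norm(features, [1.0, 1.0], [0.0, 0.0], zeros, zeros)
```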
The authors incorporate language information into visual feature extraction by conditioning the Batch Normalization parameters on the language input. They apply Conditional Batch Normalization (CBN) to a pre-trained ResNet and show that this significantly improves performance on visual question answering tasks.

### Setup
This repository is compatible with Python 2.
- Follow the instructions on the [PyTorch Homepage](https://pytorch.org/) to install PyTorch (Python 2).
- The required Python packages are `nltk` and `tqdm`, which can be installed using pip.

### Data
To download the VQA dataset, use the script `scripts/vqa_download.sh`:
```
scripts/vqa_download.sh `pwd`/data
```

### Process Data
Detailed instructions for processing data are provided by [GuessWhatGame/vqa](https://github.com/GuessWhatGame/vqa#introduction).

#### Create dictionary
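For orientation, a word dictionary of this kind typically maps each sufficiently frequent question word to an integer index. The sketch below illustrates that idea only; it is not the repository's script, and the tokenization rule, the special tokens `<padding>` and `<unk>`, the `min_occurrences` parameter and the output layout are hypothetical choices that may differ from what the actual `dict.json` contains.

```python
import json
import re
from collections import Counter


def build_word_dictionary(questions, min_occurrences=1):
    """Assign an integer index to every word seen at least min_occurrences times."""
    counts = Counter(
        word
        for question in questions
        for word in re.findall(r"[a-z0-9']+", question.lower())
    )
    # Hypothetical special tokens reserved for padding and unknown words.
    word2index = {"<padding>": 0, "<unk>": 1}
    for word, count in sorted(counts.items()):
        if count >= min_occurrences:
            word2index[word] = len(word2index)
    return word2index


# Usage: build a dictionary from two toy questions and serialize it as JSON.
toy_questions = ["What color is the cat?", "Is the cat black?"]
toy_dictionary = build_word_dictionary(toy_questions, min_occurrences=2)
toy_json = json.dumps(toy_dictionary)
```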
To create the VQA dictionary, use the script `preprocess_data/create_dictionary.py`.
```
python preprocess_data/create_dictionary.py --data_dir data --year 2014 --dict_file dict.json
```

#### Create GloVe dictionary
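The GloVe text file stores one word per line followed by its space-separated vector components. A plausible way to turn it into a compact lookup is to keep only the words that occur in the dataset and pickle the result. The sketch below assumes exactly that; the function names and the filtering step are illustrative assumptions, not a description of what `create_gloves.py` does internally.

```python
import pickle


def filter_glove(lines, vocabulary):
    """Keep GloVe vectors only for words in vocabulary.

    lines: iterable of strings in the GloVe text format "word v1 v2 ... vn".
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if word in vocabulary:
            vectors[word] = [float(value) for value in values]
    return vectors


def save_vectors(vectors, path):
    """Write the word-to-vector mapping to a pickle file."""
    with open(path, "wb") as handle:
        pickle.dump(vectors, handle)


# Usage with in-memory lines; for the real file, iterate over
# open("data/glove.42B.300d.txt", encoding="utf-8") instead.
toy_lines = ["cat 0.1 0.2\n", "dog 0.3 0.4\n"]
toy_vectors = filter_glove(toy_lines, {"cat", "the"})
```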
To create the GloVe dictionary, download the original GloVe file and run the script `preprocess_data/create_gloves.py`.
```
wget http://nlp.stanford.edu/data/glove.42B.300d.zip -P data/
unzip data/glove.42B.300d.zip -d data/
python preprocess_data/create_gloves.py --data_dir data --glove_in data/glove.42B.300d.txt --glove_out data/glove_dict.pkl --year 2014
```

### Train Model
To train the network, set the required parameters in `config.json` and run the script `main.py`.
```
python main.py --gpu gpu_id --data_dir data --img_dir images --config config.json --exp_dir exp --year 2014
```

### Citation
If you find this code useful, please consider citing the original work by the authors:
```
@inproceedings{de2017modulating,
author = {Harm de Vries and Florian Strub and J\'er\'emie Mary and Hugo Larochelle and Olivier Pietquin and Aaron C. Courville},
title = {Modulating early visual processing by language},
booktitle = {Advances in Neural Information Processing Systems 30},
year = {2017},
url = {https://arxiv.org/abs/1707.00683}
}
```