Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/fartashf/vsepp
PyTorch Code for the paper "VSE++: Improving Visual-Semantic Embeddings with Hard Negatives"
- Host: GitHub
- URL: https://github.com/fartashf/vsepp
- Owner: fartashf
- License: apache-2.0
- Created: 2017-06-29T20:35:17.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2021-12-08T21:38:15.000Z (about 3 years ago)
- Last Synced: 2024-08-04T03:11:05.951Z (5 months ago)
- Topics: bmvc, negatives, paper, pytorch, vse
- Language: Python
- Homepage:
- Size: 51.8 KB
- Stars: 487
- Watchers: 15
- Forks: 125
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Improving Visual-Semantic Embeddings with Hard Negatives
Code for the image-caption retrieval methods from
**[VSE++: Improving Visual-Semantic Embeddings with Hard Negatives](https://arxiv.org/abs/1707.05612)**
*, F. Faghri, D. J. Fleet, J. R. Kiros, S. Fidler, Proceedings of the British Machine Vision Conference (BMVC), 2018. (BMVC Spotlight)*

## Dependencies

We recommend using Anaconda for the following packages (see the example environment sketch after this list).

* Python 2.7 (check out branch `python3` for Python 3)
* [PyTorch](http://pytorch.org/) (>0.2) (check out branch `pytorch4.1`)
* [NumPy](http://www.numpy.org/) (>1.12.1)
* [TensorBoard](https://github.com/TeamHG-Memex/tensorboard_logger)
* [pycocotools](https://github.com/cocodataset/cocoapi)
* [torchvision](https://github.com/pytorch/vision)
* [matplotlib](https://matplotlib.org/)
* Punkt Sentence Tokenizer:
```python
import nltk
# Non-interactive equivalent of launching nltk.download() and selecting "d punkt"
nltk.download('punkt')
```

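As a starting point, the dependencies above could be installed into a conda environment roughly as follows. This is only a sketch: the package names come from the list above, but the versions, channels, and exact install commands are assumptions and may need adjusting (in particular for Python 2.7 versus the `python3` branch).

```bash
# Sketch of an environment setup; versions and channels are assumptions.
conda create -n vsepp python=2.7
conda activate vsepp
conda install pytorch torchvision -c pytorch   # pick a PyTorch build matching the branch you use
pip install numpy nltk matplotlib pycocotools tensorboard_logger
```
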
## Download data

Download the dataset files and pre-trained models. We use splits produced by [Andrej Karpathy](http://cs.stanford.edu/people/karpathy/deepimagesent/). The precomputed image features are from [here](https://github.com/ryankiros/visual-semantic-embedding/) and [here](https://github.com/ivendrov/order-embedding). To use full image encoders, download the images from their original sources [here](http://nlp.cs.illinois.edu/HockenmaierGroup/Framing_Image_Description/KCCA.html), [here](http://shannon.cs.illinois.edu/DenotationGraph/) and [here](http://mscoco.org/).
```bash
wget http://www.cs.toronto.edu/~faghri/vsepp/vocab.tar
wget http://www.cs.toronto.edu/~faghri/vsepp/data.tar
wget http://www.cs.toronto.edu/~faghri/vsepp/runs.tar
```

We refer to the path of the extracted files for `data.tar` as `$DATA_PATH` and
the files for `runs.tar` as `$RUN_PATH`. Extract `vocab.tar` to the `./vocab`
directory.

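For example, the archives could be extracted and the two path variables set as below. The layout inside the archives and the target directories are assumptions; adjust the paths to wherever the files actually end up.

```bash
# Assumed layout: the archives unpack into ./data, ./runs and ./vocab.
tar -xvf data.tar
tar -xvf runs.tar
tar -xvf vocab.tar
export DATA_PATH="$PWD/data"
export RUN_PATH="$PWD/runs"
```
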
*Update: The vocabulary was originally built using all sets (including test set
captions). Please see issue #29 for details. Please consider not using test set
captions if building on this project.*

## Evaluate pre-trained models

```python
python -c "\
from vocab import Vocabulary
import evaluation
evaluation.evalrank('$RUN_PATH/coco_vse++/model_best.pth.tar', data_path='$DATA_PATH', split='test')"
```

To do cross-validation on MSCOCO, pass `fold5=True` with a model trained using
`--data_name coco`.

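For example, a 5-fold cross-validation run could look like the sketch below, which mirrors the evaluation call above; the run directory is illustrative and should point to a model trained with `--data_name coco`.

```bash
# Sketch: same evalrank call as above, with fold5=True for 5-fold cross-validation.
python -c "\
from vocab import Vocabulary
import evaluation
evaluation.evalrank('$RUN_PATH/coco_vse++/model_best.pth.tar', data_path='$DATA_PATH', split='test', fold5=True)"
```
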
## Training new models

Run `train.py`:

```bash
python train.py --data_path "$DATA_PATH" --data_name coco_precomp --logger_name runs/coco_vse++ --max_violation
```

Arguments used to train the pre-trained models:

| Method | Arguments |
| :-------: | :-------: |
| VSE0 | `--no_imgnorm` |
| VSE++ | `--max_violation` |
| Order0 | `--measure order --use_abs --margin .05 --learning_rate .001` |
| Order++ | `--measure order --max_violation` |

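For example, the VSE0 and Order++ rows of the table correspond to commands like the following; the `--logger_name` run directories are illustrative.

```bash
# VSE0 baseline: disable image embedding normalization
python train.py --data_path "$DATA_PATH" --data_name coco_precomp --logger_name runs/coco_vse0 --no_imgnorm

# Order++: order-violation measure with the hardest-negative loss
python train.py --data_path "$DATA_PATH" --data_name coco_precomp --logger_name runs/coco_order++ --measure order --max_violation
```
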
## Reference

If you found this code useful, please cite the following paper:

    @article{faghri2018vse++,
      title={VSE++: Improving Visual-Semantic Embeddings with Hard Negatives},
      author={Faghri, Fartash and Fleet, David J and Kiros, Jamie Ryan and Fidler, Sanja},
      booktitle = {Proceedings of the British Machine Vision Conference ({BMVC})},
      url = {https://github.com/fartashf/vsepp},
      year={2018}
    }

## License

[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)