https://github.com/billpsomas/metrix

This repo contains the official implementation of ICLR 2022 paper "It Takes Two to Tango: Mixup for Deep Metric Learning".

# It Takes Two to Tango: Mixup for Deep Metric Learning

This repo contains the official PyTorch implementation and pretrained models of our ICLR 2022 paper - **It Takes Two to Tango: Mixup for Deep Metric Learning**. [[`arXiv`](https://arxiv.org/abs/2106.04990)] [[`OpenReview`](https://openreview.net/forum?id=ZKy2X3dgPA)] [[`video`](https://iclr.cc/virtual/2022/poster/6337)] [[`slides`](.github/slides.pdf)] [[`poster`](.github/poster.pdf)]


[Figure: Metrix illustration]

## Datasets
Please download:

- [CUB](https://data.caltech.edu/records/65de6-vp158/files/CUB_200_2011.tgz?download=1)
- Cars [images](http://ai.stanford.edu/~jkrause/car196/car_ims.tgz), [annotations](http://ai.stanford.edu/~jkrause/car196/cars_annos.mat)
- [SOP](https://cvgl.stanford.edu/projects/lifted_struct/)
- [InShop](https://drive.google.com/drive/folders/0B7EVK8r0v71pVDZFQXRsMDZCX1E?resourcekey=0-4R4v6zl4CWhHTsUGOsTstw&usp=share_link)

Extract the .tgz or .zip files into the same folder, e.g. `./datasets/`. You should end up with a folder structure like this:

- datasets
  - CUB_200_2011
  - cars196
  - Stanford_Online_Products
  - InShop_Clothes
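
Before training, it can help to confirm that the extracted folders are where the commands expect them. The following is a minimal sketch, not part of the repository: the helper `missing_dataset_folders` is hypothetical, and the folder names are copied from the layout above.

```python
from pathlib import Path

# Folder names copied from the layout above; the helper itself is hypothetical
# and not part of the repository.
EXPECTED_FOLDERS = (
    "CUB_200_2011",
    "cars196",
    "Stanford_Online_Products",
    "InShop_Clothes",
)


def missing_dataset_folders(data_root):
    """Return the expected dataset folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_folders("./datasets/")
    print("All dataset folders found." if not missing else f"Missing: {missing}")
```

An empty list means every expected folder exists under the given root. The check only looks at folder names, not at their contents.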

---
## Training
### Installation
Please install [PyTorch](https://pytorch.org/). The experiments have been performed with Python version 3.7.6, PyTorch version 1.7.0, CUDA 10.1 and torchvision 0.8.1.

The requirements can be installed with [Anaconda](https://www.anaconda.com/distribution/#download-section). The commands below create a conda environment called `metrix` and install all the necessary libraries:

```bash
conda create -n metrix python=3.7.6
conda activate metrix
conda install pytorch==1.7.0 torchvision==0.8.1 cudatoolkit=10.1 pillow==8.0.1 -c pytorch
pip install scikit-learn==0.23.2 munkres==1.1.4 tqdm==4.62.3 scipy==1.7.3 pytorch_metric_learning==1.3.0
```
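
To confirm that the environment matches the pinned versions, a check along the following lines may be useful. It is a sketch under assumptions: the helpers `installed_version` and `pin_mismatches` are hypothetical, and the distribution names are guesses at how the packages above are registered with Python (for example, conda's `pytorch` package is assumed to appear as `torch`).

```python
# Pins copied from the install commands above. The distribution names are an
# assumption: conda's "pytorch" package is assumed to be registered as "torch".
PINNED = {
    "torch": "1.7.0",
    "torchvision": "0.8.1",
    "pillow": "8.0.1",
    "scikit-learn": "0.23.2",
    "munkres": "1.1.4",
    "tqdm": "4.62.3",
    "scipy": "1.7.3",
    "pytorch-metric-learning": "1.3.0",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:  # Python 3.7, as used in the environment above
        import pkg_resources
        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def pin_mismatches(pinned=PINNED, lookup=installed_version):
    """Map each mismatching distribution to (expected, installed)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        # Ignore local build tags (anything after "+") when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `pin_mismatches()` inside the `metrix` environment should return an empty dictionary if every pin is satisfied. Any entry it does return shows the expected version next to the installed one, or `None` if the package is missing.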

---

### Baseline Contrastive
Train baseline Contrastive with ResNet-50 and an embedding size of 512 for 60 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_gamma 0.1 --loss contrastive --mode baseline --alpha 2.0 --save_model True
```

Train baseline Contrastive with ResNet-50 and an embedding size of 512 for 60 epochs on the Cars dataset:

```bash
python3 main.py --dataset cars --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_gamma 0.1 --loss contrastive --mode baseline --alpha 2.0 --save_model True
```

Train baseline Contrastive with ResNet-50 and an embedding size of 512 for 60 epochs on the SOP dataset:

```bash
python3 main.py --dataset sop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 3e-5 --lr_decay_gamma 0.25 --bn_freeze 0 --loss contrastive --images_per_class 5 --mode baseline --alpha 2.0 --save_model True
```

Train baseline Contrastive with ResNet-50 and an embedding size of 512 for 60 epochs on the InShop dataset:

```bash
python3 main.py --dataset inshop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 3e-5 --lr_decay_step 5 --lr_decay_gamma 0.25 --warm 1 --bn_freeze 0 --loss contrastive --images_per_class 5 --mode baseline --alpha 2.0 --save_model True
```

Note that the argument `--mode` has been set to `baseline` here, as we are running the baseline contrastive experiments.

---

### NOTE
`Metrix`, our mixup method for deep metric learning, can be applied in the `input`, `feature` or `embedding` space. In our paper, we show that Metrix in feature space performs best, so for simplicity we call this variant `Metrix` rather than `Metrix/feature`. Metrix in input space is called `Metrix/input`, and Metrix in embedding space is called `Metrix/embed`. In general, `Metrix/input` is computationally expensive because the mixup takes place between images, while `Metrix/embed` is very efficient because the mixup takes place between low-dimensional vectors.
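
The interpolation at the core of mixup can be sketched in a few lines. This is an illustration of standard mixup under stated assumptions, not the repository's implementation: the interpolation factor is assumed to be drawn from Beta(α, α), the link between `--alpha` and that distribution is an assumption, and the 3×224×224 input size is chosen only to show the difference in scale. The helpers `mixup` and `sample_lambda` are hypothetical names.

```python
import random


def mixup(x_a, x_b, lam):
    """Convex combination lam * x_a + (1 - lam) * x_b of two equal-length vectors."""
    if len(x_a) != len(x_b):
        raise ValueError("inputs must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


def sample_lambda(alpha=2.0, rng=random):
    """Draw an interpolation factor from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


# Values interpolated per mixed pair. The 3x224x224 input size is an assumption;
# 512 is the embedding size used in the commands of this README.
INPUT_VALUES = 3 * 224 * 224  # 150528
EMBED_VALUES = 512

if __name__ == "__main__":
    lam = sample_lambda()
    print(mixup([1.0, 0.0], [0.0, 1.0], lam))  # equals [lam, 1 - lam]
    print(INPUT_VALUES // EMBED_VALUES)        # 294
```

Under these assumptions, one interpolation in input space touches 294 times as many values as one in a 512-dimensional embedding space, which gives a rough sense of why the two variants differ in cost.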


[Figure: Metrix illustration]

---

### Contrastive + Metrix
Train Contrastive + Metrix with ResNet-50 and an embedding size of 512 for 60 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_gamma 0.1 --loss contrastive --mode feature --alpha 2.0 --save_model True
```

Train Contrastive + Metrix with ResNet-50 and an embedding size of 512 for 60 epochs on the Cars dataset:

```bash
python3 main.py --dataset cars --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_gamma 0.1 --loss contrastive --mode feature --alpha 2.0 --save_model
```

Train Contrastive + Metrix with ResNet-50 and an embedding size of 512 for 60 epochs on the SOP dataset:

```bash
python3 main.py --dataset sop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 3e-5 --lr_decay_gamma 0.25 --loss contrastive --images_per_class 5 --mode feature --alpha 2.0 --save_model
```

Train Contrastive + Metrix with ResNet-50 and an embedding size of 512 for 60 epochs on the InShop dataset:

```bash
python3 main.py --dataset inshop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_gamma 0.25 --loss contrastive --images_per_class 5 --mode feature --alpha 2.0 --save_model
```

> For **Contrastive + Metrix/input** or **Contrastive + Metrix/embed**, set `--mode input` or `--mode embed`, respectively.
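
When running several variants, the mode can be swapped programmatically instead of editing each command by hand. The helper below is a hypothetical sketch, not part of the repository; it only rewrites the value that follows `--mode` and assumes the four mode values that appear in this README.

```python
import shlex

# Mode values taken from the commands and notes in this README.
MODES = ("baseline", "feature", "input", "embed")


def with_mode(command, mode):
    """Return `command` with the value following --mode replaced by `mode`."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    args = shlex.split(command)
    index = args.index("--mode")  # raises ValueError if the flag is absent
    args[index + 1] = mode
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    cmd = "python3 main.py --dataset cub --loss contrastive --mode feature --alpha 2.0"
    for mode in ("input", "embed"):
        print(with_mode(cmd, mode))
```

All other arguments are passed through unchanged, so the hyperparameters of the original command are preserved.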

Below we present the expected results per method and dataset:



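| Method | CUB200 R@1 | CUB200 R@2 | CUB200 R@4 | CARS196 R@1 | CARS196 R@2 | CARS196 R@4 | SOP R@1 | SOP R@10 | SOP R@100 | IN-SHOP R@1 | IN-SHOP R@10 | IN-SHOP R@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline Contrastive | 64.7 | 75.9 | 84.6 | 81.6 | 88.2 | 92.7 | 74.9 | 87.0 | 93.9 | 86.4 | 94.7 | 96.2 |
| Contrastive + Metrix | 67.4 | 77.9 | 85.7 | 85.1 | 91.1 | 94.6 | 77.5 | 89.1 | 95.5 | 89.1 | 95.7 | 97.1 |
| Contrastive + Metrix/input | 66.3 | 77.1 | 85.2 | 82.9 | 89.3 | 93.7 | 75.8 | 87.8 | 94.6 | 87.7 | 95.9 | 96.5 |
| Contrastive + Metrix/embed | 66.4 | 77.6 | 85.4 | 83.9 | 90.3 | 94.1 | 76.7 | 88.6 | 95.2 | 88.4 | 95.4 | 95.8 |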
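
The R@K columns report Recall@K. The sketch below assumes the definition commonly used in deep metric learning: a query counts as a hit if at least one of its K nearest neighbors, excluding the query itself, shares its label. The function `recall_at_k` is a hypothetical illustration; the repository's evaluation code may differ, for example in the distance it uses or in how query and gallery sets are split.

```python
def recall_at_k(embeddings, labels, k):
    """Percentage of queries with a same-label item among their k nearest neighbors."""
    hits = 0
    for i, query in enumerate(embeddings):
        # Squared Euclidean distance to every other item (the query itself is excluded).
        dists = [
            (sum((q - x) ** 2 for q, x in zip(query, other)), j)
            for j, other in enumerate(embeddings)
            if j != i
        ]
        nearest = sorted(dists)[:k]
        if any(labels[j] == labels[i] for _, j in nearest):
            hits += 1
    return 100.0 * hits / len(embeddings)


if __name__ == "__main__":
    embeddings = [[0.0], [1.0], [1.5], [5.0]]
    labels = [0, 0, 1, 1]
    print(recall_at_k(embeddings, labels, 1))  # 50.0
    print(recall_at_k(embeddings, labels, 2))  # 75.0
```

In the toy example, two of the four points have a same-label nearest neighbor, so R@1 is 50.0. Widening to two neighbors adds one more hit, giving 75.0.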

---
### Baseline MultiSimilarity
Train baseline MultiSimilarity with ResNet-50 and an embedding size of 512 for 60 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_step 5 --lr_decay_gamma 0.5 --loss multisimilarity --mode baseline --alpha 2.0 --save_model
```

Train baseline MultiSimilarity with ResNet-50 and an embedding size of 512 for 60 epochs on the Cars dataset:

```bash
python3 main.py --dataset cars --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_step 5 --lr_decay_gamma 0.5 --loss multisimilarity --mode baseline --alpha 2.0 --save_model
```

Train baseline MultiSimilarity with ResNet-50 and an embedding size of 512 for 60 epochs on the SOP dataset:

```bash
python3 main.py --dataset sop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 6e-4 --lr_decay_step 20 --lr_decay_gamma 0.25 --warm 1 --images_per_class 5 --bn_freeze 0 --loss multisimilarity --mode baseline --alpha 2.0 --save_model
```

Train baseline MultiSimilarity with ResNet-50 and an embedding size of 512 for 60 epochs on the InShop dataset:

```bash
python3 main.py --dataset inshop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 6e-4 --lr_decay_step 20 --lr_decay_gamma 0.25 --warm 1 --images_per_class 5 --bn_freeze 0 --loss multisimilarity --mode baseline --alpha 2.0 --save_model
```

---
### MultiSimilarity + Metrix
Train MultiSimilarity + Metrix with ResNet-50 and an embedding size of 512 for 100 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 100 --lr 1e-4 --lr_decay_gamma 0.5 --loss multisimilarity --mode feature --alpha 2.0 --save_model
```

> For **MultiSimilarity + Metrix/input** or **MultiSimilarity + Metrix/embed**, set `--mode input` or `--mode embed`, respectively.

> For the Cars, SOP or InShop datasets, set `--dataset cars`, `--dataset sop` or `--dataset inshop`, respectively, and use the same hyperparameters as in the respective baseline experiment.

Below we present the expected results per method and dataset:



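| Method | CUB200 R@1 | CUB200 R@2 | CUB200 R@4 | CARS196 R@1 | CARS196 R@2 | CARS196 R@4 | SOP R@1 | SOP R@10 | SOP R@100 | IN-SHOP R@1 | IN-SHOP R@10 | IN-SHOP R@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline MultiSimilarity | 67.8 | 77.8 | 85.6 | 87.8 | 92.7 | 95.3 | 76.9 | 89.8 | 95.9 | 90.1 | 97.6 | 98.4 |
| MultiSimilarity + Metrix | 71.4 | 80.6 | 86.8 | 89.6 | 94.2 | 96.0 | 81.0 | 92.0 | 97.2 | 92.2 | 98.5 | 98.6 |
| MultiSimilarity + Metrix/input | 69.0 | 79.1 | 86.0 | 89.0 | 93.4 | 96.0 | 77.9 | 90.6 | 95.9 | 91.8 | 98.0 | 98.9 |
| MultiSimilarity + Metrix/embed | 70.2 | 80.4 | 86.7 | 88.8 | 92.9 | 95.6 | 78.5 | 91.3 | 96.7 | 91.9 | 98.3 | 98.7 |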

---
### Baseline ProxyAnchor
Train baseline ProxyAnchor with ResNet-50 and an embedding size of 512 for 60 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_step 5 --lr_decay_gamma 0.5 --loss proxyanchor --mode baseline --alpha 2.0 --save_model
```

Train baseline ProxyAnchor with ResNet-50 and an embedding size of 512 for 60 epochs on the Cars dataset:

```bash
python3 main.py --dataset cars --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 1e-4 --lr_decay_step 5 --lr_decay_gamma 0.5 --loss proxyanchor --mode baseline --alpha 2.0 --save_model
```

Train baseline ProxyAnchor with ResNet-50 and an embedding size of 512 for 60 epochs on the SOP dataset:

```bash
python3 main.py --dataset sop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 6e-4 --lr_decay_step 20 --lr_decay_gamma 0.25 --warm 1 --images_per_class 5 --bn_freeze 0 --loss proxyanchor --mode baseline --alpha 2.0 --save_model
```

Train baseline ProxyAnchor with ResNet-50 and an embedding size of 512 for 60 epochs on the InShop dataset:

```bash
python3 main.py --dataset inshop --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 60 --lr 6e-4 --lr_decay_step 20 --lr_decay_gamma 0.25 --warm 1 --images_per_class 5 --bn_freeze 0 --loss proxyanchor --mode baseline --alpha 2.0 --save_model
```

---

### ProxyAnchor + Metrix
Train ProxyAnchor + Metrix with ResNet-50 and an embedding size of 512 for 100 epochs on the CUB dataset:

```bash
python3 main.py --dataset cub --data_root /path/to/datasets/ --save_root /path/to/output/ --batch_size 100 --num_workers 4 --embedding_size 512 --num_epochs 100 --lr 1e-4 --lr_decay_gamma 0.5 --loss proxyanchor --mode feature --alpha 2.0 --save_model
```

> For **ProxyAnchor + Metrix/input** or **ProxyAnchor + Metrix/embed**, set `--mode input` or `--mode embed`, respectively.

> For the Cars, SOP or InShop datasets, set `--dataset cars`, `--dataset sop` or `--dataset inshop`, respectively, and use the same hyperparameters as in the respective baseline experiment.

Below we present the expected results per method and dataset:



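| Method | CUB200 R@1 | CUB200 R@2 | CUB200 R@4 | CARS196 R@1 | CARS196 R@2 | CARS196 R@4 | SOP R@1 | SOP R@10 | SOP R@100 | IN-SHOP R@1 | IN-SHOP R@10 | IN-SHOP R@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline ProxyAnchor | 69.5 | 79.3 | 87.0 | 87.6 | 92.3 | 95.5 | 79.1 | 90.8 | 96.2 | 90.0 | 97.4 | 98.2 |
| ProxyAnchor + Metrix | 71.0 | 81.8 | 88.2 | 89.1 | 93.6 | 96.7 | 81.3 | 91.7 | 96.9 | 91.9 | 98.2 | 98.8 |
| ProxyAnchor + Metrix/input | 70.5 | 81.2 | 87.8 | 88.2 | 93.2 | 96.2 | 79.8 | 91.4 | 96.5 | 90.9 | 98.1 | 98.4 |
| ProxyAnchor + Metrix/embed | 70.4 | 81.1 | 87.9 | 88.9 | 93.3 | 96.4 | 80.6 | 91.7 | 96.6 | 91.6 | 98.3 | 98.3 |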
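
As a worked reading of the three results tables, the snippet below computes the absolute R@1 gain of Metrix over each baseline on CUB200. The numbers are copied from the tables in this README; the helper `absolute_gains` is a hypothetical name used only for this illustration.

```python
# CUB200 R@1 values copied from the three tables in this README:
# (baseline, baseline + Metrix) for each loss.
CUB_R1 = {
    "Contrastive": (64.7, 67.4),
    "MultiSimilarity": (67.8, 71.4),
    "ProxyAnchor": (69.5, 71.0),
}


def absolute_gains(results):
    """Return the gain of Metrix over the baseline, in percentage points."""
    return {loss: round(metrix - base, 1) for loss, (base, metrix) in results.items()}


print(absolute_gains(CUB_R1))
# {'Contrastive': 2.7, 'MultiSimilarity': 3.6, 'ProxyAnchor': 1.5}
```

The same dictionary layout can be filled with any other column of the tables to compare datasets or recall levels.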

---

### Common Errors
If you face any errors, don't hesitate to open an issue. We will highlight them here.

## Acknowledgement

This repository builds on the [Proxy Anchor](https://github.com/tjddus9597/Proxy-Anchor-CVPR2020), [PyTorch Metric Learning](https://github.com/KevinMusgrave/pytorch-metric-learning) and [DML Benchmark](https://github.com/billpsomas/Deep_Metric_Learning_Pytorch) repositories.

## License
This repository is released under the MIT License as found in the [LICENSE](LICENSE) file.

## Citation
If you find this repository useful, please consider giving a star :star: and citation:
```
@inproceedings{
venkataramanan2022it,
title={It Takes Two to Tango: Mixup for Deep Metric Learning},
author={Shashanka Venkataramanan and Bill Psomas and Ewa Kijak and Laurent Amsaleg and Konstantinos Karantzalos and Yannis Avrithis},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=ZKy2X3dgPA}
}
```