# MAT
[Representation Learning for Visual Object Tracking by Masked Appearance Transfer](./misc/CVPR_23_MAT_Final.pdf)

---

![MAT](./misc/MAT.png)

## **Installation**

- Clone the repository locally

- Create a conda environment

```shell
conda env create -f env.yaml
conda activate mat
pip install --upgrade git+https://github.com/got-10k/toolkit.git@master
# (You can use `pipreqs ./root` to analyse the requirements of this project.)
```
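
After the environment is created, a quick sanity check that PyTorch sees your GPUs (the distributed training below assumes a CUDA-enabled build):

```shell
python -c "import torch; print(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count())"
```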

## **Training**

- Prepare the training data:
We use LMDB to store the training data. Please check the `./data/parse_` scripts and generate the LMDB datasets (see the sketch after this list).
- Specify the dataset paths in `./lib/register/paths.py`.
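
For orientation, here is a minimal sketch of packing encoded frames into an LMDB file with the `lmdb` package; the actual key/value layout used by the `./data/parse_` scripts may differ:

```python
import lmdb

def write_lmdb(image_paths, lmdb_path, map_size=1 << 40):
    """Pack encoded image bytes into an LMDB file (illustrative layout only)."""
    env = lmdb.open(lmdb_path, map_size=map_size)  # map_size: maximum DB size in bytes
    with env.begin(write=True) as txn:
        for i, path in enumerate(image_paths):
            with open(path, 'rb') as f:
                # Store the encoded JPEG/PNG bytes under a zero-padded index key;
                # the data loader decodes them at read time.
                txn.put(f'{i:08d}'.encode('ascii'), f.read())
    env.close()
```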

All of our models are trained on a single machine with two RTX 3090 GPUs. For distributed training on a single node with 2 GPUs:

- MAT pre-training
```shell
python -m torch.distributed.launch --nproc_per_node=2 train.py --experiment=translate_template --train_set=common_pretrain
```
- Tracker training

Modify `cfg.model.backbone.weights` in `./config/cfg_translation_track.py` to point to the last checkpoint from MAT pre-training.
```shell
python -m torch.distributed.launch --nproc_per_node=2 train.py --experiment=translate_track --train_set=common
```
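
For reference, that modification looks roughly like the following; the file name `checkpoint_300.pth` is hypothetical and stands in for whatever your last pre-training checkpoint is actually called:

```python
# in ./config/cfg_translation_track.py
# Point the backbone at the final MAT pre-training checkpoint.
# 'checkpoint_300.pth' is a hypothetical name -- substitute your own.
cfg.model.backbone.weights = './checkpoints/translate_template_common_pretrain/checkpoint_300.pth'
```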

Arguments:

- `-e` or `--experiment`: the name of the experiment; check `./lib/register/experiments.py` for more information about each experiment.
- `-t` or `--train_set`: the name of the training set; check `./lib/register/dataset.py` for more information about each training set.
- `--resume_epoch`: resume from this epoch; for example, `100` indicates that we load `checkpoint_100.pth` and resume training.
- `--pretrain_name`: the full name of the pre-trained model file; for example, `checkpoint_100.pth` indicates that we load `./pretrain/checkpoint_100.pth`.
- `--pretrain_lr_mult`: load pre-trained weights and fine-tune them with `pretrain_lr = pretrain_lr_mult * base_lr`.
- `--pretrain_exclude`: a keyword for pre-trained parameter names to discard; for example, `head` indicates that we do not load pre-trained weights whose names contain `head`.
- `--gpu_id`: sets `CUDA_VISIBLE_DEVICES`.
- `--find_unused`: used in DDP mode.

## **Evaluation**

We provide a multi-process testing script for evaluation on several benchmarks.
Please modify the dataset paths in `./lib/register/paths.py`.
Download the tracker [checkpoint](https://drive.google.com/file/d/1rQ_hWsd0ZlBax224V443M42aNGasfKdR/view?usp=share_link) into `./checkpoints/translate_track_common/`, and the pre-training [checkpoint](https://drive.google.com/file/d/1mWpQR_96GBcB4sAAcopkHJ9b8N9S1a3F/view?usp=sharing) into `./checkpoints/translate_template_common_pretrain/`.
```shell
python test.py --gpu_id=0,1 --num_process=0 --experiment=translate_track --train_set=common --benchmark=lasot --vis
```
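
Before a long evaluation run, it can be worth confirming that a downloaded checkpoint loads cleanly. A small check, assuming the file is a standard `torch.save` archive (the file name below is hypothetical; use the name of the checkpoint you actually downloaded):

```python
import torch

# Hypothetical file name -- substitute the actual downloaded checkpoint.
ckpt = torch.load('./checkpoints/translate_track_common/checkpoint_300.pth', map_location='cpu')
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt)[:10])  # peek at the top-level keys
```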

Arguments:

- `-e` or `--experiment`: the name of the experiment; check `./lib/register/experiments.py` for more information about each experiment.
- `-t` or `--train_set`: the name of the training set; check `./lib/register/dataset.py` for more information about each training set.
- `-b` or `--benchmark`: the name of the benchmark; check `./lib/register/benchmarks.py` for more information about each benchmark.
- `--test_epoch`: test the checkpoint of this epoch; the default value `300` indicates that we load weights from the last epoch.
- `--num_process`: the maximum number of processes at a time; set `0` for a single-process test.
- `--gpu_id`: sets `CUDA_VISIBLE_DEVICES`.
- `--vis`: show the tracking result.

---

## Citation

```bibtex
@InProceedings{Zhao_2023_CVPR,
author = {Zhao, Haojie and Wang, Dong and Lu, Huchuan},
title = {Representation Learning for Visual Object Tracking by Masked Appearance Transfer},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
pages = {18696-18705}
}
```

## **Acknowledgments**

- Thanks to the great [MAE](https://github.com/facebookresearch/mae),
[MixFormer](https://github.com/MCG-NJU/MixFormer),
and [pysot](https://github.com/STVIR/pysot) projects.
- For data augmentation, we use [Albumentations](https://github.com/albumentations-team/albumentations).

## **License**

This work is released under the **GPL 3.0 license**. Please see the
LICENSE file for more information.