https://github.com/jovanavidenovic/DAM4SAM

Official Implementation of the paper: "A Distractor-Aware Memory for Visual Object Tracking with SAM2"
https://github.com/jovanavidenovic/DAM4SAM

Last synced: 8 months ago
JSON representation

Official Implementation of the paper: "A Distractor-Aware Memory for Visual Object Tracking with SAM2"

Host: GitHub
URL: https://github.com/jovanavidenovic/DAM4SAM
Owner: jovanavidenovic
Created: 2024-11-26T14:31:10.000Z (8 months ago)
Default Branch: master
Last Pushed: 2024-11-26T23:55:08.000Z (8 months ago)
Last Synced: 2024-11-27T00:28:17.124Z (8 months ago)
Homepage:
Size: 3.03 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

Awesome-Segment-Anything - [code

README

        


# A Distractor-Aware Memory for 
 Visual Object Tracking with SAM2

[Jovana Videnović](https://www.linkedin.com/in/jovana-videnovi%C4%87-5a5b08169/), [Alan Lukežič](https://www.vicos.si/people/alan_lukezic/), and [Matej Kristan](https://www.vicos.si/people/matej_kristan/)

Faculty of Computer and Information Science, University of Ljubljana

[[`Arxiv`](https://arxiv.org/abs/2411.17576)]  [[`Project page`](https://jovanavidenovic.github.io/dam-4-sam/) ] [[`DiDi dataset`](#didi-a-distractor-distilled-dataset)]

https://github.com/user-attachments/assets/e90158ba-5c02-489d-9401-26f77f0592b0

https://github.com/user-attachments/assets/0203a96a-c5c9-46f8-90d6-2445d2c5ad73



## Abstract

Memory-based trackers such as SAM2 demonstrate remarkable performance, however still struggle with distractors. We propose a new plug-in distractor-aware memory (DAM) and management strategy that substantially improves tracking robustness. The new model is demonstrated on SAM2.1, leading to SAM2.1++, which sets a new state-of-the-art on six benchmarks, including the most challenging VOT/S benchmarks without additional training. We also propose a new distractor-distilled (DiDi) dataset to better study the distractor problem. See the [preprint](https://arxiv.org/abs/2411.17576) for more details.

## Installation

To set up the repository locally, follow these steps:

1. Clone the repository and navigate to the project directory:

    ```bash

    git clone https://github.com/jovanavidenovic/DAM4SAM.git

    cd DAM4SAM

    ```

2. Create a new conda environment and activate it:

   ```bash

    conda create -n dam4sam_env python=3.10.15

    conda activate dam4sam_env

    ```

3. Install torch and other dependencies:

   ```bash

   pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu121

   pip install -r requirements.txt

   ```

If you experience problems as mentioned here, including `ImportError: cannot import name '_C' from 'sam2'`, run the following command in the repository root:

    ```

    python setup.py build_ext --inplace

    ```

Note that you can still use the repository even with the warning above, but some postprocessing SAM2 steps may be skipped. For more information, consult [SAM2 installation instructions]().

## Getting started

Model checkpoints can be downloaded by running:

```bash

cd checkpoints && \

./download_ckpts.sh 

```

Our model configs are available in `sam2/` folder. 

## Running and evaluation

This repository supports evaluation on the following datasets: DiDi, VOT2020, VOT2022, LaSot, LaSoText and GoT-10k. Support for running on VOTS2024 will be added soon. 

### DiDi dataset

Run on the whole dataset and save results to disk:

```bash

CUDA_VISIBLE_DEVICES=0 python run_on_didi.py --dataset_path  --output_dir 

```

Run on a single sequence and visualize results:

```bash

CUDA_VISIBLE_DEVICES=0 python run_on_didi.py --dataset_path  --sequence 

```

### VOT2020 and VOT2022 Challenges

Create VOT workspace (for more info see instructions [here](https://www.votchallenge.net/howto/)). For VOT2020 use:

```bash

vot initialize vot2020/shortterm --workspace 

```

and for VOT2022 use:

```bash

vot initialize vot2022/shortterm --workspace 

```

You can use integration files from `vot_integration/vot2022_st` folder to run only on the selected experiment. We provided two stack files: one for the baseline and one for the real-time experiments. After workspace creation and tracker integration you can evaluate the tracker on VOT using the following commands:

```bash

vot evaluate --workspace  DAM4SAM

vot analysis --workspace  --format=json DAM4SAM

vot report --workspace  --format=html DAM4SAM

```

### Bounding box datasets

Running our tracker is supported on LaSot, LaSoText and GoT-10k datasets. Tracker is initialized with masks, which are obtained using SAM2 image predictor, from ground truth initialization bounding boxes. You can download them for all datasets at [this link](https://data.vicos.si/alanl/sam2_init_masks.zip). Before running the tracker, set the corresponding paths to the datasets and the directory with ground truth masks in dam4sam_config.yaml (in the repo root directory).

Run on the whole dataset and save results to disk (arguments for the argument  can be: `got | lasot | lasot_ext`):

```bash

CUDA_VISIBLE_DEVICES=0 python run_on_box_dataset.py --dataset_name= --output_dir=

```

Run on a single sequence and visualize results:

```bash

CUDA_VISIBLE_DEVICES=0 python run_on_box_dataset.py --dataset_name= --sequence=

```

## DiDi: A distractor-distilled dataset

DiDi is a distractor-distilled tracking dataset created to address the limitation of low distractor presence in current visual object tracking benchmarks. To enhance the evaluation and analysis of tracking performance amidst distractors, we have semi-automatically distilled several existing benchmarks into the DiDi dataset. The dataset is available for download at [this link](https://go.vicos.si/didi).

  



  Example frames from the DiDi dataset showing challenging distractors. Targets are denoted by green bounding boxes.



### Experimental results on DiDi

See [the project page](https://jovanavidenovic.github.io/dam-4-sam/) for qualitative comparison.

| Model         | Quality | Accuracy | Robustness |

|---------------|---------|----------|------------|

| TransT        | 0.465   | 0.669    | 0.678      |

| KeepTrack     | 0.502   | 0.646    | 0.748      |

| SeqTrack      | 0.529   | 0.714    | 0.718      |

| AQATrack      | 0.535   | 0.693    | 0.753      |

| AOT           | 0.541   | 0.622    | 0.852      |

| Cutie         | 0.575   | 0.704    | 0.776      |

| ODTrack       | 0.608   | 0.740 :1st_place_medal:	 | 0.809    |

| SAM2.1Long    | 0.646   | 0.719    | 0.883      |

| SAM2.1   | 0.649 :3rd_place_medal:	 | 0.720    | 0.887 :3rd_place_medal:	 |

| SAMURAI       | 0.680 :2nd_place_medal:	  | 0.722 :3rd_place_medal:	   | 0.930 :2nd_place_medal:	    |

| **SAM2.1++** (ours) | 0.694 :1st_place_medal:	 | 0.727 :2nd_place_medal:	 | 0.944 :1st_place_medal:	 |

## Acknowledgments

Our work is built on top of [SAM 2](https://github.com/facebookresearch/sam2?tab=readme-ov-file) by Meta FAIR.

## Citation

If you find our work useful for your research, please consider giving us a star and citing our work.

```bibtex

@article{videnovic_dam4sam,

    title = {A Distractor-Aware Memory for Visual Object Tracking with SAM2},

    author = {Jovana Videnovic and Alan Lukezic and Matej Kristan},

    year={2024},

    journal = {arXiv:2411.17576}

}

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/jovanavidenovic/DAM4SAM

Awesome Lists containing this project

README