https://github.com/timmeinhardt/trackformer

Implementation of "TrackFormer: Multi-Object Tracking with Transformers". [Conference on Computer Vision and Pattern Recognition (CVPR), 2022]

Host: GitHub
URL: https://github.com/timmeinhardt/trackformer
Owner: timmeinhardt
License: apache-2.0
Created: 2021-02-11T12:03:03.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2023-09-12T14:52:40.000Z (almost 2 years ago)
Last Synced: 2024-08-03T22:14:03.481Z (12 months ago)
Language: Python
Homepage: https://arxiv.org/abs/2101.02702
Size: 35 MB
Stars: 490
Watchers: 15
Forks: 113
Open Issues: 62
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md


# TrackFormer: Multi-Object Tracking with Transformers

This repository provides the official implementation of the [TrackFormer: Multi-Object Tracking with Transformers](https://arxiv.org/abs/2101.02702) paper by [Tim Meinhardt](https://dvl.in.tum.de/team/meinhardt/), [Alexander Kirillov](https://alexander-kirillov.github.io/), [Laura Leal-Taixe](https://dvl.in.tum.de/team/lealtaixe/) and [Christoph Feichtenhofer](https://feichtenhofer.github.io/). The codebase builds upon [DETR](https://github.com/facebookresearch/detr), [Deformable DETR](https://github.com/fundamentalvision/Deformable-DETR) and [Tracktor](https://github.com/phil-bergmann/tracking_wo_bnw).

## Abstract

The challenging task of multi-object tracking (MOT) requires simultaneous reasoning about track initialization, identity, and spatiotemporal trajectories.

We formulate this task as a frame-to-frame set prediction problem and introduce TrackFormer, an end-to-end MOT approach based on an encoder-decoder Transformer architecture.

Our model achieves data association between frames via attention by evolving a set of track predictions through a video sequence.

The Transformer decoder initializes new tracks from static object queries and autoregressively follows existing tracks in space and time with the new concept of identity preserving track queries.

Both decoder query types benefit from self- and encoder-decoder attention on global frame-level features, thereby omitting any additional graph optimization and matching or modeling of motion and appearance.

TrackFormer represents a new tracking-by-attention paradigm and yields state-of-the-art performance on the task of multi-object tracking (MOT17) and segmentation (MOTS20).
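
As an illustration of the frame-to-frame loop described above, the following is a minimal, self-contained sketch and not the code in this repository. The `Query` class, `track_sequence`, `toy_decoder`, the 1-D "embeddings" and the `keep_thresh` value are all hypothetical. In particular, `toy_decoder` replaces the Transformer decoder with a simple nearest-position rule so that the example runs without a trained model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Query:
    embedding: float                # toy 1-D stand-in for a query embedding
    track_id: Optional[int] = None  # None marks a static object query


def track_sequence(frames, decoder, num_object_queries=3, keep_thresh=0.5):
    """Return, for every frame, a dict {track_id: output embedding}."""
    track_queries: List[Query] = []
    next_id = 0
    results = []
    for frame in frames:
        # Static object queries are the same for every frame;
        # track queries are carried over from the previous frame.
        object_queries = [Query(embedding=0.0) for _ in range(num_object_queries)]
        queries = track_queries + object_queries
        outputs = decoder(frame, queries)  # one (embedding, score) per query

        track_queries = []
        for query, (embedding, score) in zip(queries, outputs):
            if score < keep_thresh:
                continue  # a track ended, or an object query found nothing
            track_id = query.track_id
            if track_id is None:  # an object query fired: start a new track
                track_id = next_id
                next_id += 1
            # The output embedding becomes the track query for the next frame.
            track_queries.append(Query(embedding, track_id))
        results.append({q.track_id: q.embedding for q in track_queries})
    return results


def toy_decoder(frame, queries):
    """Hypothetical decoder stand-in; `frame` is a list of 1-D object positions."""
    free = list(frame)
    outputs = []
    for q in queries:
        if q.track_id is not None:
            # A track query follows the closest unclaimed position within 1.0.
            near = [p for p in free if abs(p - q.embedding) <= 1.0]
            pick = min(near, key=lambda p: abs(p - q.embedding)) if near else None
        else:
            # An object query picks up any position no track query claimed.
            pick = free[0] if free else None
        if pick is None:
            outputs.append((q.embedding, 0.1))  # low score: nothing found
        else:
            free.remove(pick)
            outputs.append((pick, 0.9))         # high score: object found
    return outputs


frames = [[0.0, 5.0], [0.5, 5.5], [1.0], [1.5, 9.0]]
for t, tracks in enumerate(track_sequence(frames, toy_decoder)):
    print(t, tracks)
```

In this toy run, two tracks start in the first frame, one of them disappears in the third frame, and the object that appears in the last frame receives a new identity.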

## Installation

We refer to our [docs/INSTALL.md](docs/INSTALL.md) for detailed installation instructions.

## Train TrackFormer

We refer to our [docs/TRAIN.md](docs/TRAIN.md) for detailed training instructions.

## Evaluate TrackFormer

To evaluate TrackFormer on a multi-object tracking dataset, we provide the `src/track.py` script, which supports several datasets and splits interchangeably via the `dataset_name` argument (see `src/datasets/tracking/factory.py` for an overview of all datasets). The default tracking configuration is specified in `cfgs/track.yaml`. To facilitate the reproducibility of our results, we provide evaluation metrics for both the train and test set.
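
The `key=value` arguments that follow `with` in the commands of this section can be read as overrides of the default configuration, where a dotted key such as `tracker_cfg.public_detections` addresses a nested entry. The sketch below only illustrates this override semantics on a plain dictionary. `apply_overrides` is a hypothetical helper, the default values are placeholders rather than the contents of `cfgs/track.yaml`, and the values are kept as plain strings, which may differ from how the script itself parses them.

```python
import copy


def apply_overrides(config, overrides):
    """Return a copy of `config` with `a.b=value` style overrides applied."""
    updated = copy.deepcopy(config)
    for item in overrides:
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = updated
        for name in parents:
            # Walk down into nested sections, creating them if missing.
            node = node.setdefault(name, {})
        node[leaf] = value
    return updated


# Placeholder defaults (assumption); the real ones live in cfgs/track.yaml.
defaults = {
    "dataset_name": "<default>",
    "tracker_cfg": {"public_detections": "<default>"},
}

cfg = apply_overrides(
    defaults,
    ["dataset_name=MOT20-ALL", "tracker_cfg.public_detections=min_iou_0_5"],
)
print(cfg)
```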

### MOT17

#### Private detections

```
python src/track.py with reid
```

| MOT17     | MOTA         | IDF1           |       MT     |     ML     |     FP       |     FN              |  ID SW.      |
|  :---:    | :---:        |     :---:      |    :---:     | :---:      |    :---:     |   :---:             |  :---:       |
| **Train** |     74.2     |     71.7       |     849      | 177        |      7431    |      78057          |  1449        |
| **Test**  |     74.1     |     68.0       |    1113      | 246        |     34602    |     108777          |  2829        |
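
For readers unfamiliar with the metrics, MOTA combines the three error counts of the table as MOTA = 1 - (FP + FN + ID SW.) / GT, where GT is the number of ground-truth boxes. The sketch below recomputes the Train value from the row above. The ground-truth count of 336,891 boxes is an assumption that is not stated in this README (it is taken as the MOT17 train total over the three detection sets), and `mota` is a hypothetical helper, not a function of this repository.

```python
def mota(false_positives, false_negatives, id_switches, num_gt_boxes):
    """Multi-object tracking accuracy in percent.

    MOTA = 100 * (1 - (FP + FN + ID switches) / number of ground-truth boxes)
    """
    if num_gt_boxes <= 0:
        raise ValueError("num_gt_boxes must be positive")
    errors = false_positives + false_negatives + id_switches
    return 100.0 * (1.0 - errors / num_gt_boxes)


# FP, FN and ID SW. are copied from the Train row of the table above.
# num_gt_boxes is an assumed value, see the note in the text.
train_mota = mota(
    false_positives=7431,
    false_negatives=78057,
    id_switches=1449,
    num_gt_boxes=336_891,
)
print(round(train_mota, 1))
```

Because the errors are summed over all frames before normalizing, MOTA can become negative when a tracker produces more errors than there are ground-truth boxes.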

#### Public detections (DPM, FRCNN, SDP)

```
python src/track.py with \
    reid \
    tracker_cfg.public_detections=min_iou_0_5 \
    obj_detect_checkpoint_file=models/mot17_deformable_multi_frame/checkpoint_epoch_50.pth
```

| MOT17     | MOTA         | IDF1           |       MT     |     ML     |     FP       |     FN              |  ID SW.      |
|  :---:    | :---:        |     :---:      |    :---:     | :---:      |    :---:     |   :---:             |  :---:       |
| **Train** |     64.6     |     63.7       |    621       | 675        |     4827     |     111958          |  2556        |
| **Test**  |     62.3     |     57.6       |    688       | 638        |     16591    |     192123          |  4018        |

### MOT20

#### Private detections

```
python src/track.py with \
    reid \
    dataset_name=MOT20-ALL \
    obj_detect_checkpoint_file=models/mot20_crowdhuman_deformable_multi_frame/checkpoint_epoch_50.pth
```

| MOT20     | MOTA         | IDF1           |       MT     |     ML     |     FP       |     FN              |  ID SW.      |
|  :---:    | :---:        |     :---:      |    :---:     | :---:      |    :---:     |   :---:             |  :---:       |
| **Train** |     81.0     |     73.3       |    1540      | 124        |     20807    |     192665          |  1961        |
| **Test**  |     68.6     |     65.7       |     666      | 181        |     20348    |     140373          |  1532        |

### MOTS20

```
python src/track.py with \
    dataset_name=MOTS20-ALL \
    obj_detect_checkpoint_file=models/mots20_train_masks/checkpoint.pth
```

Our tracking script only applies MOT17 metrics evaluation but outputs MOTS20 mask prediction files. To evaluate these, download the official [MOTChallengeEvalKit](https://github.com/dendorferpatrick/MOTChallengeEvalKit).

| MOTS20    | sMOTSA         | IDF1           |       FP     |     FN     |     IDs      |
|  :---:    | :---:          |     :---:      |    :---:     | :---:      |    :---:     |
| **Train** |     --         |     --         |    --        |   --       |     --       |
| **Test**  |     54.9       |     63.6       |    2233      | 7195       |     278      |

### Demo

To facilitate the application of TrackFormer, we provide a demo interface that quickly processes a given video sequence. The first command below extracts the video frames as images with `ffmpeg`; the second runs the tracker on them.

```
ffmpeg -i data/snakeboard/snakeboard.mp4 -vf fps=30 data/snakeboard/%06d.png

python src/track.py with \
    dataset_name=DEMO \
    data_root_dir=data/snakeboard \
    output_dir=data/snakeboard \
    write_images=pretty
```

## Publication

If you use this software in your research, please cite our publication:

```
@InProceedings{meinhardt2021trackformer,
    title={TrackFormer: Multi-Object Tracking with Transformers},
    author={Tim Meinhardt and Alexander Kirillov and Laura Leal-Taixe and Christoph Feichtenhofer},
    year={2022},
    month = {June},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
}
```
