{"id":13441753,"url":"https://github.com/visionml/pytracking","last_synced_at":"2025-05-14T17:09:06.259Z","repository":{"id":37664268,"uuid":"179265268","full_name":"visionml/pytracking","owner":"visionml","description":"Visual tracking library based on PyTorch.","archived":false,"fork":false,"pushed_at":"2024-08-08T20:25:00.000Z","size":5083,"stargazers_count":3356,"open_issues_count":78,"forks_count":605,"subscribers_count":82,"default_branch":"master","last_synced_at":"2025-04-11T10:00:08.538Z","etag":null,"topics":["computer-vision","machine-learning","tracking","visual-tracking"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/visionml.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-03T10:17:53.000Z","updated_at":"2025-04-10T14:09:24.000Z","dependencies_parsed_at":"2023-01-30T22:46:48.624Z","dependency_job_id":"6ea0a9b9-3b6b-4012-b314-199ef6446369","html_url":"https://github.com/visionml/pytracking","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visionml%2Fpytracking","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visionml%2Fpytracking/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visionml%2Fpytracking/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/visionml%2Fpytracking/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/visionml","download_url":"https://codeload.github.com/visionml/pytracking/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254190396,"owners_count":22029632,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","machine-learning","tracking","visual-tracking"],"created_at":"2024-07-31T03:01:37.669Z","updated_at":"2025-05-14T17:09:06.239Z","avatar_url":"https://github.com/visionml.png","language":"Python","readme":"# PyTracking\nA general python framework for visual object tracking and video object segmentation, based on **PyTorch**.\n\n### :fire: One tracking paper accepted at WACV 2024! 👇\n* [Beyond SOT: Tracking Multiple Generic Objects at Once](https://arxiv.org/abs/2212.11920) | **Code available!**\n\n\n### :fire: One tracking paper accepted at WACV 2023! 👇\n* [Efficient Visual Tracking with Exemplar Transformers](https://arxiv.org/abs/2112.09686) | **Code available!**\n\n### :fire: One tracking paper accepted at ECCV 2022! 
### [Training Framework: LTR](ltr)

**LTR** (Learning Tracking Representations) is a general framework for training your visual tracking networks. It is equipped with

* All common **training datasets** for visual object tracking and segmentation.
* Functions for data **sampling**, **processing** etc.
* Network **modules** for visual tracking.
* And much more...
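To give a concrete picture of how a training run is launched, here is a minimal sketch. It assumes the `run_training.py` script in the `ltr` directory and that workspace and dataset paths have been filled in `ltr/admin/local.py`; `dimp`/`dimp50` refer to a training module and its settings file under `ltr/train_settings/`, and other trackers follow the same pattern.

```bash
# Launch training for DiMP-50: 'dimp' is the train module and 'dimp50' the
# settings file ltr/train_settings/dimp/dimp50.py. Checkpoints are written
# to the workspace directory configured in ltr/admin/local.py.
cd ltr
python run_training.py dimp dimp50
```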
### [Model Zoo](MODEL_ZOO.md)
The tracker models trained using PyTracking, along with their results on standard tracking
benchmarks, are provided in the [model zoo](MODEL_ZOO.md).
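If you only want to run inference, the pre-trained checkpoints linked in the model zoo can be used directly. A minimal sketch, assuming the default network directory `pytracking/networks` used by the install script; `<MODEL_URL>` is a placeholder for the actual download link listed in [MODEL_ZOO.md](MODEL_ZOO.md), and the file name is only an example.

```bash
# Place a pre-trained checkpoint where the trackers look for it by default.
# Replace <MODEL_URL> with the link from MODEL_ZOO.md.
mkdir -p pytracking/networks
wget -O pytracking/networks/dimp50.pth "<MODEL_URL>"
```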
## Trackers
The toolkit contains the implementation of the following trackers.

### TaMOs (WACV 2024)

**[[Paper]](https://arxiv.org/abs/2212.11920) [[Raw results]](MODEL_ZOO.md#Raw-Results-1)
[[Models]](MODEL_ZOO.md#Models-1) [[Training Code]](./ltr/README.md#TaMOs)  [[Tracker Code]](./pytracking/README.md#TaMOs)**

Official implementation of **TaMOs**. TaMOs is the first generic object tracker to tackle the problem of tracking multiple
generic objects at once. It uses a shared model predictor consisting of a Transformer in order to produce multiple
target models (one for each specified target). It achieves sub-linear run-time when tracking multiple objects and
outperforms existing single object trackers when running one instance for each target separately.
TaMOs serves as the baseline tracker for the new large-scale generic object tracking benchmark LaGOT (see [here](https://github.com/google-research-datasets/LaGOT)),
which contains multiple annotated target objects per sequence.

![TaMOs_teaser_figure](pytracking/.figs/TaMOs_overview.png)

### RTS (ECCV 2022)

**[[Paper]](https://arxiv.org/abs/2203.11191) [[Raw results]](MODEL_ZOO.md#Raw-Results-1)
[[Models]](MODEL_ZOO.md#Models-1) [[Training Code]](./ltr/README.md#RTS)  [[Tracker Code]](./pytracking/README.md#RTS)**

Official implementation of **RTS**. RTS is a robust, end-to-end trainable, segmentation-centric pipeline that internally
works with segmentation masks instead of bounding boxes. Thus, it can learn a better target representation that clearly
differentiates the target from the background. To achieve the necessary robustness for challenging tracking scenarios,
a separate instance localization component is used to condition the segmentation decoder when producing the output mask.

![RTS_teaser_figure](pytracking/.figs/rts_overview.png)

### ToMP (CVPR 2022)

**[[Paper]](https://arxiv.org/abs/2203.11192) [[Raw results]](MODEL_ZOO.md#Raw-Results-1)
  [[Models]](MODEL_ZOO.md#Models-1) [[Training Code]](./ltr/README.md#ToMP)  [[Tracker Code]](./pytracking/README.md#ToMP)**

Official implementation of **ToMP**. ToMP employs a Transformer-based
model prediction module in order to localize the target. The model predictor is further extended to estimate a second set
of weights that are applied for accurate bounding box regression.
The resulting tracker ToMP relies on both training and test frame information in order to predict all weights transductively.

![ToMP_teaser_figure](pytracking/.figs/ToMP_teaser.png)

### KeepTrack (ICCV 2021)

**[[Paper]](https://arxiv.org/abs/2103.16556)  [[Raw results]](MODEL_ZOO.md#Raw-Results-1)
  [[Models]](MODEL_ZOO.md#Models-1)  [[Training Code]](./ltr/README.md#KeepTrack)  [[Tracker Code]](./pytracking/README.md#KeepTrack)**

Official implementation of **KeepTrack**. KeepTrack actively handles distractor objects to
continue tracking the target. It employs a learned target candidate association network that
propagates the identities of all target candidates from frame to frame.
To tackle the lack of ground-truth correspondences between distractor objects in visual tracking,
it uses a training strategy that combines partial annotations with self-supervision.

![KeepTrack_teaser_figure](pytracking/.figs/KeepTrack_teaser.png)

### LWL (ECCV 2020)
**[[Paper]](https://arxiv.org/pdf/2003.11540.pdf)  [[Raw results]](MODEL_ZOO.md#Raw-Results-1)
  [[Models]](MODEL_ZOO.md#Models-1)  [[Training Code]](./ltr/README.md#LWL)  [[Tracker Code]](./pytracking/README.md#LWL)**

Official implementation of the **LWL** tracker. LWL is an end-to-end trainable video object segmentation architecture
which captures the current target object information in a compact parametric
model. It integrates a differentiable few-shot learner module, which predicts the
target model parameters using the first frame annotation. The learner is designed
to explicitly optimize an error between target model prediction and a ground
truth label. LWL further learns the ground-truth labels used by the
few-shot learner to train the target model. All modules in the architecture are trained end-to-end by maximizing segmentation accuracy on annotated VOS videos.

![LWL overview figure](pytracking/.figs/lwtl_overview.png)

### KYS (ECCV 2020)
**[[Paper]](https://arxiv.org/pdf/2003.11014.pdf)  [[Raw results]](MODEL_ZOO.md#Raw-Results)
  [[Models]](MODEL_ZOO.md#Models)  [[Training Code]](./ltr/README.md#KYS)  [[Tracker Code]](./pytracking/README.md#KYS)**

Official implementation of the **KYS** tracker. Unlike conventional frame-by-frame detection based tracking, KYS
propagates valuable scene information through the sequence. This information is used to
achieve an improved scene-aware target prediction in each frame. The scene information is represented using a dense
set of localized state vectors. These state vectors are propagated through the sequence and combined with the appearance
model output to localize the target. The network is trained to effectively utilize the scene information by directly maximizing tracking performance on video segments.

![KYS overview figure](pytracking/.figs/kys_overview.png)

### PrDiMP (CVPR 2020)
**[[Paper]](https://arxiv.org/pdf/2003.12565)  [[Raw results]](MODEL_ZOO.md#Raw-Results)
  [[Models]](MODEL_ZOO.md#Models)  [[Training Code]](./ltr/README.md#PrDiMP)  [[Tracker Code]](./pytracking/README.md#DiMP)**

Official implementation of the **PrDiMP** tracker. This work proposes a general
formulation for probabilistic regression, which is then applied to visual tracking in the DiMP framework.
The network predicts the conditional probability density of the target state given an input image.
The probability density is flexibly parametrized by the neural network itself.
The regression network is trained by directly minimizing the Kullback-Leibler divergence.

### DiMP (ICCV 2019)
**[[Paper]](https://arxiv.org/pdf/1904.07220)  [[Raw results]](MODEL_ZOO.md#Raw-Results)
  [[Models]](MODEL_ZOO.md#Models)  [[Training Code]](./ltr/README.md#DiMP)  [[Tracker Code]](./pytracking/README.md#DiMP)**

Official implementation of the **DiMP** tracker. DiMP is an end-to-end tracking architecture, capable
of fully exploiting both target and background appearance
information for target model prediction. It is based on a target model prediction network, which is derived from a discriminative
learning loss by applying an iterative optimization procedure. The model prediction network employs a steepest descent
based methodology that computes an optimal step length in each iteration to provide fast convergence. The model predictor also
includes an initializer network that efficiently provides an initial estimate of the model weights.

![DiMP overview figure](pytracking/.figs/dimp_overview.png)

### ATOM (CVPR 2019)
**[[Paper]](https://arxiv.org/pdf/1811.07628)  [[Raw results]](MODEL_ZOO.md#Raw-Results)
  [[Models]](MODEL_ZOO.md#Models)  [[Training Code]](./ltr/README.md#ATOM)  [[Tracker Code]](./pytracking/README.md#ATOM)**

Official implementation of the **ATOM** tracker. ATOM is based on
(i) a **target estimation** module that is trained offline, and (ii) a **target classification** module that is
trained online. The target estimation module is trained to predict the intersection-over-union (IoU) overlap
between the target and a bounding box estimate. The target classification module is learned online using dedicated
optimization techniques to discriminate between the target object and background.

![ATOM overview figure](pytracking/.figs/atom_overview.png)

### ECO/UPDT (CVPR 2017/ECCV 2018)
**[[Paper]](https://arxiv.org/pdf/1611.09224.pdf)  [[Models]](https://drive.google.com/open?id=1aWC4waLv_te-BULoy0k-n_zS-ONms21S)  [[Tracker Code]](./pytracking/README.md#ECO)**

An unofficial implementation of the **ECO** tracker. It is implemented based on an extensive and general library for [complex operations](pytracking/libs/complex.py) and [Fourier tools](pytracking/libs/fourier.py). The implementation differs from the version used in the original paper in a few important aspects.
1. This implementation uses features from vgg-m layer 1 and resnet18 residual block 3.
2. As in our later [UPDT tracker](https://arxiv.org/pdf/1804.06833.pdf), separate filters are trained for shallow and deep features, and extensive data augmentation is employed in the first frame.
3. The GMM memory module is not implemented; instead, the raw projected samples are stored.

Please refer to the [official implementation of ECO](https://github.com/martin-danelljan/ECO) if you are looking to reproduce the results in the ECO paper or download the raw results.
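All of the trackers above share the same evaluation interface, so several of them can be run over a benchmark in one go. The following is a rough sketch rather than the official recipe: it assumes the `run_experiment.py` script in the `pytracking` directory and an experiment function defined in a module under `pytracking/experiments/`; the module and function names below are illustrative.

```bash
# Run the experiment defined by the function 'myexperiment' in
# pytracking/experiments/myexperiments.py, which returns the list of
# trackers and the dataset to evaluate them on.
cd pytracking
python run_experiment.py myexperiments myexperiment
```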
## Associated trackers
We list associated trackers that can be found in external repositories.

### E.T.Track (WACV 2023)

**[[Paper]](https://arxiv.org/abs/2112.09686) [[Code]](https://github.com/pblatter/ettrack)**

Official implementation of **E.T.Track**. E.T.Track utilizes our proposed Exemplar Transformer, a transformer module
based on a single instance-level attention layer, for real-time visual object tracking. E.T.Track is up to 8x faster than
other transformer-based models, and consistently outperforms competing lightweight trackers that can operate in real time
on standard CPUs.

![ETTrack_teaser_figure](pytracking/.figs/ETTrack_overview.png)

## Installation

#### Clone the Git repository.
```bash
git clone https://github.com/visionml/pytracking.git
```

#### Clone the submodules.
In the repository directory, run the command:
```bash
git submodule update --init
```

#### Install dependencies
Run the installation script to install all the dependencies. You need to provide the conda install path (e.g. ~/anaconda3) and the name for the created conda environment (here ```pytracking```).
```bash
bash install.sh conda_install_path pytracking
```
This script will also download the default networks and set up the environment.

**Note:** The install script has been tested on an Ubuntu 18.04 system. In case of issues, check the [detailed installation instructions](INSTALL.md).

**Windows:** (NOT Recommended!) Check [these installation instructions](INSTALL_win.md).

#### Let's test it!
Activate the conda environment and run the script pytracking/run_webcam.py to run DiMP using the webcam input.
```bash
conda activate pytracking
cd pytracking
python run_webcam.py dimp dimp50
```

## What's next?

#### [pytracking](pytracking) - for implementing your tracker

#### [ltr](ltr) - for training your tracker

## Contributors

### Main Contributors
* [Martin Danelljan](https://martin-danelljan.github.io/)
* [Goutam Bhat](https://goutamgmb.github.io/)
* [Christoph Mayer](https://2006pmach.github.io/)
* [Matthieu Paul](https://github.com/mattpfr)

### Guest Contributors
* [Felix Järemo-Lawin](https://liu.se/en/employee/felja34) [LWL]

## Acknowledgments
* Thanks for the great [PreciseRoIPooling](https://github.com/vacancy/PreciseRoIPooling) module.
* We use the implementation of the Lovász-Softmax loss from https://github.com/bermanmaxim/LovaszSoftmax.