{"id":13521378,"url":"https://github.com/goutamyg/MVT","last_synced_at":"2025-03-31T20:31:22.215Z","repository":{"id":189977703,"uuid":"681699692","full_name":"goutamyg/MVT","owner":"goutamyg","description":"[BMVC 2023] Mobile Vision Transformer-based Visual Object Tracking","archived":false,"fork":false,"pushed_at":"2024-04-23T21:51:30.000Z","size":23968,"stargazers_count":20,"open_issues_count":5,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-02T05:32:52.272Z","etag":null,"topics":["bmvc","bmvc2023","mobile-vision-transformer","single-object-tracking","vision-transformer","visual-object-tracking","visual-tracking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/goutamyg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-22T15:06:39.000Z","updated_at":"2024-05-09T22:06:29.000Z","dependencies_parsed_at":"2023-08-22T19:13:16.985Z","dependency_job_id":"b359c003-db6a-4a1e-83d6-e8c5f6331a72","html_url":"https://github.com/goutamyg/MVT","commit_stats":null,"previous_names":["goutamyg/mvt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goutamyg%2FMVT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goutamyg%2FMVT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goutamyg%2FMVT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/goutamyg%2FMVT/manifests"
,"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/goutamyg","download_url":"https://codeload.github.com/goutamyg/MVT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246535898,"owners_count":20793346,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bmvc","bmvc2023","mobile-vision-transformer","single-object-tracking","vision-transformer","visual-object-tracking","visual-tracking"],"created_at":"2024-08-01T06:00:33.601Z","updated_at":"2025-03-31T20:31:21.658Z","avatar_url":"https://github.com/goutamyg.png","language":"Python","funding_links":[],"categories":["Papers"],"sub_categories":["BMCV 2023"],"readme":"# [Mobile Vision Transformer-based Visual Object Tracking](https://papers.bmvc2023.org/0800.pdf) [BMVC2023] official implementation\n![MVT_block](assets/MVT.png)\n\n## News\n**`11-03-2024`**: C++ implementation of our tracker is [available now](https://github.com/goutamyg/MVT.cpp/tree/main)\n\n**`10-11-2023`**: ONNX-Runtime and TensorRT-based inference code is released. Now, our ***MVT*** runs at ~70 *fps* on CPU and ~300 *fps* on GPU :zap::zap:. 
Check the [page](https://github.com/goutamyg/MVT/blob/multi_framework_inference/lib/tutorial/MVT_ONNX_TRT_Tutorial.md) for details.\n\n**`14-09-2023`**: The pretrained tracker model is released\n\n**`13-09-2023`**: The paper is available on [arXiv](https://arxiv.org/abs/2309.05829) now\n\n**`22-08-2023`**: The MVT tracker training and inference code is released\n\n**`21-08-2023`**: The paper is accepted at BMVC2023\n\n## Installation\n\nInstall the dependency packages using the environment file `mvt_pyenv.yml`.\n\nGenerate the relevant files:\n```\npython tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir ./output\n```\nAfter running this command, set the dataset paths by editing these files:\n```\nlib/train/admin/local.py  # paths used for training\nlib/test/evaluation/local.py  # paths used for testing\n```\n\n## Training\n\n* Set the training dataset paths in `lib/train/admin/local.py`\n* Place the pretrained backbone model under the `pretrained_models/` folder\n* For data preparation, please refer to the [OSTrack repository](https://github.com/botaoye/OSTrack/tree/main)\n* Uncomment lines `63, 67, and 71` in the [base_backbone.py](https://github.com/goutamyg/MVT/blob/main/lib/models/mobilevit_track/base_backbone.py) file. 
Replace [these lines](https://github.com/goutamyg/MVT/blob/main/lib/test/tracker/mobilevit_track.py#L68-L78) with `self.z_dict1 = template.tensors`.\n* Run\n```\npython tracking/train.py --script mobilevit_track --config mobilevit_256_128x1_got10k_ep100_cosine_annealing --save_dir ./output --mode single\n```\n* Training logs are saved under the `output/logs/` folder\n\n## Pretrained tracker model\nThe pretrained tracker model can be found [here](https://drive.google.com/drive/folders/1RAdn3ZXI_G7pBj4NDbtQVFPkClVd1IBm)\n\n## Tracker Evaluation\n\n* Update the test dataset paths in `lib/test/evaluation/local.py`\n* Place the [pretrained tracker model](https://drive.google.com/drive/folders/1RAdn3ZXI_G7pBj4NDbtQVFPkClVd1IBm) under the `output/checkpoints/` folder\n* Run\n```\npython tracking/test.py --tracker_name mobilevit_track --tracker_param mobilevit_256_128x1_got10k_ep100_cosine_annealing --dataset got10k_test/trackingnet/lasot\n```\n* Set the `DEVICE` variable in the `--tracker_param` file to `cuda` for GPU-based inference or `cpu` for CPU-based inference\n* Raw results are stored under the `output/test/` folder\n\n## Profile tracker model\n* To count the model parameters, run\n```\npython tracking/profile_model.py\n```\n\n## Acknowledgements\n* We use the Separable Self-Attention Transformer implementation and the pretrained `MobileViT` backbone from [ml-cvnets](https://github.com/apple/ml-cvnets). 
Thank you!\n* Our training code is built upon [OSTrack](https://github.com/botaoye/OSTrack) and [PyTracking](https://github.com/visionml/pytracking)\n\n## Citation\nIf our work is useful for your research, please consider citing:\n\n```Bibtex\n@inproceedings{Gopal_2023_BMVC,\nauthor    = {Goutam Yelluru Gopal and Maria Amer},\ntitle     = {Mobile Vision Transformer-based Visual Object Tracking},\nbooktitle = {34th British Machine Vision Conference 2023, {BMVC} 2023, Aberdeen, UK, November 20-24, 2023},\npublisher = {BMVA},\nyear      = {2023},\nurl       = {https://papers.bmvc2023.org/0800.pdf}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoutamyg%2FMVT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoutamyg%2FMVT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoutamyg%2FMVT/lists"}