{"id":13626948,"url":"https://github.com/aimagelab/VKD","last_synced_at":"2025-04-16T19:30:58.032Z","repository":{"id":37643054,"uuid":"277597256","full_name":"aimagelab/VKD","owner":"aimagelab","description":"PyTorch code for ECCV 2020 paper: \"Robust Re-Identification by Multiple Views Knowledge Distillation\"","archived":false,"fork":false,"pushed_at":"2023-10-03T21:27:35.000Z","size":1640,"stargazers_count":73,"open_issues_count":4,"forks_count":14,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-04-11T14:43:24.655Z","etag":null,"topics":["deep-learning","eccv-2020","knowledge-distillation","re-id"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aimagelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-06T16:45:11.000Z","updated_at":"2024-06-19T13:39:59.000Z","dependencies_parsed_at":"2024-01-14T06:07:18.397Z","dependency_job_id":"2ccc353b-f639-4c04-b34a-5975e0d092d0","html_url":"https://github.com/aimagelab/VKD","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2FVKD","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2FVKD/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2FVKD/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aimagelab%2FVKD/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/aimagelab","download_url":"https://codeload.github.com/aimagelab/VKD/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249268547,"owners_count":21240940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","eccv-2020","knowledge-distillation","re-id"],"created_at":"2024-08-01T22:00:27.005Z","updated_at":"2025-04-16T19:30:58.012Z","avatar_url":"https://github.com/aimagelab.png","language":"Python","funding_links":[],"categories":["Recent Papers"],"sub_categories":["**2020**"],"readme":"# Robust Re-Identification by Multiple Views Knowledge Distillation\n\nThis repository contains PyTorch code for the [ECCV20](https://eccv2020.eu/) paper \"Robust Re-Identification by Multiple Views Knowledge Distillation\" [[arXiv](http://arxiv.org/abs/2007.04174)].\n\n![VKD - Overview](images/mvd_framework.png)\n\n```bibtex\n@inproceedings{porrello2020robust,    \n    title={Robust Re-Identification by Multiple Views Knowledge Distillation},\n    author={Porrello, Angelo and Bergamini, Luca and Calderara, Simone},\n    booktitle={European Conference on Computer Vision},\n    pages={93--110},\n    year={2020},\n    organization={Springer}\n}\n```\n\n## Installation Note\n\nTested with Python 3.6.8 on Ubuntu (17.04, 18.04).\n\n- Set up an empty pip environment\n- Install packages using ``pip install -r requirements.txt``\n- Install torch 1.3.1 using ``pip install torch==1.3.1+cu92 torchvision==0.4.2+cu92 -f https://download.pytorch.org/whl/torch_stable.html``\n- Place datasets in ``./datasets/`` (Please 
note that you may need to request some of them from their respective authors)\n- Run scripts from ``commands.txt``\n\nPlease note that if you're running the code from PyCharm (or another IDE), you may need to manually set the working directory to ``PROJECT_PATH``.\n\n## VKD Training (MARS [1])\n\n### Data preparation\n\n- Create the folder ``./datasets/mars``\n- Download the dataset from [here](https://drive.google.com/drive/u/1/folders/0B6tjyrV1YrHeMVV2UFFXQld6X1E)\n- Unzip the data and place the two folders inside the MARS [1] folder created above\n- Download the metadata from [here](https://github.com/liangzheng06/MARS-evaluation/tree/master/info)\n- Place the metadata files in a folder named ``info`` under the same path\n- You should end up with the following structure:\n\n```\nPROJECT_PATH/datasets/mars/\n|-- bbox_train/\n|-- bbox_test/\n|-- info/\n```\n\n### Teacher-Student Training\n\n**First step**: the backbone network is trained for the standard Video-To-Video setting. In this stage, each training example comprises N images drawn from the same tracklet (N=8 by default; you can change it through the argument ``--num_train_images``).\n\n```shell\n# To train ResNet-50 on MARS (teacher, first step) run:\npython ./tools/train_v2v.py mars --backbone resnet50 --num_train_images 8 --p 8 --k 4 --exp_name base_mars_resnet50 --first_milestone 100 --step_milestone 100\n```\n\n**Second step**: we appoint the trained backbone as the teacher and freeze its parameters. Then, we instantiate a new network in the role of the student. We feed N views (i.e. 
images captured from multiple cameras) as input to the teacher and ask the student to mimic the same outputs from fewer frames (M=2 by default; see ``--num_student_images``).\n\n```shell\n# To train a ResVKD-50 (student) run:\npython ./tools/train_distill.py mars ./logs/base_mars_resnet50 --exp_name distill_mars_resnet50 --p 12 --k 4 --step_milestone 150 --num_epochs 500\n```\n\n![](images/mars_all_withstudent.png)\n\n## Model Zoo\n\nWe provide pre-trained checkpoints in two zip files (``baseline.zip`` contains the weights of the teacher networks, ``distilled.zip`` those of the students). To evaluate ResNet-50 and ResVKD-50 on MARS, proceed as follows:\n\n- Download ``baseline.zip`` from [here](https://ailb-web.ing.unimore.it/publicfiles/vkd_checkpoints/baseline.zip) and ``distilled.zip`` from [here](https://ailb-web.ing.unimore.it/publicfiles/vkd_checkpoints/distilled.zip) (~4.8 GB)\n- Unzip the two archives inside the ``PROJECT_PATH/logs`` folder\n- Then, you can evaluate both networks using the ``eval.py`` script:\n\n```sh\npython ./tools/eval.py mars ./logs/baseline_public/mars/base_mars_resnet50 --trinet_chk_name chk_end\n```\n\n```sh\npython ./tools/eval.py mars ./logs/distilled_public/mars/selfdistill/distill_mars_resnet50 --trinet_chk_name chk_di_1\n```\n\nYou should end up with the following results on MARS (see Tab. 1 of the paper for VeRi-776 and Duke-Video-ReID):\n\nBackbone|top1 I2V|mAP I2V|top1 V2V|mAP V2V\n:-:|:-:|:-:|:-:|:-:\n``ResNet-34`` | 80.81 | 70.74 | 86.67 | 78.03 \n``ResVKD-34`` | **82.17** | **73.68** | **87.83** | **79.50**\n``ResNet-50`` | 82.22 | 73.38 | 87.88 | 81.13 \n``ResVKD-50`` | **83.89** | **77.27** | **88.74** | **82.22** \n``ResNet-101`` | 82.78 | 74.94 | 88.59 | 81.66 \n``ResVKD-101`` | **85.91** | **77.64** | **89.60** | **82.65** \n\nBackbone|top1 I2V|mAP I2V|top1 V2V|mAP V2V\n:-:|:-:|:-:|:-:|:-:\n``ResNet-50bam`` | 82.58 | 74.11 | 88.54 | 81.19 \n``ResVKD-50bam`` | **84.34** | **78.13** | **89.39** | **83.07** 
\n\nBackbone|top1 I2V|mAP I2V|top1 V2V|mAP V2V\n:-:|:-:|:-:|:-:|:-:\n``DenseNet-121`` | 82.68 | 74.34 | 89.75 | 81.93 \n``DenseVKD-121`` | **84.04** | **77.09** | **89.80** | **82.84** \n\nBackbone|top1 I2V|mAP I2V|top1 V2V|mAP V2V\n:-:|:-:|:-:|:-:|:-:\n``MobileNet-V2`` | 78.64 | 67.94 | 85.96 | 77.10 \n``MobileVKD-V2`` | **83.33** | **73.95** | **88.13** | **79.62**\n\n## Teacher-Student Explanations\n\nAs discussed in the main paper, we use Grad-CAM [2] to highlight the input regions that the network considers most important for predicting the identity. We ran the same analysis on both the teacher and the student networks: as the figure below shows, the student pays more attention to the subject of interest than its teacher does.\n\n![Model Explanation](images/gradcam.png)\n\nYou can draw the heatmaps with the following command:\n\n```sh\npython -u ./tools/save_heatmaps.py mars \u003cpath-to-teacher-net\u003e --chk_net1 \u003cteacher-checkpoint-name\u003e \u003cpath-to-student-net\u003e --chk_net2 \u003cstudent-checkpoint-name\u003e --dest_path \u003coutput-dir\u003e\n```\n\n## References\n\n1. Zheng, L., Bie, Z., Sun, Y., Wang, J., Su, C., Wang, S., Tian, Q.: MARS: A video benchmark for large-scale person re-identification. In: European Conference on Computer Vision (2016)\n2. Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-CAM: Visual explanations from deep networks via gradient-based localization. In: IEEE International Conference on Computer Vision, pp. 618-626 (2017)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2FVKD","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faimagelab%2FVKD","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faimagelab%2FVKD/lists"}