{"id":13525954,"url":"https://github.com/szagoruyko/attention-transfer","last_synced_at":"2025-05-16T11:05:21.592Z","repository":{"id":47130607,"uuid":"79242355","full_name":"szagoruyko/attention-transfer","owner":"szagoruyko","description":"Improving Convolutional Networks via Attention Transfer (ICLR 2017)","archived":false,"fork":false,"pushed_at":"2018-07-11T11:49:59.000Z","size":454,"stargazers_count":1453,"open_issues_count":13,"forks_count":275,"subscribers_count":50,"default_branch":"master","last_synced_at":"2025-04-09T06:08:27.053Z","etag":null,"topics":["attention","deep-learning","knowledge-distillation","pytorch"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1612.03928","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/szagoruyko.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-01-17T15:38:09.000Z","updated_at":"2025-03-29T10:28:03.000Z","dependencies_parsed_at":"2022-08-29T10:10:15.888Z","dependency_job_id":null,"html_url":"https://github.com/szagoruyko/attention-transfer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szagoruyko%2Fattention-transfer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szagoruyko%2Fattention-transfer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szagoruyko%2Fattention-transfer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/szagoruyko%2Fattention-transfer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/Git
Hub/owners/szagoruyko","download_url":"https://codeload.github.com/szagoruyko/attention-transfer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254518384,"owners_count":22084374,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["attention","deep-learning","knowledge-distillation","pytorch"],"created_at":"2024-08-01T06:01:23.784Z","updated_at":"2025-05-16T11:05:16.584Z","avatar_url":"https://github.com/szagoruyko.png","language":"Jupyter Notebook","funding_links":[],"categories":["Jupyter Notebook","PyTorch","Paper implementations｜论文实现","Paper implementations","Paper Implementations","Knowledge Distillation"],"sub_categories":["Other libraries｜其他库:","Other libraries:"],"readme":"Attention Transfer\n==============\n\nPyTorch code for \"Paying More Attention to Attention: Improving the Performance of\nConvolutional Neural Networks via Attention Transfer\" \u003chttps://arxiv.org/abs/1612.03928\u003e\u003cbr\u003e\nConference paper at ICLR2017: https://openreview.net/forum?id=Sks9_ajex\n\n\u003cimg src=https://cloud.githubusercontent.com/assets/4953728/22037632/04f54a7e-dd09-11e6-9a6b-62133fbc1c29.png width=25%\u003e\u003cimg src=https://cloud.githubusercontent.com/assets/4953728/22037801/d06c526a-dd09-11e6-8986-55c69493a075.png width=75%\u003e\n\n\nWhat's in this repo so far:\n * Activation-based AT code for CIFAR-10 experiments\n * Code for ImageNet experiments (ResNet-18-ResNet-34 student-teacher)\n * Jupyter notebook to visualize attention maps of ResNet-34 
[visualize-attention.ipynb](visualize-attention.ipynb)\n\nComing:\n * grad-based AT\n * Scenes and CUB activation-based AT code\n\nThe code uses PyTorch \u003chttps://pytorch.org\u003e. Note that the original experiments were done\nusing [torch-autograd](https://github.com/twitter/torch-autograd). We have so far validated that the CIFAR-10 experiments are\n*exactly* reproducible in PyTorch, and are in the process of doing so for ImageNet (results are\nvery slightly worse in PyTorch, due to hyperparameters).\n\nbibtex:\n\n```\n@inproceedings{Zagoruyko2017AT,\n    author = {Sergey Zagoruyko and Nikos Komodakis},\n    title = {Paying More Attention to Attention: Improving the Performance of\n             Convolutional Neural Networks via Attention Transfer},\n    booktitle = {ICLR},\n    url = {https://arxiv.org/abs/1612.03928},\n    year = {2017}}\n```\n\n## Requirements\n\nFirst install [PyTorch](https://pytorch.org), then install [torchnet](https://github.com/pytorch/tnt):\n\n```\npip install git+https://github.com/pytorch/tnt.git@master\n```\n\nthen install the other Python packages:\n\n```\npip install -r requirements.txt\n```\n\n## Experiments\n\n### CIFAR-10\n\nThis section describes how to reproduce the results in Table 1 of the paper.\n\nFirst, train the teachers:\n\n```\npython cifar.py --save logs/resnet_40_1_teacher --depth 40 --width 1\npython cifar.py --save logs/resnet_16_2_teacher --depth 16 --width 2\npython cifar.py --save logs/resnet_40_2_teacher --depth 40 --width 2\n```\n\nTo train with activation-based AT:\n\n```\npython cifar.py --save logs/at_16_1_16_2 --teacher_id resnet_16_2_teacher --beta 1e+3\n```\n\nTo train with KD:\n\n```\npython cifar.py --save logs/kd_16_1_16_2 --teacher_id resnet_16_2_teacher --alpha 0.9\n```\n\nWe plan to add AT+KD with decaying `beta` soon, to get the best knowledge transfer results.\n\n### ImageNet\n\n#### Pretrained model\n\nWe provide a ResNet-18 model pretrained with activation-based AT:\n\n| Model | val error (top-1, top-5) 
|\n|:------|:---------:|\n|ResNet-18 | 30.4, 10.8 |\n|ResNet-18-ResNet-34-AT | 29.3, 10.0 |\n\nDownload link: \u003chttps://s3.amazonaws.com/modelzoo-networks/resnet-18-at-export.pth\u003e\n\nModel definition: \u003chttps://github.com/szagoruyko/functional-zoo/blob/master/resnet-18-at-export.ipynb\u003e\n\nConvergence plot:\n\n\u003cimg width=50% src=https://cloud.githubusercontent.com/assets/4953728/25014604/c768572e-2078-11e7-81b5-752124c1b423.png\u003e\n\n#### Train from scratch\n\nDownload pretrained weights for ResNet-34\n(see also [functional-zoo](https://github.com/szagoruyko/functional-zoo) for more\ninformation):\n\n```\nwget https://s3.amazonaws.com/modelzoo-networks/resnet-34-export.pth\n```\n\nPrepare the data following [fb.resnet.torch](https://github.com/facebook/fb.resnet.torch)\nand run training (e.g. using 2 GPUs):\n\n```\npython imagenet.py --imagenetpath ~/ILSVRC2012 --depth 18 --width 1 \\\n                   --teacher_params resnet-34-export.pth --gpu_id 0,1 --ngpu 2 \\\n                   --beta 1e+3\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fszagoruyko%2Fattention-transfer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fszagoruyko%2Fattention-transfer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fszagoruyko%2Fattention-transfer/lists"}