{"id":15029478,"url":"https://github.com/haitongli/knowledge-distillation-pytorch","last_synced_at":"2025-05-15T17:03:34.177Z","repository":{"id":37664497,"uuid":"124605846","full_name":"haitongli/knowledge-distillation-pytorch","owner":"haitongli","description":"A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility","archived":false,"fork":false,"pushed_at":"2023-03-25T00:03:08.000Z","size":23147,"stargazers_count":1851,"open_issues_count":18,"forks_count":344,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-10-29T17:54:35.675Z","etag":null,"topics":["cifar10","computer-vision","dark-knowledge","deep-neural-networks","knowledge-distillation","model-compression","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/haitongli.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2018-03-09T23:58:31.000Z","updated_at":"2024-10-26T06:28:23.000Z","dependencies_parsed_at":"2024-04-08T07:00:16.452Z","dependency_job_id":null,"html_url":"https://github.com/haitongli/knowledge-distillation-pytorch","commit_stats":null,"previous_names":["peterliht/knowledge-distillation-pytorch"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haitongli%2Fknowledge-distillation-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haitongli%2Fknowledge-distillation-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haitongli%2Fknowledge-distillat
ion-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/haitongli%2Fknowledge-distillation-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/haitongli","download_url":"https://codeload.github.com/haitongli/knowledge-distillation-pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254384982,"owners_count":22062422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cifar10","computer-vision","dark-knowledge","deep-neural-networks","knowledge-distillation","model-compression","pytorch"],"created_at":"2024-09-24T20:10:47.028Z","updated_at":"2025-05-15T17:03:34.156Z","avatar_url":"https://github.com/haitongli.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# knowledge-distillation-pytorch\n* Exploring knowledge distillation of DNNs for efficient hardware solutions\n* Author: Haitong Li\n* Framework: PyTorch\n* Dataset: CIFAR-10\n\n\n## Features\n* A framework for exploring \"shallow\" and \"deep\" knowledge distillation (KD) experiments\n* Hyperparameters defined by \"params.json\" universally (avoiding long argparser commands)\n* Hyperparameter searching and result synthesizing (as a table)\n* Progress bar, tensorboard support, and checkpoint saving/loading (utils.py)\n* Pretrained teacher models available for download \n\n\n## Install\n* Clone the repo\n  ```\n  git clone https://github.com/peterliht/knowledge-distillation-pytorch.git\n  ```\n\n* Install the dependencies (including Pytorch)\n  
```\n  pip install -r requirements.txt\n  ```\n\n\n## Organization:\n* ./train.py: main entry point for training/evaluation with or without KD on CIFAR-10\n* ./experiments/: JSON files for each experiment; directory for hypersearch\n* ./model/: teacher and student DNNs, knowledge distillation (KD) loss definition, dataloader \n\n\n## Key notes about usage for your experiments:\n\n* Download the zip file for pretrained teacher model checkpoints from [\"experiments.zip\"](https://drive.google.com/file/d/12slKl4Vc8SbozFlvb-ahoR95F5yCwB_K/view?usp=sharing)\n* Simply move the unzipped subfolders into 'knowledge-distillation-pytorch/experiments/' (replacing the existing ones if necessary; follow the default path naming)\n* Call train.py to start training a 5-layer CNN with ResNet-18's dark knowledge, or training ResNet-18 with knowledge distilled from state-of-the-art deeper models\n* Use search_hyperparams.py for hypersearch\n* Hyperparameters are defined in params.json files universally. Refer to the header of search_hyperparams.py for details\n\n\n## Train (dataset: CIFAR-10)\n\nNote: all the hyperparameters can be found and modified in 'params.json' under 'model_dir'\n\n-- Train a 5-layer CNN with knowledge distilled from a pre-trained ResNet-18 model\n```\npython train.py --model_dir experiments/cnn_distill\n```\n\n-- Train a ResNet-18 model with knowledge distilled from a pre-trained ResNext-29 teacher\n```\npython train.py --model_dir experiments/resnet18_distill/resnext_teacher\n```\n\n-- Hyperparameter search for a specified experiment ('parent_dir/params.json')\n```\npython search_hyperparams.py --parent_dir experiments/cnn_distill_alpha_temp\n```\n\n-- Synthesize results of the recent hypersearch experiments\n```\npython synthesize_results.py --parent_dir experiments/cnn_distill_alpha_temp\n```\n\n\n## Results: \"Shallow\" and \"Deep\" Distillation\n\nQuick takeaways (more details to be added):\n\n* Knowledge distillation provides regularization for both shallow DNNs and state-of-the-art DNNs\n* 
Training with an unlabeled or partial dataset can benefit from the dark knowledge of teacher models\n\n\n-**Knowledge distillation from ResNet-18 to 5-layer CNN**\n\n| Model                   | Dropout = 0.5      |  No Dropout        | \n| :------------------:    | :----------------: | :-----------------:|\n| 5-layer CNN             | 83.51%             |  84.74%            | \n| 5-layer CNN w/ ResNet18 | 84.49%             |  **85.69%**        |\n\n-**Knowledge distillation from deeper models to ResNet-18**\n\n\n|Model                      |  Test Accuracy|\n|:--------:                 |   :---------: |\n|Baseline ResNet-18         | 94.175%       |\n|+ KD WideResNet-28-10      | 94.333%       |\n|+ KD PreResNet-110         | 94.531%       |\n|+ KD DenseNet-100          | 94.729%       |\n|+ KD ResNext-29-8          | **94.788%**   |\n\n\n\n## References\n\nH. Li, \"Exploring knowledge distillation of Deep neural nets for efficient hardware solutions,\" [CS230 Report](http://cs230.stanford.edu/files_winter_2018/projects/6940224.pdf), 2018\n\nHinton, Geoffrey, Oriol Vinyals, and Jeff Dean. \"Distilling the knowledge in a neural network.\" arXiv preprint arXiv:1503.02531 (2015).\n\nRomero, A., Ballas, N., Kahou, S. E., Chassang, A., Gatta, C., \u0026 Bengio, Y. (2014). Fitnets: Hints for thin deep nets. arXiv preprint arXiv:1412.6550.\n\nhttps://github.com/cs230-stanford/cs230-stanford.github.io\n\nhttps://github.com/bearpaw/pytorch-classification\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhaitongli%2Fknowledge-distillation-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhaitongli%2Fknowledge-distillation-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhaitongli%2Fknowledge-distillation-pytorch/lists"}