<h1 align="center">KD-Lib</h1>
<h3 align="center">A PyTorch model compression library containing easy-to-use methods for knowledge distillation, pruning, and quantization</h3>

<div align='center'>

[![Downloads](https://pepy.tech/badge/kd-lib)](https://pepy.tech/project/kd-lib)
[![Tests](https://github.com/SforAiDl/KD_Lib/actions/workflows/python-package-test.yml/badge.svg)](https://github.com/SforAiDl/KD_Lib/actions/workflows/python-package-test.yml)
[![Docs](https://readthedocs.org/projects/kd-lib/badge/?version=latest)](https://kd-lib.readthedocs.io/en/latest/?badge=latest)

**[Documentation](https://kd-lib.readthedocs.io/en/latest/)** | **[Tutorials](https://kd-lib.readthedocs.io/en/latest/usage/tutorials/index.html)**

</div>

## Installation

### From source (recommended)

```shell
git clone https://github.com/SforAiDl/KD_Lib.git
cd KD_Lib
python setup.py install
```

### From PyPI

```shell
pip install KD-Lib
```

## Example usage

To implement the most basic version of knowledge distillation from [Distilling the Knowledge in a Neural Network](https://arxiv.org/abs/1503.02531) and plot loss curves:

```python
import torch
import torch.optim as optim
from torchvision import datasets, transforms
from KD_Lib.KD import VanillaKD

# This part is where you define your datasets, dataloaders, models, and optimizers

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        "mnist_data",
        train=True,
        download=True,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ),
    batch_size=32,
    shuffle=True,
)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        "mnist_data",
        train=False,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ),
    batch_size=32,
    shuffle=True,
)

teacher_model = <your model>
student_model = <your model>

teacher_optimizer = optim.SGD(teacher_model.parameters(), 0.01)
student_optimizer = optim.SGD(student_model.parameters(), 0.01)

# Now, this is where KD_Lib comes into the picture

distiller = VanillaKD(teacher_model, student_model, train_loader, test_loader,
                      teacher_optimizer, student_optimizer)
distiller.train_teacher(epochs=5, plot_losses=True, save_model=True)  # Train the teacher network
distiller.train_student(epochs=5, plot_losses=True, save_model=True)  # Train the student network
distiller.evaluate(teacher=False)                                     # Evaluate the student network
distiller.get_parameters()                                            # A utility function to get the number of
                                                                      # parameters in the teacher and the student network
```

To train a cohort of two models in an online fashion using the framework in [Deep Mutual Learning](https://arxiv.org/abs/1706.00384)
and log training details to TensorBoard:

```python
import torch
import torch.optim as optim
from torchvision import datasets, transforms
from KD_Lib.KD import DML
from KD_Lib.models import ResNet18, ResNet50  # To use models packaged in KD_Lib

# Define your datasets, dataloaders, models, and optimizers

train_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        "mnist_data",
        train=True,
        download=True,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ),
    batch_size=32,
    shuffle=True,
)

test_loader = torch.utils.data.DataLoader(
    datasets.MNIST(
        "mnist_data",
        train=False,
        transform=transforms.Compose(
            [transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))]
        ),
    ),
    batch_size=32,
    shuffle=True,
)

student_params = [4, 4, 4, 4, 4]
student_model_1 = ResNet50(student_params, 1, 10)
student_model_2 = ResNet18(student_params, 1, 10)

student_cohort = [student_model_1, student_model_2]

student_optimizer_1 = optim.SGD(student_model_1.parameters(), 0.01)
student_optimizer_2 = optim.SGD(student_model_2.parameters(), 0.01)

student_optimizers = [student_optimizer_1, student_optimizer_2]

# Now, this is where KD_Lib comes into the picture

distiller = DML(student_cohort, train_loader, test_loader, student_optimizers, log=True, logdir="./logs")

distiller.train_students(epochs=5)
distiller.evaluate()
distiller.get_parameters()
```

## Methods Implemented

Some benchmark results can be found in the [logs](./logs.rst) file.

|  Paper / Method                                           |  Link                            | Repository (KD_Lib/) |
| ----------------------------------------------------------|----------------------------------|----------------------|
| Distilling the Knowledge in a Neural Network              | https://arxiv.org/abs/1503.02531 | KD/vision/vanilla    |
| Improved Knowledge Distillation via Teacher Assistant     | https://arxiv.org/abs/1902.03393 | KD/vision/TAKD       |
| Relational Knowledge Distillation                         | https://arxiv.org/abs/1904.05068 | KD/vision/RKD        |
| Distilling Knowledge from Noisy Teachers                  | https://arxiv.org/abs/1610.09650 | KD/vision/noisy      |
| Paying More Attention To The Attention                    | https://arxiv.org/abs/1612.03928 | KD/vision/attention  |
| Revisit Knowledge Distillation: a Teacher-free <br> Framework  | https://arxiv.org/abs/1909.11723 |KD/vision/teacher_free|
| Mean Teachers are Better Role Models                      | https://arxiv.org/abs/1703.01780 |KD/vision/mean_teacher|
| Knowledge Distillation via Route Constrained <br> Optimization | https://arxiv.org/abs/1904.09149 | KD/vision/RCO        |
| Born Again Neural Networks                                | https://arxiv.org/abs/1805.04770 | KD/vision/BANN       |
| Preparing Lessons: Improve Knowledge Distillation <br> with Better Supervision | https://arxiv.org/abs/1911.07471 | KD/vision/KA |
| Improving Generalization Robustness with Noisy <br> Collaboration in Knowledge Distillation | https://arxiv.org/abs/1910.05057 | KD/vision/noisy|
| Distilling Task-Specific Knowledge from BERT into <br> Simple Neural Networks | https://arxiv.org/abs/1903.12136 | KD/text/BERT2LSTM |
| Deep Mutual Learning                                      | https://arxiv.org/abs/1706.00384 | KD/vision/DML        |
| The Lottery Ticket Hypothesis: Finding Sparse, <br> Trainable Neural Networks | https://arxiv.org/abs/1803.03635 | Pruning/lottery_tickets|
| Regularizing Class-wise Predictions via <br> Self-knowledge Distillation | https://arxiv.org/abs/2003.13964 | KD/vision/CSDK |

<br>

Please cite our pre-print if you find `KD-Lib` useful in any way :)

```bibtex
@misc{shah2020kdlib,
  title={KD-Lib: A PyTorch library for Knowledge Distillation, Pruning and Quantization},
  author={Het Shah and Avishree Khare and Neelay Shah and Khizir Siddiqui},
  year={2020},
  eprint={2011.14691},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```
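For readers curious what the vanilla distillation objective above actually computes, here is a minimal, framework-free sketch of the soft-target loss from the Hinton et al. paper (https://arxiv.org/abs/1503.02531). This is an illustration, not KD_Lib's internal implementation: the temperature `T`, weight `alpha`, and the exact convention for weighting the two terms are assumptions chosen for clarity.

```python
import math


def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.9):
    """Hinton-style knowledge distillation loss for a single example:

        alpha * CE(student, hard label)
      + (1 - alpha) * T^2 * KL(teacher_T || student_T)

    The T^2 factor keeps the soft-target gradient magnitude roughly
    constant as the temperature changes.
    """
    # Hard-label cross-entropy term (computed at T = 1)
    p_student = softmax(student_logits)
    ce = -math.log(p_student[label])

    # Soft-target term: KL divergence between softened distributions
    p_teacher_T = softmax(teacher_logits, T)
    p_student_T = softmax(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher_T, p_student_T))

    return alpha * ce + (1 - alpha) * (T ** 2) * kl
```

When the student already matches the teacher, the KL term vanishes and only the hard-label cross-entropy remains; a confident teacher that disagrees with the student drives the soft-target term up, which is what transfers the "dark knowledge" between the networks.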