{"id":14959053,"url":"https://github.com/sseung0703/ekg","last_synced_at":"2025-10-24T16:30:54.772Z","repository":{"id":50346559,"uuid":"466125752","full_name":"sseung0703/EKG","owner":"sseung0703","description":"Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning","archived":false,"fork":false,"pushed_at":"2022-09-20T15:18:14.000Z","size":44402,"stargazers_count":18,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-31T03:01:39.984Z","etag":null,"topics":["filter-pruning","tensorflow-examples"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sseung0703.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-03-04T13:04:37.000Z","updated_at":"2024-12-26T08:10:48.000Z","dependencies_parsed_at":"2023-01-18T16:15:48.555Z","dependency_job_id":null,"html_url":"https://github.com/sseung0703/EKG","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sseung0703%2FEKG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sseung0703%2FEKG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sseung0703%2FEKG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sseung0703%2FEKG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sseung0703","download_url":"https://codeload.github.com/sseung0703/EKG/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","ki
nd":"github","repositories_count":237999554,"owners_count":19399903,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["filter-pruning","tensorflow-examples"],"created_at":"2024-09-24T13:18:46.351Z","updated_at":"2025-10-24T16:30:49.673Z","avatar_url":"https://github.com/sseung0703.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning\nThis repository is the official TensorFlow implementation of the paper:\n\nEnsemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning [[paper link](https://arxiv.org/abs/2203.02651)] \u003cbr/\u003e\n\nand TensorFlow 2 example code for \u003cbr\u003e\n\u0026nbsp;\u0026nbsp; \"Custom layers\", \"Custom training loop\", \"XLA (JIT)-compiling\", \"Distributed learning\", and \"Gradients accumulator\".\n\n## Paper abstract\nConventional NAS-based pruning algorithms aim to find the sub-network with the best validation performance. However, validation performance does not successfully represent test performance, i.e., potential performance. Also, although fine-tuning the pruned network to restore the performance drop is an inevitable process, few studies have handled this issue. This paper proposes a novel sub-network search and fine-tuning method, i.e., Ensemble Knowledge Guidance (EKG). First, we experimentally prove that the fluctuation of the loss landscape is an effective metric to evaluate the potential performance. 
In order to search a sub-network with the smoothest loss landscape at a low cost, we propose a pseudo-supernet built by an ensemble sub-network knowledge distillation. Next, we propose a novel fine-tuning that re-uses the information of the search phase. We store the interim sub-networks, that is, the by-products of the search phase, and transfer their knowledge into the pruned network. Note that EKG is easy to be plugged-in and computationally efficient. For example, in the case of ResNet-50, about 45\\% of FLOPS is removed without any performance drop in only 315 GPU hours.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/26036843/156765818-05517a8e-498e-4404-9445-acccbf21371d.png\" width=\"900\"\u003e\u003cbr\u003e\n  \u003cb\u003eConceptual visualization of the goal of the proposed method.\u003c/b\u003e  \n\u003c/p\u003e\n\n## Contribution points and key features\n- The smoothness of the loss landscape is presented as a new tool for measuring the potential performance of a sub-network in NAS-based pruning. Experimental evidence is also provided that the loss landscape fluctuation correlates more strongly with test performance than validation performance does.\n- A pseudo-supernet based on ensemble sub-network knowledge distillation is proposed to find a sub-network with a smoother loss landscape without increasing complexity. It allows NAS-based pruning to be applied to any pre-trained network and optimal sub-network(s) to be found more accurately.\n- To our knowledge, this paper is the first to store the information of the search phase in a memory bank and reuse it in the fine-tuning phase of the pruned network. 
The proposed memory bank greatly improves the performance of the pruned network.\n\u003cbr/\u003e\n\n- Supernet-based filter pruning code based on TensorFlow 2 \u003cbr/\u003e\n- Custom layers, e.g., convolution, depthwise convolution, batch normalization (see [`nets/tcl.py`](nets/tcl.py))\n- Custom training loop with XLA (JIT) compiling (see [`op_utils.py`](op_utils.py))\n- Distributed learning (see [`op_utils.py`](op_utils.py) and [`dataloader`](dataloader))\n- Gradient accumulator (see [`op_utils.py`](op_utils.py) and [`utils/accumulator`](https://github.com/sseung0703/EKG/blob/8f980e143d1253e013b9edfaf267b69dc9ba549a/utils.py#L135-L157))\n\n## Requirements\n- TensorFlow \u003e= 2.7 (tested on 2.7-2.8)\n- Pickle\n- tqdm\n\n## How to run\n1. Move to the codebase.\n2. Train and evaluate the model with the commands below.\n```\n  # ResNet-56 on CIFAR10\n  python train_cifar.py --gpu_id 0 --arch ResNet-56 --dataset CIFAR10 --search_target_rate 0.45 --train_path ../test\n  python test.py --gpu_id 0 --arch ResNet-56 --dataset CIFAR10 --trained_param ../test/trained_param.pkl\n```\n\n## Experimental results\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/26036843/178147853-d1b55959-9c28-4fd1-ac28-e01842b0693b.jpg\" width=\"900\"\u003e\u003cbr\u003e\n  \u003cb\u003e(Left) Potential performance vs. validation loss. (Right) Potential performance vs. condition number. 50 sub-networks of ResNet-56 trained on CIFAR10 were used for this experiment. 
\u003c/b\u003e\n\u003c/p\u003e\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/26036843/156874794-7f0d5099-c89a-40ba-953b-27d19fcb6b85.png\" width=\"900\"\u003e\u003cbr\u003e\n  \u003cb\u003eVisualization of loss landscapes of sub-networks searched by various filter importance scoring algorithms.\u003c/b\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cb\u003eComparison with various pruning techniques for the ResNet family trained on ImageNet.\u003c/b\u003e\u003cbr\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/26036843/156767848-7e9291d6-7ee3-42fa-849d-e7ebdd04273e.png\" width=\"600\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://user-images.githubusercontent.com/26036843/178147880-c258e4a1-30b8-4b04-a9a9-b92e5167b320.jpg\" width=\"900\"\u003e\u003cbr\u003e\n  \u003cb\u003ePerformance analysis for ResNet-50 trained on ImageNet-2012. The left plot shows Top-1 accuracy vs. FLOPs reduction rate, and the right plot shows Top-1 accuracy vs. GPU hours.\u003c/b\u003e\n\u003c/p\u003e\n\n## Reference\n```\n@article{lee2022ensemble,\n  title        = {Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning},\n  author       = {Seunghyun Lee and Byung Cheol Song},\n  year         = 2022,\n  journal      = {arXiv preprint arXiv:2203.02651}\n}\n\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsseung0703%2Fekg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsseung0703%2Fekg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsseung0703%2Fekg/lists"}