{"id":19176531,"url":"https://github.com/deeplite/activ-sparse","last_synced_at":"2025-07-09T03:06:59.575Z","repository":{"id":197252569,"uuid":"678928140","full_name":"Deeplite/activ-sparse","owner":"Deeplite","description":"Official PyTorch training code of Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity (ICCV2023-RCV)","archived":false,"fork":false,"pushed_at":"2023-09-29T15:20:58.000Z","size":190,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-23T23:33:56.548Z","etag":null,"topics":["deep-neural-networks","efficient-deep-learning","efficient-inference","low-latency","raspberry-pi","sparsity","tinyml"],"latest_commit_sha":null,"homepage":"https://www.deeplite.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Deeplite.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-08-15T17:55:59.000Z","updated_at":"2024-12-03T03:01:31.000Z","dependencies_parsed_at":"2023-09-29T21:00:45.189Z","dependency_job_id":null,"html_url":"https://github.com/Deeplite/activ-sparse","commit_stats":null,"previous_names":["deeplite/activ-sparse"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deeplite%2Factiv-sparse","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deeplite%2Factiv-sparse/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deeplite%2Factiv-sparse/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Deeplite%2Factiv-sparse/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Deeplite","download_url":"https://codeload.github.com/Deeplite/activ-sparse/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248115350,"owners_count":21050195,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-neural-networks","efficient-deep-learning","efficient-inference","low-latency","raspberry-pi","sparsity","tinyml"],"created_at":"2024-11-09T10:28:56.508Z","updated_at":"2025-04-09T21:34:58.176Z","avatar_url":"https://github.com/Deeplite.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n  \u003ch2 align=\"center\"\u003eAccelerating Deep Neural Networks via Semi-Structured Activation Sparsity\u003c/h3\u003e\n\n  \u003ca href=\"https://arxiv.org/abs/2309.06626\" target=\"_blank\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2307.13901-b31b1b.svg\" alt=\"arXiv\"\u003e\u003c/a\u003e\n\u003c/div\u003e\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"pictures/im2col_intro.png\" width=70%\u003e \u003cbr\u003e\n  Activation sparsity pattern in 
> [Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity](https://arxiv.org/abs/2309.06626)<br>
> [Matteo Grimaldi](https://scholar.google.ca/citations?user=Li60pj4AAAAJ&hl=en), [Darshan C Ganji](https://ca.linkedin.com/in/darshancganji), [Ivan Lazarevich](https://scholar.google.ca/citations?user=dRezqREAAAAJ&hl=en&oi=sra), [Sudhakar Sah](https://scholar.google.ca/citations?user=Ruq8ZywAAAAJ&hl=en&oi=sra)<br>
> Deeplite Inc.<br>
> [Workshop on Resource Efficient Deep Learning for Computer Vision, ICCV 2023](https://iccv2023.thecvf.com/)<br>

<details>
  <summary>
  <font size="+1">Abstract</font>
  </summary>
The demand for efficient processing of deep neural networks (DNNs) on embedded devices is a significant challenge limiting their deployment. Exploiting sparsity in the network's feature maps is one way to reduce inference latency. It is known that unstructured sparsity results in lower accuracy degradation than structured sparsity, but the former requires extensive inference-engine changes to yield latency benefits. To tackle this challenge, we propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications. To attain high speedup levels at inference time, we design a sparse training procedure that is aware of the final position of the activations in the General Matrix Multiplication (GEMM). We extensively evaluate the proposed solution across various models for image classification and object detection tasks. Remarkably, our approach yields a speed improvement of 1.25× with a minimal accuracy drop of 1.1% for the ResNet18 model on the ImageNet dataset. Furthermore, when combined with a state-of-the-art structured pruning method, the resulting models provide a good latency-accuracy trade-off, outperforming models that solely employ structured pruning techniques.
</details>

<br>

## Classification on ImageNet-1K

### ResNet18

* Dense baseline: 70.53 % top-1

| Sparsity | Top-1 | Speedup |
|:-----:|:-----:|:-------:|
| 0.1   | 70.48 % |   1.11 $\times$    |
| 0.2   | 69.42 % |   1.25 $\times$    |
| 0.3   | 67.88 % |   1.42 $\times$    |

### MobileNetV2

* Dense baseline: 72.19 % top-1

| Sparsity | Top-1 | Speedup |
|:-----:|:-----:|:-------:|
| 0.1   | 70.43 % |   1.04 $\times$    |
| 0.2   | 69.94 % |   1.10 $\times$    |
| 0.3   | 67.92 % |   1.20 $\times$    |

## Training Setup

### Prerequisites

```sh
source setup.sh
```

### Single-machine multi-GPU training

We provide an example script for training ResNet18 on the Flowers102 dataset:

```sh
sh examples/run_resnet18_flowers102.sh
```

## Latency Measurement

The latency speedup was measured on a Raspberry Pi 4B device, featuring a quad-core ARM Cortex-A72 processor running at 1.5 GHz with 4 GB of RAM. For deployment, we used the TFLite inference engine built with the XNNPACK delegate, with custom modifications for sparse inference. When `--save_masks` is enabled during training, the binary masks are stored as `.h` header files that can be included in the inference routine.
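As a rough illustration of what such a header could contain, the sketch below writes a binary mask as a C array. This is a hypothetical format: the function name, array layout, and types are assumptions for illustration, not the actual output of `--save_masks`.

```python
# Hypothetical sketch of exporting a binary activation mask to a C header.
# The real format written by --save_masks may differ; everything here is
# an illustrative assumption.
import numpy as np

def save_mask_header(mask: np.ndarray, name: str, path: str) -> None:
    """Write a flattened 0/1 mask as a uint8_t array in a .h file."""
    flat = mask.astype(np.uint8).ravel()
    with open(path, "w") as f:
        f.write("// Auto-generated binary activation mask\n")
        f.write("#include <stdint.h>\n\n")
        f.write(f"static const uint32_t {name}_len = {flat.size};\n")
        f.write(f"static const uint8_t {name}[{flat.size}] = {{")
        f.write(",".join(str(v) for v in flat))
        f.write("};\n")

# Example: export a random 70%-dense mask for a hypothetical layer
save_mask_header(np.random.rand(4, 16) > 0.3, "layer1_mask", "layer1_mask.h")
```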
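For intuition about where the speedup comes from, here is a minimal sketch of the underlying idea: after im2col, each output spatial position becomes one column of the unfolded activation matrix, so zeroing whole columns gives the GEMM kernel contiguous work it can skip. The magnitude-based selection rule below is a simplification for illustration, not the paper's training criterion.

```python
# Minimal sketch (not the repository's training code) of semi-structured
# activation sparsity viewed in im2col space: each sliding-window position
# becomes one column of the unfolded matrix, so dropping whole columns is
# a structure a GEMM kernel can exploit. L2-magnitude ranking is an
# illustrative heuristic only.
import torch
import torch.nn.functional as F

def im2col_column_mask(x, kernel_size, stride, padding, sparsity):
    """Keep the (1 - sparsity) fraction of im2col columns with the
    largest L2 norm; returns a (N, L) binary mask over positions."""
    # Unfold to (N, C*kH*kW, L), where L is the number of output positions
    cols = F.unfold(x, kernel_size, stride=stride, padding=padding)
    scores = cols.norm(dim=1)            # one score per im2col column
    k = int(scores.shape[1] * sparsity)  # number of columns to drop
    if k == 0:
        return torch.ones_like(scores)
    thresh = scores.kthvalue(k, dim=1, keepdim=True).values
    return (scores > thresh).float()

x = torch.randn(1, 3, 8, 8)  # toy activation map
mask = im2col_column_mask(x, kernel_size=3, stride=1, padding=1, sparsity=0.3)
print(mask.shape, mask.mean().item())  # ~70% of GEMM columns kept
```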
## Citation

If our code or models help your work, please cite our paper:

```BibTeX
@article{grimaldi2023accelerating,
  title={Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity},
  author={Grimaldi, Matteo and Ganji, Darshan C and Lazarevich, Ivan and Sah, Sudhakar},
  journal={arXiv preprint arXiv:2309.06626},
  year={2023}
}
```