{"id":9243923,"url":"https://github.com/albanie/convnet-burden","last_synced_at":"2025-08-17T08:33:45.394Z","repository":{"id":42973387,"uuid":"99331147","full_name":"albanie/convnet-burden","owner":"albanie","description":"Memory consumption and FLOP count estimates for convnets","archived":false,"fork":false,"pushed_at":"2019-01-17T11:15:00.000Z","size":2058,"stargazers_count":915,"open_issues_count":5,"forks_count":115,"subscribers_count":33,"default_branch":"master","last_synced_at":"2024-08-25T22:45:42.576Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"MATLAB","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/albanie.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-08-04T10:11:16.000Z","updated_at":"2024-07-26T08:52:30.000Z","dependencies_parsed_at":"2022-07-15T23:18:01.090Z","dependency_job_id":null,"html_url":"https://github.com/albanie/convnet-burden","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albanie%2Fconvnet-burden","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albanie%2Fconvnet-burden/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albanie%2Fconvnet-burden/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/albanie%2Fconvnet-burden/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/albanie","download_url":"https://codeload.github.com/albanie/convnet-burden/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230108747,"owners_count":18174540,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-05-08T00:11:59.223Z","updated_at":"2024-12-17T11:30:35.371Z","avatar_url":"https://github.com/albanie.png","language":"MATLAB","funding_links":[],"categories":["Other Machine Learning Applications"],"sub_categories":["Others"],"readme":"convnet-burden\n---\n\nEstimates of memory consumption and FLOP counts for various convolutional neural networks.\n\n\n### Image Classification Architectures\n\nThe numbers below are given for single element batches. \n\n| model | input size | param mem | feat. mem | flops | src | performance |\n|-------|------------|--------------|----------------|-------|-----|-------------|\n| [alexnet](reports/alexnet.md) | 227 x 227 | 233 MB | 3 MB | 727 MFLOPs | MCN | 41.80 / 19.20 |\n| [caffenet](reports/caffenet.md) | 224 x 224 | 233 MB | 3 MB | 724 MFLOPs | MCN | 42.60 / 19.70 |\n| [squeezenet1-0](reports/squeezenet1-0.md) | 224 x 224 | 5 MB | 30 MB | 837 MFLOPs | PT | 41.90 / 19.58 |\n| [squeezenet1-1](reports/squeezenet1-1.md) | 224 x 224 | 5 MB | 17 MB | 360 MFLOPs | PT | 41.81 / 19.38 |\n| [vgg-f](reports/vgg-f.md) | 224 x 224 | 232 MB | 4 MB | 727 MFLOPs | MCN | 41.40 / 19.10 |\n| [vgg-m](reports/vgg-m.md) | 224 x 224 | 393 MB | 12 MB | 2 GFLOPs | MCN | 36.90 / 15.50 |\n| [vgg-s](reports/vgg-s.md) | 224 x 224 | 393 MB | 12 MB | 3 GFLOPs | MCN | 37.00 / 15.80 |\n| [vgg-m-2048](reports/vgg-m-2048.md) | 224 x 224 | 353 MB | 12 MB | 2 GFLOPs | MCN | 37.10 / 15.80 |\n| [vgg-m-1024](reports/vgg-m-1024.md) | 224 x 224 | 333 MB | 12 MB | 2 GFLOPs | MCN | 37.80 / 16.10 |\n| [vgg-m-128](reports/vgg-m-128.md) | 224 x 224 | 315 MB | 12 MB | 2 GFLOPs | MCN | 40.80 / 18.40 |\n| [vgg-vd-16-atrous](reports/vgg-vd-16-atrous.md) | 224 x 224 | 82 MB | 58 MB | 16 GFLOPs | N/A | - / -  |\n| [vgg-vd-16](reports/vgg-vd-16.md) | 224 x 224 | 528 MB | 58 MB | 16 GFLOPs | MCN | 28.50 / 9.90 |\n| [vgg-vd-19](reports/vgg-vd-19.md) | 224 x 224 | 548 MB | 63 MB | 20 GFLOPs | MCN | 28.70 / 9.90 |\n| [googlenet](reports/googlenet.md) | 224 x 224 | 51 MB | 26 MB | 2 GFLOPs | MCN | 34.20 / 12.90 |\n| [resnet18](reports/resnet18.md) | 224 x 224 | 45 MB | 23 MB | 2 GFLOPs | PT | 30.24 / 10.92 |\n| [resnet34](reports/resnet34.md) | 224 x 224 | 83 MB | 35 MB | 4 GFLOPs | PT | 26.70 / 8.58 |\n| [resnet-50](reports/resnet-50.md) | 224 x 224 | 98 MB | 103 MB | 4 GFLOPs | MCN | 24.60 / 7.70 |\n| [resnet-101](reports/resnet-101.md) | 224 x 224 | 170 MB | 155 MB | 8 GFLOPs | MCN | 23.40 / 7.00 |\n| [resnet-152](reports/resnet-152.md) | 224 x 224 | 230 MB | 219 MB | 11 GFLOPs | MCN | 23.00 / 6.70 |\n| [resnext-50-32x4d](reports/resnext-50-32x4d.md) | 224 x 224 | 96 MB | 132 MB | 4 GFLOPs | L1 | 22.60 / 6.49 |\n| [resnext-101-32x4d](reports/resnext-101-32x4d.md) | 224 x 224 | 169 MB | 197 MB | 8 GFLOPs | L1 | 21.55 / 5.93 |\n| [resnext-101-64x4d](reports/resnext-101-64x4d.md) | 224 x 224 | 319 MB | 273 MB | 16 GFLOPs | PT | 20.81 / 5.66 |\n| [inception-v3](reports/inception-v3.md) | 299 x 299 | 91 MB | 89 MB | 6 GFLOPs | PT | 22.55 / 6.44 |\n| [SE-ResNet-50](reports/SE-ResNet-50.md) | 224 x 224 | 107 MB | 103 MB | 4 GFLOPs | SE | 22.37 / 6.36 |\n| [SE-ResNet-101](reports/SE-ResNet-101.md) | 224 x 224 | 189 MB | 155 MB | 8 GFLOPs | SE | 21.75 / 5.72 |\n| [SE-ResNet-152](reports/SE-ResNet-152.md) | 224 x 224 | 255 MB | 220 MB | 11 GFLOPs | SE | 21.34 / 5.54 |\n| [SE-ResNeXt-50-32x4d](reports/SE-ResNeXt-50-32x4d.md) | 224 x 224 | 105 MB | 132 MB | 4 GFLOPs | SE | 20.97 / 5.54 |\n| [SE-ResNeXt-101-32x4d](reports/SE-ResNeXt-101-32x4d.md) | 224 x 224 | 187 MB | 197 MB | 8 GFLOPs | SE | 19.81 / 4.96 |\n| [SENet](reports/SENet.md) | 224 x 224 | 440 MB | 347 MB | 21 GFLOPs | SE | 18.68 / 4.47 |\n| [SE-BN-Inception](reports/SE-BN-Inception.md) | 224 x 224 | 46 MB | 43 MB | 2 GFLOPs | SE | 23.62 / 7.04 |\n| [densenet121](reports/densenet121.md) | 224 x 224 | 31 MB | 126 MB | 3 GFLOPs | PT | 25.35 / 7.83 |\n| [densenet161](reports/densenet161.md) | 224 x 224 | 110 MB | 235 MB | 8 GFLOPs | PT | 22.35 / 6.20 |\n| [densenet169](reports/densenet169.md) | 224 x 224 | 55 MB | 152 MB | 3 GFLOPs | PT | 24.00 / 7.00 |\n| [densenet201](reports/densenet201.md) | 224 x 224 | 77 MB | 196 MB | 4 GFLOPs | PT | 22.80 / 6.43 |\n| [mcn-mobilenet](reports/mcn-mobilenet.md) | 224 x 224 | 16 MB | 38 MB | 579 MFLOPs | AU | 29.40 / - |\n\n\nClick on the model name for a more detailed breakdown of feature extraction costs at different input image/batch sizes if needed.  The performance numbers are reported as `top-1 error/top-5 error` on the 2012 ILSVRC validation data.  The `src` column indicates the source of the benchmark scores using the following abberviations:\n\n* [MCN](http://www.vlfeat.org/matconvnet/pretrained/) - scores obtained from the matconvnet website.\n* [PT](http://pytorch.org/docs/master/torchvision/models.html) - scores obtained from the PyTorch torchvision module.\n* [L1](https://github.com/albanie/mcnPyTorch/blob/master/benchmarks/cnn_imagenet_pt_mcn.m) - evaluated locally (follow link to view benchmark code).\n* AU - numbers reported by the paper authors.\n\n\nThese numbers provide an estimate of performance, but note that there may be small differences between the evaluation scripts from different sources.\n\n**References:**\n\n* [alexnet](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks) - *Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. \"Imagenet classification with deep convolutional neural networks.\" Advances in neural information processing systems. 2012.*  \n* [squeezenet](https://arxiv.org/abs/1602.07360) - *Iandola, Forrest N., et al. \"SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and\u003c 0.5 MB model size.\" arXiv preprint arXiv:1602.07360 (2016).*\n* [vgg-m](https://arxiv.org/abs/1405.3531) -  *Chatfield, Ken, et al. \"Return of the devil in the details: Delving deep into convolutional nets.\" arXiv preprint arXiv:1405.3531 (2014).*\n* [vgg-vd-16/vgg-vd-19](https://arxiv.org/abs/1409.1556) -  *Simonyan, Karen, and Andrew Zisserman. \"Very deep convolutional networks for large-scale image recognition.\" arXiv preprint arXiv:1409.1556 (2014).*\n* [vgg-vd-16-reduced](https://arxiv.org/abs/1506.04579) - *Liu, Wei, Andrew Rabinovich, and Alexander C. Berg. \"Parsenet: Looking wider to see better.\" arXiv preprint arXiv:1506.04579 (2015)*\n* [googlenet](http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Szegedy_Going_Deeper_With_2015_CVPR_paper.html) - *Szegedy, Christian, et al. \"Going deeper with convolutions.\" Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.*\n* [inception](https://arxiv.org/abs/1512.00567) - *Szegedy, Christian, et al. \"Rethinking the inception architecture for computer vision.\" Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.*\n* [resnet](https://arxiv.org/abs/1512.03385) - *He, Kaiming, et al. \"Deep residual learning for image recognition.\" Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.*\n* [resnext](https://arxiv.org/abs/1611.05431) - *Xie, Saining, et al. \"Aggregated residual transformations for deep neural networks.\" arXiv preprint arXiv:1611.05431 (2016).*\n* [SENets](https://arxiv.org/abs/1709.01507) - *Jie Hu, Li Shen and Gang Sun. \"Squeeze-and-Excitation Networks.\" arXiv preprint arXiv:1709.01507 (2017).*\n* [Densenet](https://arxiv.org/abs/1608.06993) - *Huang, Gao, et al. \"Densely connected convolutional networks.\" CVPR, (2017).*\n\n### Object Detection Architectures\n\n| model | input size | param memory | feature memory | flops |\n|-------|------------|--------------|----------------|-------|\n| [rfcn-res50-pascal](reports/rfcn-res50-pascal.md) | 600 x 850 | 122 MB | 1 GB | 79 GFLOPS|\n| [rfcn-res101-pascal](reports/rfcn-res101-pascal.md) | 600 x 850 | 194 MB | 2 GB | 117 GFLOPS|\n| [ssd-pascal-vggvd-300](reports/ssd-pascal-vggvd-300.md) | 300 x 300 | 100 MB | 116 MB | 31 GFLOPS|\n| [ssd-pascal-vggvd-512](reports/ssd-pascal-vggvd-512.md) | 512 x 512 | 104 MB | 337 MB | 91 GFLOPS|\n| [ssd-pascal-mobilenet-ft](reports/ssd-pascal-mobilenet-ft.md) | 300 x 300 | 22 MB | 37 MB | 1 GFLOPs|\n| [faster-rcnn-vggvd-pascal](reports/faster-rcnn-vggvd-pascal.md) | 600 x 850 | 523 MB | 600 MB | 172 GFLOPS|\n\nThe input sizes used are \"typical\" for each of the architectures listed, but can be varied.  *Anchor/priorbox* generation and *roi/psroi*-pooling are not included in flop estimates.  The *ssd-pascal-mobilenet-ft* detector uses the MobileNet feature extractor (the model used here was imported from the architecture made available by [chuanqi305](https://github.com/chuanqi305/MobileNet-SSD)).\n\n**References:**\n\n* [faster-rcnn](http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks) - *Ren, Shaoqing, et al. \"Faster R-CNN: Towards real-time object detection with region proposal networks.\" Advances in neural information processing systems. 2015..*  \n* [r-fcn](https://arxiv.org/abs/1605.06409) - *Li, Yi, Kaiming He, and Jian Sun. \"R-fcn: Object detection via region-based fully convolutional networks.\" Advances in Neural Information Processing Systems. 2016.*\n* [ssd](https://link.springer.com/chapter/10.1007%2F978-3-319-46448-0_2) - *Liu, Wei, et al. \"Ssd: Single shot multibox detector.\" European conference on computer vision. Springer, Cham, 2016.*\n* [mobilenets](https://arxiv.org/abs/1704.04861) - *Howard, Andrew G., Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, and Hartwig Adam. \"Mobilenets: Efficient convolutional neural networks for mobile vision applications.\" arXiv preprint arXiv:1704.04861 (2017).*\n\n\n### Semantic Segmentation Architectures\n\n| model | input size | param memory | feature memory | flops |\n|-------|------------|--------------|----------------|-------|\n| [pascal-fcn32s](reports/pascal-fcn32s.md) | 384 x 384 | 519 MB | 423 MB | 125 GFLOPS|\n| [pascal-fcn16s](reports/pascal-fcn16s.md) | 384 x 384 | 514 MB | 424 MB | 125 GFLOPS|\n| [pascal-fcn8s](reports/pascal-fcn8s.md) | 384 x 384 | 513 MB | 426 MB | 125 GFLOPS|\n| [deeplab-vggvd-v2](reports/deeplab-vggvd-v2.md) | 513 x 513 | 144 MB | 755 MB | 202 GFLOPs|\n| [deeplab-res101-v2](reports/deeplab-res101-v2.md) | 513 x 513 | 505 MB | 4 GB | 346 GFLOPs|\n\nIn this case, the input sizes are those which are typically taken as input crops during training.  The *deeplab-res101-v2* model uses multi-scale input, with scales `x1, x0.75, x0.5` (computed relative to the given input size).\n\n**References:**\n\n* [pascal-fcn](http://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Long_Fully_Convolutional_Networks_2015_CVPR_paper.html) - *Long, Jonathan, Evan Shelhamer, and Trevor Darrell. \"Fully convolutional networks for semantic segmentation.\" Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015..*\n* [deeplab](https://arxiv.org/abs/1606.00915) - *DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs\nLiang-Chieh Chen^, George Papandreou^, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille (^equal contribution)\nTransactions on Pattern Analysis and Machine Intelligence (TPAMI)*\n\n### Keypoint Detection Architectures\n\n| model | input size | param memory | feature memory | flops |\n|-------|------------|--------------|----------------|-------|\n| [multipose-mpi](reports/multipose-mpi.md) | 368 x 368 | 196 MB | 245 MB | 134 GFLOPS|\n| [multipose-coco](reports/multipose-coco.md) | 368 x 368 | 200 MB | 246 MB | 136 GFLOPS|\n\n**References:**\n\n* [multipose](https://arxiv.org/abs/1611.08050) - *Cao, Zhe, et al. \"Realtime multi-person 2d pose estimation using part affinity fields.\" arXiv preprint arXiv:1611.08050 (2016)..*\n\n\n\u003ch3\u003eNotes and Assumptions\u003c/h3\u003e\n\n\nThe numbers for each architecture should be reasonably framework agnostic. It is assumed that all weights and activations are stored as floats (with 4 bytes per datum) and that all relus are performed in-place.  Feature memory therefore represents an estimate of the total memory consumption of the features computed via a forward pass of the network for a given input, assuming that memory is not re-used (the exception to this is that, as noted above, relus are performed in-place and do not add to the feature memory total).  In practice, many frameworks will clear features from memory when they are no-longer required by the execution path and will therefore require less memory than is noted here.  The feature memory statistic is simply a rough guide as to \"how big\" the activations of the network look.\n\nFused multiply-adds are counted as single operations.  The numbers should be considered to be rough approximations -  modern hardware makes it very difficult to accurately count operations (and even if you could, pipelining etc. means that it is not necessarily a good estimate of inference time).\n\nThe tool for computing the estimates is implemented as a module for the [autonn](https://github.com/vlfeat/autonn) wrapper of matconvnet and is included in this [repo](core/burden.m), so feel free to take a look for extra details.  This module can be installed with the `vl_contrib` package manager (it has two dependencies which can be installed in a similar manner: [autonn](https://github.com/vlfeat/autonn) and [mcnExtraLayers](https://github.com/albanie/mcnExtraLayers)). Matconvnet versions of all of the models can be obtained from either [here](http://www.vlfeat.org/matconvnet/pretrained/) or [here](http://www.robots.ox.ac.uk/~albanie/mcn-models.html).\n\nFor further reading on the topic, the 2017 ICLR submission [An analysis of deep neural network models for practical applications](https://openreview.net/pdf?id=Bygq-H9eg) is interesting.  If you find any issues, or would like to add additional models, add an issue/PR.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbanie%2Fconvnet-burden","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Falbanie%2Fconvnet-burden","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Falbanie%2Fconvnet-burden/lists"}