{"id":13578418,"url":"https://github.com/clovaai/assembled-cnn","last_synced_at":"2025-08-01T17:34:32.802Z","repository":{"id":54856414,"uuid":"234492706","full_name":"clovaai/assembled-cnn","owner":"clovaai","description":"Tensorflow implementation of \"Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network\"","archived":false,"fork":false,"pushed_at":"2021-01-25T03:25:05.000Z","size":3997,"stargazers_count":327,"open_issues_count":0,"forks_count":41,"subscribers_count":20,"default_branch":"master","last_synced_at":"2025-05-20T01:11:15.833Z","etag":null,"topics":["computer-vision","convolutional-neural-networks","deep-learning","food-101","image-classification","imagenet","inference-throughput","mce","robustness","tensorflow","transfer-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/clovaai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-01-17T07:13:51.000Z","updated_at":"2025-03-21T16:09:05.000Z","dependencies_parsed_at":"2022-08-14T04:50:45.913Z","dependency_job_id":null,"html_url":"https://github.com/clovaai/assembled-cnn","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/clovaai/assembled-cnn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clovaai%2Fassembled-cnn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clovaai%2Fassembled-cnn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clovaai%2Fassembled-cnn/releases","manifests_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/clovaai%2Fassembled-cnn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/clovaai","download_url":"https://codeload.github.com/clovaai/assembled-cnn/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clovaai%2Fassembled-cnn/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":268266972,"owners_count":24222774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-01T02:00:08.611Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","convolutional-neural-networks","deep-learning","food-101","image-classification","imagenet","inference-throughput","mce","robustness","tensorflow","transfer-learning"],"created_at":"2024-08-01T15:01:30.455Z","updated_at":"2025-08-01T17:34:32.749Z","avatar_url":"https://github.com/clovaai.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network\n\n## What's New\n\nJuly 11, 2020\n* We reimplemented assemble-resnet with tensorflow 2.1. 
If you want to see the code with better readability, refer to [this branch](https://github.com/clovaai/assembled-cnn/tree/tf2.1).\n\n[paper v2](https://arxiv.org/abs/2001.06268) | [pretrained model](https://drive.google.com/drive/folders/1o8vj8_ZOPByjRKZzRPZMbuoKyxIwd_IZ?usp=sharing) \n\nOfficial Tensorflow implementation  \n\n\u003e [Jungkyu Lee](mailto:jungkyu.lee@navercorp.com), [Taeryun Won](mailto:lory.tail@navercorp.com), [Tae Kwan Lee](mailto:taekwan.lee@navercorp.com), \n\u003e [Hyemin Lee](mailto:hmin.lee@navercorp.com), [Geonmo Gu](mailto:geonmo.gu@navercorp.com), [Kiho Hong](mailto:kiho.hong@navercorp.com)\u003cbr/\u003e\n\u003e @NAVER/LINE Vision\n\n\n**Abstract**\n\n\u003e Recent studies in image classification have demonstrated \na variety of techniques for improving the performance of\nConvolutional Neural Networks (CNNs). However, attempts to combine existing techniques to create a practical model are still uncommon. In this study, we carry out\nextensive experiments to validate that carefully assembling\nthese techniques and applying them to basic CNN models (e.g., ResNet and MobileNet) can improve the accuracy\nand robustness of the models while minimizing the loss of\nthroughput. Our proposed assembled ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78%, mCE\nfrom 76.0% to 48.9% and mFR from 57.7% to 32.3% on\nILSVRC2012 validation set. With these improvements, inference throughput only decreases from 536 to 312. To verify the performance improvement in transfer learning, fine\ngrained classification and image retrieval tasks were tested\non several public datasets and showed that the improvement\nto backbone network performance boosted transfer learning performance significantly. 
Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/summary_architecture.png\" align=\"center\" width=\"1000\" title=\"summary_architecture\"\u003e\n\u003c/p\u003e\n\n\n## Main Results\n\n### Summary of key results\n\n\u003cp align=\"center\"\u003e\n \u003cimg src=\"./figures/summary_table.png\" align=\"center\" width=\"500\" title=\"summary_table\" \u003e\n\u003c/p\u003e\n\n\n### Ablation Study\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/ablation_study_imagenet.png\" align=\"center\" width=\"1000\" title=\"summary_table\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/ablation_study_mobilenet.png\" align=\"center\" width=\"1000\" title=\"summary_table\"\u003e\n\u003c/p\u003e\n\n### Transfer learning\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./figures/FGVC.png\" align=\"center\" width=\"700\" title=\"FGVC\"\u003e\n\u003c/p\u003e\n\n\n## Honor\n\n* Based on our repository, we achieved 1st place in [iFood Competition Fine-Grained Visual Recognition at CVPR 2019](https://www.kaggle.com/c/ifood-2019-fgvc6/leaderboard).\n\n## Related links\n\nThankfully, some people have written testimonials and posts related to our paper.\n\n* [Jeremy Howard's testimonial tweet](https://twitter.com/jeremyphoward/status/1219695492927328256)\n* [Stan Kriventsov's summary post](https://dl.reviews/2020/01/26/compounding-resnet-improvements/)\n* [akira's summary post](https://medium.com/analytics-vidhya/assemble-resnet-that-is-5-times-faster-with-the-same-accuracy-as-efficientnet-b6-autoaugment-c752f1835c38)\n* [norman3 (one of the authors)'s Korean version of the paper](https://norman3.github.io/papers/docs/assembled_cnn)\n\n## Tutorial: Fine-Tuning on Oxford-flower102\n\nAs a first try, you can fine-tune a flower classifier in Colab.\n \n [![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11GJf-frlk_mj30h_cZoRJZ5W6j2lO9HM)\n\n\n## Getting Started\n\n* This work was tested with Tensorflow 1.14.0, CUDA 10.0, and Python 3.6.\n\n### Requirements\n\n```bash\npip install Pillow scikit-learn requests Wand tqdm\n```\n\n\n### Data preparation\n\nWe assume you already have the following data:\n* ImageNet2012 raw images and TFRecords. For this data, please refer to [here](https://github.com/simnalamburt/models/tree/clovaai-assembled-cnn/research/slim#an-automated-script-for-processing-imagenet-data)\n* For knowledge distillation, you need to add the teacher's logits to the TFRecord according to [here](./kd/README.md)\n* For transfer learning datasets, refer to the scripts in [here](./datasets)\n* To download the pretrained models, visit [here](https://drive.google.com/drive/folders/1o8vj8_ZOPByjRKZzRPZMbuoKyxIwd_IZ?usp=sharing)\n* To make the mCE evaluation dataset, refer to [here](./datasets/CE_dataset/README.MD)\n\n### Reproduce Results\n\nFirst, download pretrained models from [here](https://drive.google.com/drive/folders/1o8vj8_ZOPByjRKZzRPZMbuoKyxIwd_IZ?usp=sharing).\n\nFor Assemble-ResNet50, \n\n```bash\nDATA_DIR=/path/to/imagenet2012/tfrecord\nMODEL_DIR=/path/Assemble-ResNet50/checkpoint\nCUDA_VISIBLE_DEVICES=1 python main_classification.py \\\n--eval_only=True \\\n--dataset_name=imagenet \\\n--data_dir=${DATA_DIR} \\\n--model_dir=${MODEL_DIR} \\\n--preprocessing_type=imagenet_224_256 \\\n--resnet_version=2 \\\n--resnet_size=50 \\\n--use_sk_block=True \\\n--use_resnet_d=False \\\n--anti_alias_type=sconv \\\n--anti_alias_filter_size=3 \n```\n\nNote that `use_resnet_d=False`.\nWe implemented BigLittleNet with reference to the official implementation of [BigLittleNet](https://github.com/IBM/BigLittleNet).\nWe found that BigLittleNet's official implementation already includes the concept of resnet-d.\nThat is, in both 
[`resnet_d_projection_shortcut`](https://github.com/clovaai/assembled-cnn/blob/master/nets/resnet_model.py#L123) and [`bl_projection_shortcut`](https://github.com/clovaai/assembled-cnn/blob/master/nets/resnet_model.py#L133), an average\npooling layer with a stride of 2 has been added before the convolution (except that the pooling size is different).\nSo we described it in the paper as D + BL.\nHowever, when using BL, we did not use the tweak that replaces the 7x7 convolution with three 3x3 convolutions (hence `use_resnet_d=False`) because it made training unstable.\nThis is a little tricky. We will explain it further in the v2 version of our paper.\n\nFor Assemble-ResNet152, \n\n```bash\nDATA_DIR=/path/to/imagenet2012/tfrecord\nMODEL_DIR=/path/Assemble-ResNet152/checkpoint\nCUDA_VISIBLE_DEVICES=1 python main_classification.py \\\n--eval_only=True \\\n--dataset_name=imagenet \\\n--data_dir=${DATA_DIR} \\\n--model_dir=${MODEL_DIR} \\\n--preprocessing_type=imagenet_224_256a \\\n--resnet_version=2 \\\n--resnet_size=152 \\\n--bl_alpha=1 \\\n--bl_beta=2 \\\n--use_sk_block=True \\\n--anti_alias_type=sconv \\\n--anti_alias_filter_size=3 \n```\n\nFor Assemble-ResNet152, `preprocessing_type=imagenet_224_256a` (resize the shorter side of each image to 257 pixels while\nthe aspect ratio is maintained. 
Next, we center crop the image to the 256x256 size) performed better.\n\nThe expected final output is:\n\n```\n...\n| accuracy:   0.841860 |\n...\n```\n\n## Training a model from scratch.\n\nFor training parameter information, refer to [here](./nets/hparams_config.py)\n\nTrain vanilla ResNet50 on ImageNet from scratch.\n\n```console\n$ ./scripts/train_vanila_from_scratch.sh\n```\n\nTrain all-assemble ResNet50 on ImageNet from scratch.\n\n```console\n$ ./scripts/train_assemble_from_scratch.sh\n```\n\n## Fine-tuning the model.\n\nIn the previous section, you trained a model from scratch.\nYou can also download a pretrained model to fine-tune from [here](https://drive.google.com/drive/folders/1o8vj8_ZOPByjRKZzRPZMbuoKyxIwd_IZ?usp=sharing).\n\nFine-tune vanilla ResNet50 on Food101.\n\n```console\n$ ./scripts/finetuning_vanila_on_food101.sh\n```\n\nFine-tune all-assemble ResNet50 on Food101.\n\n```console\n$ ./scripts/finetuning_assemble_on_food101.sh\n```\n\n\n## mCE evaluation\n\nYou can calculate mCE on the trained model as follows: \n\n```console\n$ ./eval_assemble_mCE_on_imagenet.sh\n```\n\n \n## Acknowledgements\nThis implementation is based on these repositories:\n* resnet official: https://github.com/tensorflow/models/tree/master/official/r1/resnet\n* mce: https://github.com/hendrycks/robustness\n* autoaugment: https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/autoaugment.py\n\n## Contact\nFeel free to create an issue or contact me if you have any questions (Jungkyu Lee jungkyu.lee@navercorp.com).\n\n## Citation\n\n```\n@misc{lee2020compounding,\n    title={Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network},\n    author={Jungkyu Lee and Taeryun Won and Tae Kwan Lee and Hyemin Lee and Geonmo Gu and Kiho Hong},\n    year={2020},\n    eprint={2001.06268v2},\n    archivePrefix={arXiv},\n    primaryClass={cs.CV}\n}\n```\n\n## License\n\n```\n   Copyright 2020-present NAVER Corp.\n\n   Licensed under the Apache 
License, Version 2.0 (the \"License\");\n   you may not use this file except in compliance with the License.\n   You may obtain a copy of the License at\n\n       http://www.apache.org/licenses/LICENSE-2.0\n\n   Unless required by applicable law or agreed to in writing, software\n   distributed under the License is distributed on an \"AS IS\" BASIS,\n   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n   See the License for the specific language governing permissions and\n   limitations under the License.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclovaai%2Fassembled-cnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclovaai%2Fassembled-cnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclovaai%2Fassembled-cnn/lists"}