{"id":20377559,"url":"https://github.com/reshalfahsi/image-classification-augmentation","last_synced_at":"2025-11-30T21:03:44.408Z","repository":{"id":224916087,"uuid":"764581300","full_name":"reshalfahsi/image-classification-augmentation","owner":"reshalfahsi","description":"Image Classification Using Swin Transformer With RandAugment, CutMix, and MixUp","archived":false,"fork":false,"pushed_at":"2024-02-29T00:59:17.000Z","size":3484,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-15T07:14:43.310Z","etag":null,"topics":["caltech256","cutmix","cutmix-augmentation","cutmix-mixup","image-classification","mixup","mixup-cutmix","pytorch","pytorch-lightning","randaugment","swin-transformer","transfer-learning"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reshalfahsi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-02-28T10:41:56.000Z","updated_at":"2024-02-29T00:56:10.000Z","dependencies_parsed_at":"2024-02-28T11:48:57.831Z","dependency_job_id":"c01d5e48-8b81-4812-98d9-a9cb389608cc","html_url":"https://github.com/reshalfahsi/image-classification-augmentation","commit_stats":null,"previous_names":["reshalfahsi/image-classification-augmentation"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Fimage-classification-augmentation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/reshalfahsi%2Fimage-classification-augmentation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Fimage-classification-augmentation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reshalfahsi%2Fimage-classification-augmentation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reshalfahsi","download_url":"https://codeload.github.com/reshalfahsi/image-classification-augmentation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241928532,"owners_count":20043821,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["caltech256","cutmix","cutmix-augmentation","cutmix-mixup","image-classification","mixup","mixup-cutmix","pytorch","pytorch-lightning","randaugment","swin-transformer","transfer-learning"],"created_at":"2024-11-15T01:45:37.401Z","updated_at":"2025-11-30T21:03:39.378Z","avatar_url":"https://github.com/reshalfahsi.png","language":"Jupyter Notebook","readme":"# Image Classification Using Swin Transformer With RandAugment, CutMix, and MixUp\n\n\n \u003cdiv align=\"center\"\u003e\n    \u003ca href=\"https://colab.research.google.com/github/reshalfahsi/image-classification-augmentation/blob/master/Image_Classification_Using_Swin_Transformer_With_RandAugment_CutMix_and_MixUp.ipynb\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"colab\"\u003e\u003c/a\u003e\n    \u003cbr /\u003e\n \u003c/div\u003e\n\n\nIn this project, we will explore three distinct Swin 
Transformer configurations: without augmentation, with augmentation, and without pre-trained weights (i.e., trained from scratch). The augmentation is performed with RandAugment, CutMix, and MixUp. We examine the effect of augmentation and pre-trained weights (transfer learning) on the imbalanced Caltech-256 dataset. The dataset is split per category with a ratio of ``81``:``9``:``10`` for the training, validation, and testing sets. For the from-scratch model, each category is truncated to ``100`` instances. Applying augmentation and pre-trained weights clearly boosts the model's performance; in particular, the pre-trained weights dramatically improve both top-1 and top-5 accuracy.\n\n\n## Experiment\n\nCheck out this [notebook](https://github.com/reshalfahsi/image-classification-augmentation/blob/master/Image_Classification_Using_Swin_Transformer_With_RandAugment_CutMix_and_MixUp.ipynb) to see the full implementation.\n\n\n## Result\n\n## Quantitative Result\n\nThe table below quantitatively compares the three Swin Transformer models: without augmentation, with augmentation, and from scratch.\n\nModel | Loss | Top-1 Acc. | Top-5 Acc. |\n------------ | ------------- | ------------- | ------------- |\nNo Augmentation | 0.369 | 90.17% | 97.68% |\nAugmentation | 0.347 | 91.57% | 98.75% |\nFrom Scratch | 4.544 | 11.58% | 27.09% |\n\n\n## Validation Accuracy and Loss Curve\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/image-classification-augmentation/blob/master/assets/val_acc_curve.png\" alt=\"acc_curve\" \u003e \u003cbr /\u003e Accuracy curves of the models on the validation set. 
\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/image-classification-augmentation/blob/master/assets/val_loss_curve.png\" alt=\"loss_curve\" \u003e \u003cbr /\u003e Loss curves of the models on the validation set. \u003c/p\u003e\n\n\n## Qualitative Result\n\nThe following collated images illustrate the prediction quality of the three models.\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/image-classification-augmentation/blob/master/assets/no_aug_qualitative.png\" alt=\"no_aug_qualitative\" \u003e \u003cbr /\u003e The prediction result of Swin Transformer without augmentation. \u003c/p\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/image-classification-augmentation/blob/master/assets/aug_qualitative.png\" alt=\"aug_qualitative\" \u003e \u003cbr /\u003e The prediction result of Swin Transformer with augmentation. \u003c/p\u003e\n\n\u003cp align=\"center\"\u003e \u003cimg src=\"https://github.com/reshalfahsi/image-classification-augmentation/blob/master/assets/scratch_qualitative.png\" alt=\"scratch_qualitative\" \u003e \u003cbr /\u003e The prediction result of Swin Transformer trained from scratch (no pre-trained weights). 
\u003c/p\u003e\n\n\n## Credit\n\n- [Swin Transformer: Hierarchical Vision Transformer using Shifted Windows](https://arxiv.org/pdf/2103.14030.pdf)\n- [TorchVision's Swin Transformer](https://github.com/pytorch/vision/blob/main/torchvision/models/swin_transformer.py)\n- [Image classification with Swin Transformers](https://keras.io/examples/vision/swin_transformers/)\n- [Caltech-256 Object Category Dataset](https://authors.library.caltech.edu/records/5sv1j-ytw97)\n- [TorchVision's Caltech256 Dataset](https://github.com/pytorch/vision/blob/main/torchvision/datasets/caltech.py)\n- [RandAugment: Practical automated data augmentation with a reduced search space](https://arxiv.org/pdf/1909.13719.pdf)\n- [RandAugment for Image Classification for Improved Robustness](https://keras.io/examples/vision/randaugment/)\n- [CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features](https://arxiv.org/pdf/1905.04899.pdf)\n- [CutMix data augmentation for image classification](https://keras.io/examples/vision/cutmix/)\n- [mixup: Beyond Empirical Risk Minimization](https://arxiv.org/pdf/1710.09412.pdf)\n- [MixUp augmentation for image classification](https://keras.io/examples/vision/mixup/)\n- [Multi-head or Single-head? 
An Empirical Comparison for Transformer Training](https://arxiv.org/pdf/2106.09650.pdf)\n- [Getting 95% Accuracy on the Caltech101 Dataset using Deep Learning](https://debuggercafe.com/getting-95-accuracy-on-the-caltech101-dataset-using-deep-learning/)\n- [How to use CutMix and MixUp](https://pytorch.org/vision/main/auto_examples/transforms/plot_cutmix_mixup.html)\n- [PyTorch Lightning](https://lightning.ai/docs/pytorch/latest/)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshalfahsi%2Fimage-classification-augmentation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freshalfahsi%2Fimage-classification-augmentation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freshalfahsi%2Fimage-classification-augmentation/lists"}
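The README above describes augmenting batches with CutMix and MixUp (the repository's own references point to torchvision's built-in `v2.CutMix`/`v2.MixUp` transforms, which mix pairs of samples and their labels). As an illustrative sketch only, not the repository's actual implementation, the underlying math can be written in plain Python: MixUp linearly blends two samples and their one-hot labels with a Beta-distributed weight, while CutMix pastes a rectangle from one image into another and sets the label weight to the fraction of pixels kept. The helper names `mixup` and `cutmix_bbox` are hypothetical.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2):
    """Blend two flattened samples and their one-hot labels (MixUp)."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def cutmix_bbox(height, width, alpha=1.0):
    """Sample a CutMix rectangle; returns (x1, y1, x2, y2, adjusted_lambda).

    The rectangle's target area fraction is 1 - lambda, so its side ratio
    is sqrt(1 - lambda); lambda is then re-adjusted to the exact fraction
    of pixels kept from the first image after clipping to the borders.
    """
    lam = random.betavariate(alpha, alpha)
    cut_ratio = (1.0 - lam) ** 0.5
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # Random box center, box clipped to the image borders.
    cy, cx = random.randrange(height), random.randrange(width)
    y1, y2 = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    x1, x2 = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    lam = 1.0 - (x2 - x1) * (y2 - y1) / (height * width)
    return x1, y1, x2, y2, lam
```

In practice the mixed one-hot labels are fed to a soft-target cross-entropy loss; the blending weight `lam + (1 - lam)` always sums to 1, so the mixed label remains a valid probability distribution.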