{"id":13577147,"url":"https://github.com/chou141253/FGVC-PIM","last_synced_at":"2025-04-05T11:31:27.998Z","repository":{"id":38009599,"uuid":"457212708","full_name":"chou141253/FGVC-PIM","owner":"chou141253","description":"Pytorch implementation for \"A Novel Plug-in Module for Fine-Grained Visual Classification\". fine-grained visual classification task.","archived":false,"fork":false,"pushed_at":"2023-04-01T08:53:45.000Z","size":1477,"stargazers_count":188,"open_issues_count":22,"forks_count":39,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-11-05T14:43:41.296Z","etag":null,"topics":["efficientnet","fgvc","fine-grained-visual-categorization","resnet","swin-transformer","vision-transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/chou141253.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2022-02-09T04:50:44.000Z","updated_at":"2024-10-28T08:11:05.000Z","dependencies_parsed_at":"2024-01-07T17:11:11.223Z","dependency_job_id":"56d525e6-cb94-43cf-82a0-6ba49b995c81","html_url":"https://github.com/chou141253/FGVC-PIM","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chou141253%2FFGVC-PIM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chou141253%2FFGVC-PIM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/chou141253%2FFGVC-PIM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/c
hou141253%2FFGVC-PIM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/chou141253","download_url":"https://codeload.github.com/chou141253/FGVC-PIM/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247330794,"owners_count":20921696,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["efficientnet","fgvc","fine-grained-visual-categorization","resnet","swin-transformer","vision-transformer"],"created_at":"2024-08-01T15:01:18.496Z","updated_at":"2025-04-05T11:31:26.417Z","avatar_url":"https://github.com/chou141253.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\n# A Novel Plug-in Module for Fine-grained Visual Classification\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-novel-plug-in-module-for-fine-grained-1/fine-grained-image-classification-on-cub-200)](https://paperswithcode.com/sota/fine-grained-image-classification-on-cub-200?p=a-novel-plug-in-module-for-fine-grained-1)\n\n[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-novel-plug-in-module-for-fine-grained-1/fine-grained-image-classification-on-nabirds)](https://paperswithcode.com/sota/fine-grained-image-classification-on-nabirds?p=a-novel-plug-in-module-for-fine-grained-1)\n\npaper url: https://arxiv.org/abs/2202.03822 \n\nWe propose a novel plug-in module that can be integrated to many common\nbackbones, including CNN-based or Transformer-based networks to provide strongly discriminative regions. 
The plugin module can output pixel-level feature maps and fuse filtered features to enhance fine-grained visual classification. Experimental results show that the proposed plugin module outperforms state-of-the-art approaches and significantly improves the accuracy to **92.77%** and **92.83%** on CUB200-2011 and NABirds, respectively.\n\n![framework](./imgs/0001.png)\n\n## 1. Environment setting \n\n// The old version has been moved to ./v0/\n\n### 1.0. Package\n* install requirements\n* replace the timm/ folder with our timm/ folder (for ViT or Swin-T)  \n    \n    #### pytorch model implementation [timm](https://github.com/rwightman/pytorch-image-models)\n    #### recommended: [anaconda](https://www.anaconda.com/products/distribution)\n    #### recommended: [weights and biases](https://wandb.ai/site)\n    #### [deepspeed](https://www.deepspeed.ai/getting-started/) // future work\n\n### 1.1. Dataset\nIn this paper, we use two large bird datasets to evaluate performance:\n* [CUB-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)\n* [NA-Birds](https://dl.allaboutbirds.org/nabirds)\n\n### 1.2. Our pretrained model\n\n* our pretrained models: https://idocntnu-my.sharepoint.com/:f:/g/personal/81075001h_eduad_ntnu_edu_tw/EkypiS-W0SFDkxnHN1Imv5oBPgoRblDgW8wHuVA0c6Ka7Q?e=FhBLDC\n* cub200 and nabird datasets: https://idocntnu-my.sharepoint.com/:f:/g/personal/81075001h_eduad_ntnu_edu_tw/EoBb2gijwclEulDGxv_hOtIBeKuV3M6qy3IGIGMhm-jq0g?e=tcg6tm\n* resnet50_miil_21k.pth and vit_base_patch16_224_miil_21k.pth are ImageNet-21k pretrained models (place these files under models/), thanks to https://github.com/Alibaba-MIIL/ImageNet21K/blob/main/MODEL_ZOO.md !!\n\n### 1.3. OS\n- [x] Windows10\n- [x] Ubuntu20.04\n- [x] macOS (CPU only)\n\n## 2. Train\n- [x] Single GPU Training\n- [x] DataParallel (single machine multi-gpus)\n- [ ] DistributedDataParallel\n\n(more information: https://pytorch.org/tutorials/intermediate/ddp_tutorial.html)\n\n### 2.1. 
data\ntrain data and test data structure:  \n```\n├── train/\n│   ├── class1/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   ├── class2/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   └── ....\n└──\n```\n\n### 2.2. configuration\nyou can directly modify the yaml files (in ./configs/)\n\n### 2.3. run\n```\npython main.py --c ./configs/CUB200_SwinT.yaml\n```\nthe model will be saved in ./records/{project_name}/{exp_name}/backup/\n\n\n### 2.4. about custom model\nFor model building, refer to ./models/builder.py   \nMore details in [how_to_build_pim_model.ipynb](./how_to_build_pim_model.ipynb)\n\n### 2.5. multi-gpus\ncomment out main.py line 66\n```\nmodel = torch.nn.DataParallel(model, device_ids=None)\n```\n\n### 2.6.  automatic mixed precision (amp)\nuse_amp: True, training time is about 3 hours.  \nuse_amp: False, training time is about 5 hours.  \n\n## 3. Evaluation\nIf you want to evaluate our pretrained model (or your own model), please provide configs/eval.yaml (a custom yaml file is also fine)\n\n### 3.1. please check yaml\nset yaml (configuration file)\nKey           | Value  | Description | \n--------------|:------|:------------| \ntrain_root    | ~      | setting the value to ~ (null) means this is not in training mode.  |\nval_root  | ../data/eval/  |  path to validation samples |\npretrained  | ./pretrained/best.pt  |   pretrained model path |\n\n\n../data/eval/ folder structure:  \n```\n├── eval/\n│   ├── class1/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   ├── class2/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   └── ....\n└──\n```\n\n### 3.2. run\n```\npython main.py --c ./configs/eval.yaml\n```\nresults will be shown in the terminal and saved in ./records/{project_name}/{exp_name}/eval_results.txt\n\n## 4. 
HeatMap\n```\npython heat.py --c ./configs/CUB200_SwinT.yaml --img ./vis/001.jpg --save_img ./vis/001/\n```\n![visualization](./vis/001/rbg_img.jpg)\n![visualization2](./vis/001/mix.jpg)\n\n## 5. Infer\nIf you want to run inference on your pictures and get the confusion matrix, please provide configs/eval.yaml (a custom yaml file is also fine)\n\n\n### 5.1. please check yaml\nset yaml (configuration file)\nKey           | Value  | Description | \n--------------|:------|:------------| \ntrain_root    | ~      | setting the value to ~ (null) means this is not in training mode.  |\nval_root  | ../data/eval/  |  path to validation samples |\npretrained  | ./pretrained/best.pt  |   pretrained model path |\n\n\n../data/eval/ folder structure:  \n```\n├── eval/\n│   ├── class1/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   ├── class2/\n│   |   ├── img001.jpg\n│   |   ├── img002.jpg\n│   |   └── ....\n│   └── ....\n└──\n```\n\n### 5.2. run\n```\npython infer.py --c ./configs/eval.yaml\n```\nresults will be shown in the terminal and saved in ./records/{project_name}/{exp_name}/infer_results.txt\n\n- - - - - - \n\n### Acknowledgment\n\n* Thanks to [timm](https://github.com/rwightman/pytorch-image-models) for the Pytorch implementation.\n\n* This work was financially supported by the National Taiwan Normal University (NTNU) within the framework of the Higher Education Sprout Project by the Ministry of Education (MOE) in Taiwan, sponsored by the Ministry of Science and Technology, Taiwan, R.O.C. under Grant no. MOST 110-2221-E-003-026, 110-2634-F-003-007, and 110-2634-F-003-006. 
In addition, we thank the National Center for High-performance Computing (NCHC) for providing computational and storage resources.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchou141253%2FFGVC-PIM","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchou141253%2FFGVC-PIM","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchou141253%2FFGVC-PIM/lists"}