{"id":21606107,"url":"https://github.com/bwconrad/vit-finetune","last_synced_at":"2025-06-21T23:02:22.304Z","repository":{"id":114901805,"uuid":"532769222","full_name":"bwconrad/vit-finetune","owner":"bwconrad","description":"Fine-tuning Vision Transformers on various classification datasets","archived":false,"fork":false,"pushed_at":"2024-08-31T20:04:06.000Z","size":65,"stargazers_count":107,"open_issues_count":0,"forks_count":17,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-09T04:41:21.576Z","etag":null,"topics":["deep-learning","huggingface","image-classification","pytorch","pytorch-lightning","vision-transformer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bwconrad.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-05T06:21:24.000Z","updated_at":"2025-03-27T17:32:44.000Z","dependencies_parsed_at":"2023-04-13T14:31:37.062Z","dependency_job_id":"36e57c44-0ce6-41a6-8333-58ec77463174","html_url":"https://github.com/bwconrad/vit-finetune","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bwconrad/vit-finetune","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bwconrad%2Fvit-finetune","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bwconrad%2Fvit-finetune/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bwconrad%2Fvit-finetune/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/Gi
tHub/repositories/bwconrad%2Fvit-finetune/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bwconrad","download_url":"https://codeload.github.com/bwconrad/vit-finetune/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bwconrad%2Fvit-finetune/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261206064,"owners_count":23124832,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","huggingface","image-classification","pytorch","pytorch-lightning","vision-transformer"],"created_at":"2024-11-24T20:19:11.222Z","updated_at":"2025-06-21T23:02:17.287Z","avatar_url":"https://github.com/bwconrad.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Fine-tuning Vision Transformers\nCode for fine-tuning ViT models on various classification datasets. 
Includes options for full model, LoRA and linear fine-tuning procedures.\n\n\n## Available Datasets\n\n| Dataset            | `--data.dataset` |\n|:------------------:|:-----------:|\n|[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html)| `cifar10`|\n|[CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html)| `cifar100`|\n|[Oxford-IIIT Pet Dataset](https://www.robots.ox.ac.uk/~vgg/data/pets/)|  `pets37`|\n|[Oxford Flowers-102](https://www.robots.ox.ac.uk/~vgg/data/flowers/102/)|  `flowers102`|\n|[Food-101](https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/)|  `food101`|\n|[STL-10](https://cs.stanford.edu/~acoates/stl10/)|  `stl10`|\n|[Describable Textures Dataset](https://www.robots.ox.ac.uk/~vgg/data/dtd/) | `dtd`|\n|[Stanford Cars](https://ai.stanford.edu/~jkrause/cars/car_dataset.html) | `cars`|\n|[FGVC Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) | `aircraft`|\n|[Image Folder](https://pytorch.org/vision/stable/generated/torchvision.datasets.ImageFolder.html) | `custom`|\n\n\n## Requirements\n- Python 3.8+\n- `pip install -r requirements.txt`\n\n\n## Usage\n### Training\n- To fine-tune a ViT-B/16 model on CIFAR-100, run:\n```\npython main.py fit --trainer.accelerator gpu --trainer.devices 1 --trainer.precision 16-mixed \\\n--trainer.max_steps 5000 --model.warmup_steps 500 --model.lr 0.01 \\\n--trainer.val_check_interval 500 --data.batch_size 128 --data.dataset cifar100\n```\n- [`configs/`](configs/) contains example configuration files which can be run with:\n```\npython main.py fit --config path/to/config\n```\n- To get a list of all arguments, run `python main.py --help`\n\n#### Training on a Custom Dataset\nTo train on a custom dataset, first organize the images into \n[Image Folder](https://pytorch.org/vision/stable/generated/torchvision.datasets.ImageFolder.html) \nformat. 
Then set `--data.dataset custom`, `--data.root path/to/custom/dataset` and `--data.num_classes \u003cnum-dataset-classes\u003e`.\n\n### Evaluate\nTo evaluate a trained model on its test set, find the path of the saved config file for the checkpoint (e.g. `output/cifar10/version_0/config.yaml`) and run:\n```\npython main.py test --ckpt_path path/to/checkpoint --config path/to/config\n```\n- __Note__: Make sure the `--trainer.precision` argument is set to the same level as used during training.\n\n\n## Results\nAll results are from fine-tuned ViT-B/16 models which were pretrained on ImageNet-21k (`--model.model_name vit-b16-224-in21k`).\n\n#### Full Fine-tuning\n\n| Dataset            | Steps          | Warm Up Steps     | Learning Rate      | Test Accuracy | Config                              | \n|:------------------:|:--------------:|:-----------------:|:------------------:|:-------------:|:-----------------------------------:|\n| CIFAR-10           | 5000           | 500               | 0.01               | 99.00         | [Link](configs/full/cifar10.yaml)   |\n| CIFAR-100          | 5000           | 500               | 0.01               | 92.89         | [Link](configs/full/cifar100.yaml)  |\n| Oxford Flowers-102 | 1000           | 100               | 0.03               | 99.02         | [Link](configs/full/flowers102.yaml)|\n| Oxford-IIIT Pets   | 2000           | 200               | 0.01               | 93.68         | [Link](configs/full/pets37.yaml)    |\n| Food-101           | 5000           | 500               | 0.03               | 90.67         | [Link](configs/full/food101.yaml)   |\n\n#### LoRA\n\n| Dataset            | r  | Alpha | Bias | Steps | Warm Up Steps | Learning Rate | Test Accuracy | Config                                   | \n|:------------------:|:--:|:-----:|:----:|:-----:|:-------------:|:-------------:|:-------------:|:----------------------------------------:|\n| CIFAR-100          | 8  | 8     | None | 5000  | 500           | 0.05    
      | 92.40         | [Link](configs/lora/cifar100-r8.yaml)    |\n| Oxford-IIIT Pets   | 1  | 16    | None | 3000  | 100           | 0.05          | 93.30         | [Link](configs/lora/pets37-r1.yaml)      |\n| Oxford-IIIT Pets   | 8  | 8     | None | 3000  | 100           | 0.05          | 93.79         | [Link](configs/lora/pets37-r8.yaml)      |\n| Oxford-IIIT Pets   | 8  | 8     | All  | 3000  | 300           | 0.05          | 93.76         | [Link](configs/lora/pets37-r8-bias.yaml) |\n\n#### Linear Probe\n\n| Dataset            | Steps          | Warm Up Steps     | Learning Rate      | Test Accuracy | Config                                | \n|:------------------:|:--------------:|:-----------------:|:------------------:|:-------------:|:-------------------------------------:|\n| Oxford Flowers-102 | 2000           | 100               | 1.0                | 99.02         | [Link](configs/linear/flowers102.yaml)|\n| Oxford-IIIT Pets   | 2000           | 100               | 0.5                | 92.64         | [Link](configs/linear/pets37.yaml)    |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbwconrad%2Fvit-finetune","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbwconrad%2Fvit-finetune","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbwconrad%2Fvit-finetune/lists"}