{"id":19382860,"url":"https://github.com/locuslab/convmixer-cifar10","last_synced_at":"2025-04-23T20:32:30.612Z","repository":{"id":96318435,"uuid":"451672906","full_name":"locuslab/convmixer-cifar10","owner":"locuslab","description":"Simple CIFAR-10 classification with ConvMixer","archived":false,"fork":false,"pushed_at":"2022-01-25T00:31:10.000Z","size":12,"stargazers_count":43,"open_issues_count":1,"forks_count":10,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-02T20:11:23.654Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/locuslab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-01-24T23:49:10.000Z","updated_at":"2024-12-27T14:43:30.000Z","dependencies_parsed_at":"2023-03-30T20:48:15.148Z","dependency_job_id":null,"html_url":"https://github.com/locuslab/convmixer-cifar10","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fconvmixer-cifar10","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fconvmixer-cifar10/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fconvmixer-cifar10/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/locuslab%2Fconvmixer-cifar10/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/locuslab","download_url":"https://codeload.githu
b.com/locuslab/convmixer-cifar10/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250509867,"owners_count":21442514,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T09:23:37.656Z","updated_at":"2025-04-23T20:32:30.605Z","avatar_url":"https://github.com/locuslab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Train ConvMixer on CIFAR-10\n-----------------------------------\n ✈️ 🚗 🐦 🐈 🦌 🐕 🐸 🐎 🚢 🚚\n \n\nThis is a simple ConvMixer training script for CIFAR-10. It's probably a good starting point for new experiments on small datasets.\n\nFor training on ImageNet and/or reproducing our original results, see the [main ConvMixer repo](https://github.com/locuslab/convmixer).\n\nYou can get around **92.5%** accuracy in just **25 epochs** by running the script with the following arguments,\nwhich trains a ConvMixer-256/8 with kernel size 5 and patch size 2.\n\n```\npython train.py --lr-max=0.05 --ra-n=2 --ra-m=12 --wd=0.005 --scale=1.0 --jitter=0 --reprob=0\n```\n\nHere's an example of the output of the above command (on a 2080Ti GPU):\n\n```\n[ConvMixer] Epoch: 0  | Train Acc: 0.3938, Test Acc: 0.5418, Time: 43.2, lr: 0.005000\n[ConvMixer] Epoch: 1  | Train Acc: 0.6178, Test Acc: 0.6157, Time: 42.6, lr: 0.010000\n[ConvMixer] Epoch: 2  | Train Acc: 0.7012, Test Acc: 0.7069, Time: 42.6, lr: 0.015000\n[ConvMixer] Epoch: 3  | Train Acc: 0.7383, Test Acc: 0.7708, Time: 42.7, lr: 0.020000\n[ConvMixer] Epoch: 4  | Train Acc: 0.7662, Test Acc: 0.7344, Time: 42.5, 
lr: 0.025000\n[ConvMixer] Epoch: 5  | Train Acc: 0.7751, Test Acc: 0.7655, Time: 42.4, lr: 0.030000\n[ConvMixer] Epoch: 6  | Train Acc: 0.7901, Test Acc: 0.8328, Time: 42.6, lr: 0.035000\n[ConvMixer] Epoch: 7  | Train Acc: 0.7974, Test Acc: 0.7655, Time: 42.4, lr: 0.040000\n[ConvMixer] Epoch: 8  | Train Acc: 0.8040, Test Acc: 0.8138, Time: 42.6, lr: 0.045000\n[ConvMixer] Epoch: 9  | Train Acc: 0.8084, Test Acc: 0.7891, Time: 42.5, lr: 0.050000\n[ConvMixer] Epoch: 10 | Train Acc: 0.8237, Test Acc: 0.8387, Time: 42.8, lr: 0.045250\n[ConvMixer] Epoch: 11 | Train Acc: 0.8373, Test Acc: 0.8312, Time: 42.6, lr: 0.040500\n[ConvMixer] Epoch: 12 | Train Acc: 0.8529, Test Acc: 0.8563, Time: 42.5, lr: 0.035750\n[ConvMixer] Epoch: 13 | Train Acc: 0.8657, Test Acc: 0.8700, Time: 42.7, lr: 0.031000\n[ConvMixer] Epoch: 14 | Train Acc: 0.8751, Test Acc: 0.8527, Time: 42.6, lr: 0.026250\n[ConvMixer] Epoch: 15 | Train Acc: 0.8872, Test Acc: 0.8907, Time: 42.5, lr: 0.021500\n[ConvMixer] Epoch: 16 | Train Acc: 0.8979, Test Acc: 0.9019, Time: 42.7, lr: 0.016750\n[ConvMixer] Epoch: 17 | Train Acc: 0.9080, Test Acc: 0.9068, Time: 42.9, lr: 0.012000\n[ConvMixer] Epoch: 18 | Train Acc: 0.9198, Test Acc: 0.9139, Time: 42.5, lr: 0.007250\n[ConvMixer] Epoch: 19 | Train Acc: 0.9316, Test Acc: 0.9240, Time: 42.6, lr: 0.002500\n[ConvMixer] Epoch: 20 | Train Acc: 0.9383, Test Acc: 0.9238, Time: 42.8, lr: 0.002000\n[ConvMixer] Epoch: 21 | Train Acc: 0.9407, Test Acc: 0.9248, Time: 42.5, lr: 0.001500\n[ConvMixer] Epoch: 22 | Train Acc: 0.9427, Test Acc: 0.9253, Time: 42.6, lr: 0.001000\n[ConvMixer] Epoch: 23 | Train Acc: 0.9445, Test Acc: 0.9255, Time: 42.5, lr: 0.000500\n[ConvMixer] Epoch: 24 | Train Acc: 0.9441, Test Acc: 0.9260, Time: 42.6, lr: 0.000000\n```\n\nBy adding more regularization (data augmentation) and training for four times longer, you can get **more than 94% accuracy**:\n\n```\npython train.py --lr-max=0.05 --ra-n=2 --ra-m=12 --wd=0.005 --scale=1.0 --jitter=0.2 --reprob=0.2 
--epochs=100\n```\n\n\nNote that this script is not intended to perfectly replicate the results in our paper, as PyTorch's built-in data augmentation methods (for RandAugment and Random Erasing in particular) differ slightly from those of the library we used, [pytorch-image-models](https://github.com/rwightman/pytorch-image-models). This script also does not include Mixup/CutMix, as these are not provided by PyTorch (torchvision.transforms) and we wanted to keep it as simple as possible. That said, you can probably get similar results by experimenting with different amounts of regularization with this script.\n\nFeel free to open an issue if you have any questions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocuslab%2Fconvmixer-cifar10","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flocuslab%2Fconvmixer-cifar10","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flocuslab%2Fconvmixer-cifar10/lists"}