{"id":15707845,"url":"https://github.com/ericlbuehler/perceiverio-classifier","last_synced_at":"2025-05-12T19:53:55.817Z","repository":{"id":152609102,"uuid":"454584165","full_name":"EricLBuehler/PerceiverIO-Classifier","owner":"EricLBuehler","description":"A classifier based on PerceiverIO","archived":false,"fork":false,"pushed_at":"2022-05-03T23:14:16.000Z","size":278,"stargazers_count":8,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-01T02:52:54.841Z","etag":null,"topics":["classifier","computer-vision","deep-learning","mnist-classification","perceiver-io","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/EricLBuehler.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-01T23:24:32.000Z","updated_at":"2024-07-01T14:36:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"f24d29ac-58c4-4904-bccc-449918313b80","html_url":"https://github.com/EricLBuehler/PerceiverIO-Classifier","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EricLBuehler%2FPerceiverIO-Classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EricLBuehler%2FPerceiverIO-Classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/EricLBuehler%2FPerceiverIO-Classifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hos
ts/GitHub/repositories/EricLBuehler%2FPerceiverIO-Classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/EricLBuehler","download_url":"https://codeload.github.com/EricLBuehler/PerceiverIO-Classifier/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253812789,"owners_count":21968361,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classifier","computer-vision","deep-learning","mnist-classification","perceiver-io","pytorch"],"created_at":"2024-10-03T20:41:33.127Z","updated_at":"2025-05-12T19:53:55.794Z","avatar_url":"https://github.com/EricLBuehler.png","language":"Jupyter Notebook","readme":"# PerceiverIO Classifier\n\nWorking implementation of a Perceiver IO classifier in PyTorch by Eric Buehler.\n\n## PerceiverIO\nThe PerceiverIO used in this model is lucidrains's implementation, found at https://github.com/lucidrains/perceiver-pytorch.\n\n[Explanation of PerceiverIO paper and architecture](https://www.youtube.com/watch?v=P_xeshTnPZg)\n## Training\nThe model was trained for 50 epochs. I used MSE (Mean Squared Error) loss for the criterion, an Adam optimizer (lr = 0.0001), and an ExponentialLR scheduler with a gamma of 0.9.\nThe image size used is 28x28, with a single color channel (grayscale). 
The batch size for training is 64 and the corresponding test batch size is 1000.\n\nAfter epoch 1 of training on the MNIST dataset, the model achieves 95% accuracy and a test loss of 0.007.\n\nBelow is an example of a test after epoch 1:\n\n![](images/image_epoch_0.png)\n\nAfter training for 50 epochs, the model achieves 98% accuracy and a test loss of 0.002.\n\nBelow is an example of a test after epoch 50:\n\n![](images/image_epoch_49.png)\n\nAccuracy and train/test loss plots are provided below:\n\n![](images/train_loss.png)\n\n![](images/test_loss.png)\n\n![](images/accuracy.png)\n\n## Usage\nThis model is designed to be trained on Google Colab. A GPU is highly recommended; I used a P100.\n\nThe code below will create a \"Colab Notebooks/PercieverIO_Classifier\" directory in your Google Drive.\n```python\nimport os\n\nfrom google.colab import drive  # only available inside Google Colab\n\n# Name of this training run; it gets its own image and model folders\nmodelname=\"2_4_22_m1\"\n\n\ndrive.mount('/content/drive')\n\n# Base directory in Google Drive that holds all training output\nprefix='/content/drive/MyDrive/Colab Notebooks/PercieverIO_Classifier/'\n\ntry:\n    os.mkdir(prefix)\nexcept FileExistsError:\n    pass\n\n# Per-model folder for saved test images\nprefix_images=prefix+'images/'\ntry:\n    os.mkdir(prefix_images)\nexcept FileExistsError:\n    pass\ntry:\n    os.mkdir(prefix_images+modelname+\"/\")\nexcept FileExistsError:\n    pass\n\n# Per-model folder for saved models\nprefix_models=prefix+'models/'+modelname+\"/\"\ntry:\n    os.mkdir(prefix+'models/')\nexcept FileExistsError:\n    pass\ntry:\n    os.mkdir(prefix_models)\nexcept FileExistsError:\n    pass\n\nCPUonly=False\n```\n\nIn the code above, the ```drive.mount``` call mounts your Google Drive and ```prefix``` sets the base directory. Aside from the ```modelname``` and ```CPUonly``` definitions, the rest of the code creates the folders that hold the training progress.\nThe ```modelname``` variable provides simple versioning: each model name gets its own image and model folders.\n\nTo run the training code, open it in Google Colab, set ```modelname``` (and, if needed, ```prefix```) to your own values, and run it. It will train the model for 50 epochs. 
The ```Important variables``` section of the code defines the simpler hyperparameters, such as the image size and the batch sizes. The loss, learning rate, optimizer, and scheduler are all defined in the ```Define criterion, optimizer, and scheduler``` subsection of the ```Autoload/define model and setup criterion, optimizer, and scheduler``` section.\n\n## Figures\nI have also included a program to generate figures. Like the training program, it requires that ```modelname``` and ```prefix``` be set to their appropriate values. It generates and saves the figures for model ```modelname``` in the **prefix**/models/**modelname** directory.\n\n## Environment\nAll code is designed to run on Google Colab. Running the training code on a GPU is highly recommended but not required. The figure generation code does not require a GPU.\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fericlbuehler%2Fperceiverio-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fericlbuehler%2Fperceiverio-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fericlbuehler%2Fperceiverio-classifier/lists"}