# SMASH: One-Shot Model Architecture Search through HyperNetworks
An experimental technique for efficiently exploring neural architectures.

![SMASHGIF](http://i.imgur.com/OTOvstW.gif)

This repository contains code for the SMASH [paper](https://arxiv.org/abs/1708.05344) and [video](https://www.youtube.com/watch?v=79tmPL9AL48).

SMASH bypasses the need for fully training candidate models by learning an auxiliary HyperNet to approximate model weights, allowing for rapid comparison of a wide range of network architectures at the cost of a single training run.

## Installation
To run this script, you will need [PyTorch](http://pytorch.org) and a CUDA-capable GPU. If you wish to run it on CPU, just remove all the .cuda() calls.

Note that this code was written against PyTorch 0.12 and is not guaranteed to work on 0.2 until I get a chance to update my own version. Please also be aware that, while thoroughly commented, this is research code for a heckishly complex project.
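As an aside, a less invasive alternative to hand-deleting every .cuda() call is the device-switch pattern from more recent PyTorch (0.4+). This is only a sketch of that pattern using a hypothetical stand-in module, not code from this repository:

```python
# Sketch only: this repo targets PyTorch 0.12, but in PyTorch 0.4+ the
# .cuda() calls could be replaced by routing everything through one device.
import torch
import torch.nn as nn

# Use the GPU when available, otherwise fall back to CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Hypothetical stand-in for one of the repo's modules.
model = nn.Linear(16, 4).to(device)

# Inputs follow the same device, so the script runs unmodified on CPU or GPU.
x = torch.randn(8, 16).to(device)
y = model(x)
print(y.shape)  # torch.Size([8, 4])
```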
I'll be doing more cleanup work to improve legibility soon.

## Running
To run with default parameters, simply call

```sh
python train.py
```

By default, this will train a SMASH net with nominally the same parametric budget as a WRN-40-4.
Note that validation scores during training are calculated using a random architecture for each batch, and are therefore a sort of "average" measure.

After training, to sample and evaluate SMASH scores, call

```sh
python eval.py --SMASH=YOUR_MODEL_NAME_HERE_.pth
```

By default, this will sample 500 random architectures, then perturb the best-found architecture 100 times, then employ a sort of Markov chain to further perturb the best-found architecture.

To select the best architecture and train the resulting net, call

```sh
python train.py --SMASH=YOUR_MODEL_NAME_HERE_archs.npz
```

By default, this will take the best architecture from the sampled set and train the resulting network.
There are lots of other options, including a number of experimental settings such as architectural gradient descent by proxy, in-op multiplicative gating, variable nonlinearities, and setting specific op configuration types. Take a look at the train_parser in utils.py for details, though note that some of these weirder ones may be deprecated.

This code has boilerplate for loading Imagenet32x32 and ModelNet, but doesn't download or preprocess them on its own.
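The search procedure above (random sampling followed by local perturbation of the incumbent) can be sketched in plain Python. Here `sample_arch`, `perturb`, and `score_arch` are hypothetical stand-ins, not the repo's actual functions; in the real eval.py, scoring runs validation batches with HyperNet-generated weights, and a toy objective is used below just to make the control flow runnable:

```python
import random

# Hypothetical stand-ins for the repo's internals.
def sample_arch(rng):
    """Draw a random architecture encoding (here: 6 integer choices)."""
    return [rng.randint(0, 7) for _ in range(6)]

def perturb(arch, rng):
    """Make a small random change to one architectural choice."""
    new = list(arch)
    new[rng.randrange(len(new))] = rng.randint(0, 7)
    return new

def score_arch(arch):
    """Toy objective standing in for HyperNet-based validation scoring."""
    return -sum((a - 3) ** 2 for a in arch)  # higher is better

rng = random.Random(0)

# Stage 1: sample 500 random architectures and keep the best-scoring one.
best = max((sample_arch(rng) for _ in range(500)), key=score_arch)

# Stage 2: perturb the incumbent 100 times, accepting non-worsening moves.
for _ in range(100):
    cand = perturb(best, rng)
    if score_arch(cand) >= score_arch(best):
        best = cand

print(best, score_arch(best))
```

The final Markov-chain stage in eval.py follows the same shape: keep proposing local perturbations of the incumbent and move when the score doesn't degrade.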
The code also supports model parallelism on a single node and half-precision training, though simple weightnorm is unstable in FP16, so you probably can't train a SMASH network with it.

## Notes
This README is in very early stages and will be updated soon.

## Acknowledgments
- Training and Progress code acquired in a drunken game of SpearPong with Jan Schlüter: https://github.com/Lasagne/Recipes/tree/master/papers/densenet
- Metrics Logging code extracted from the ancient diary of Daniel Maturana: https://github.com/dimatura/voxnet