{"id":13452811,"url":"https://github.com/quark0/darts","last_synced_at":"2025-05-15T02:03:31.655Z","repository":{"id":41067454,"uuid":"138547383","full_name":"quark0/darts","owner":"quark0","description":"Differentiable architecture search for convolutional and recurrent networks","archived":false,"fork":false,"pushed_at":"2021-01-03T02:21:40.000Z","size":4927,"stargazers_count":3949,"open_issues_count":97,"forks_count":840,"subscribers_count":86,"default_branch":"master","last_synced_at":"2025-04-13T23:54:10.099Z","etag":null,"topics":["automl","convolutional-networks","deep-learning","image-classification","language-modeling","neural-architecture-search","pytorch","recurrent-networks"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1806.09055","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quark0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-06-25T05:27:29.000Z","updated_at":"2025-04-13T13:37:04.000Z","dependencies_parsed_at":"2022-08-03T08:30:58.832Z","dependency_job_id":null,"html_url":"https://github.com/quark0/darts","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quark0%2Fdarts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quark0%2Fdarts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quark0%2Fdarts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quark0%2Fdarts/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quark0","down
load_url":"https://codeload.github.com/quark0/darts/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254259369,"owners_count":22040819,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automl","convolutional-networks","deep-learning","image-classification","language-modeling","neural-architecture-search","pytorch","recurrent-networks"],"created_at":"2024-07-31T08:00:23.432Z","updated_at":"2025-05-15T02:03:31.583Z","avatar_url":"https://github.com/quark0.png","language":"Python","funding_links":[],"categories":["Automated Deep Learning","Papers\u0026Codes","Frameworks and libraries","DLA","Neural Architecture Search (NAS)","1.) Neural Architecture Search","Python","Projects"],"sub_categories":["Gradient-based Optimization","DARTS",":snake: Python","**[Papers]**","Distributed Frameworks"],"readme":"# Differentiable Architecture Search\nCode accompanying the paper\n\u003e [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055)\\\n\u003e Hanxiao Liu, Karen Simonyan, Yiming Yang.\\\n\u003e _arXiv:1806.09055_.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/darts.png\" alt=\"darts\" width=\"48%\"\u003e\n\u003c/p\u003e\nThe algorithm is based on continuous relaxation and gradient descent in the architecture space. It is able to efficiently design high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2). 
Only a single GPU is required.\n\n## Requirements\n```\nPython \u003e= 3.5.5, PyTorch == 0.3.1, torchvision == 0.2.0\n```\nNOTE: PyTorch 0.4 is not supported at this moment and would lead to OOM.\n\n## Datasets\nInstructions for acquiring PTB and WT2 can be found [here](https://github.com/salesforce/awd-lstm-lm). While CIFAR-10 can be automatically downloaded by torchvision, ImageNet needs to be manually downloaded (preferably to an SSD) following the instructions [here](https://github.com/pytorch/examples/tree/master/imagenet).\n\n## Pretrained models\nThe easiest way to get started is to evaluate our pretrained DARTS models.\n\n**CIFAR-10** ([cifar10_model.pt](https://drive.google.com/file/d/1Y13i4zKGKgjtWBdC0HWLavjO7wvEiGOc/view?usp=sharing))\n```\ncd cnn \u0026\u0026 python test.py --auxiliary --model_path cifar10_model.pt\n```\n* Expected result: 2.63% test error rate with 3.3M model params.\n\n**PTB** ([ptb_model.pt](https://drive.google.com/file/d/1Mt_o6fZOlG-VDF3Q5ModgnAJ9W6f_av2/view?usp=sharing))\n```\ncd rnn \u0026\u0026 python test.py --model_path ptb_model.pt\n```\n* Expected result: 55.68 test perplexity with 23M model params.\n\n**ImageNet** ([imagenet_model.pt](https://drive.google.com/file/d/1AKr6Y_PoYj7j0Upggyzc26W0RVdg4CVX/view?usp=sharing))\n```\ncd cnn \u0026\u0026 python test_imagenet.py --auxiliary --model_path imagenet_model.pt\n```\n* Expected result: 26.7% top-1 error and 8.7% top-5 error with 4.7M model params.\n\n## Architecture search (using small proxy models)\nTo carry out architecture search using 2nd-order approximation, run\n```\ncd cnn \u0026\u0026 python train_search.py --unrolled     # for conv cells on CIFAR-10\ncd rnn \u0026\u0026 python train_search.py --unrolled     # for recurrent cells on PTB\n```\nNote that the _validation performance in this step does not indicate the final performance of the architecture_. 
One must train the obtained genotype/architecture from scratch using full-sized models, as described in the next section.\n\nAlso be aware that different runs will end up in different local minima. To get the best result, it is crucial to repeat the search process with different seeds and select the best cell(s) based on validation performance (obtained by training the derived cell from scratch for a small number of epochs). Please refer to fig. 3 and sect. 3.2 in our arXiv paper.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"img/progress_convolutional_normal.gif\" alt=\"progress_convolutional_normal\" width=\"29%\"\u003e\n\u003cimg src=\"img/progress_convolutional_reduce.gif\" alt=\"progress_convolutional_reduce\" width=\"35%\"\u003e\n\u003cimg src=\"img/progress_recurrent.gif\" alt=\"progress_recurrent\" width=\"33%\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nFigure: Snapshots of the most likely normal conv, reduction conv, and recurrent cells over time.\n\u003c/p\u003e\n\n## Architecture evaluation (using full-sized models)\nTo evaluate our best cells by training from scratch, run\n```\ncd cnn \u0026\u0026 python train.py --auxiliary --cutout            # CIFAR-10\ncd rnn \u0026\u0026 python train.py                                 # PTB\ncd rnn \u0026\u0026 python train.py --data ../data/wikitext-2 \\\n            --dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7     # WT2\ncd cnn \u0026\u0026 python train_imagenet.py --auxiliary            # ImageNet\n```\nCustomized architectures are supported through the `--arch` flag once specified in `genotypes.py`.\n\nThe CIFAR-10 result at the end of training is subject to variance due to the non-determinism of cuDNN back-prop kernels. _It would be misleading to report the result of only a single run_. 
By training our best cell from scratch, one should expect the average test error of 10 independent runs to fall in the range of 2.76 +/- 0.09% with high probability.\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"img/cifar10.png\" alt=\"cifar10\" width=\"36%\"\u003e\n\u003cimg src=\"img/imagenet.png\" alt=\"imagenet\" width=\"29%\"\u003e\n\u003cimg src=\"img/ptb.png\" alt=\"ptb\" width=\"30%\"\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\nFigure: Expected learning curves on CIFAR-10 (4 runs), ImageNet and PTB.\n\u003c/p\u003e\n\n## Visualization\nThe [graphviz](https://graphviz.readthedocs.io/en/stable/index.html) package is required to visualize the learned cells:\n```\npython visualize.py DARTS\n```\nwhere `DARTS` can be replaced by any customized architecture in `genotypes.py`.\n\n## Citation\nIf you use any part of this code in your research, please cite our [paper](https://arxiv.org/abs/1806.09055):\n```\n@article{liu2018darts,\n  title={DARTS: Differentiable Architecture Search},\n  author={Liu, Hanxiao and Simonyan, Karen and Yang, Yiming},\n  journal={arXiv preprint arXiv:1806.09055},\n  year={2018}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquark0%2Fdarts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquark0%2Fdarts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquark0%2Fdarts/lists"}