# Model Architecture Search Paper List
> A summary of my recently surveyed papers. Some papers on arXiv with unimpressive results are not included.
>
> Last updated 12/31/2017
>
> Zhaozhe Song

## Two early papers in ICLR 2017
**NASNet (Google)**
Zoph B, Le Q V. Neural architecture search with reinforcement learning[J]. arXiv preprint arXiv:1611.01578, 2016. [[pdf]](https://arxiv.org/pdf/1611.01578.pdf)
> Samples architectures and trains them, updating the controller by policy gradient (RL). On CIFAR only. A toy sketch of the policy-gradient loop follows below.
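
Not the paper's actual controller (an RNN that emits a layer-by-layer description), but a minimal sketch of the underlying policy-gradient loop, with a made-up two-decision search space and a stubbed `evaluate`: sample discrete choices from learned softmax distributions, treat validation accuracy as the reward, and update the logits with REINFORCE.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up search space: one filter count and one kernel size (not NASNet's).
CHOICES = {"filters": [16, 32, 64], "kernel": [3, 5, 7]}
logits = {k: np.zeros(len(v)) for k, v in CHOICES.items()}  # controller params

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sample_architecture():
    """Sample an index for each decision from the controller's softmax."""
    return {k: rng.choice(len(v), p=softmax(logits[k])) for k, v in CHOICES.items()}

def evaluate(arch):
    """Stub for 'train the child network, return validation accuracy'.
    Returns a fake score (arch holds indices into CHOICES)."""
    return 0.5 + 0.1 * arch["filters"] - 0.05 * arch["kernel"]

lr, baseline = 0.1, 0.0
for step in range(200):
    arch = sample_architecture()
    reward = evaluate(arch)
    baseline = 0.9 * baseline + 0.1 * reward      # moving-average baseline
    advantage = reward - baseline
    for k, idx in arch.items():                   # REINFORCE: (onehot - p) * A
        grad = -softmax(logits[k])
        grad[idx] += 1.0
        logits[k] += lr * advantage * grad

print({k: CHOICES[k][int(np.argmax(logits[k]))] for k in CHOICES})
```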
**MetaQNN**
Baker B, Gupta O, Naik N, et al. Designing neural network architectures using reinforcement learning[J]. arXiv preprint arXiv:1611.02167, 2016. [[pdf]](https://arxiv.org/pdf/1611.02167.pdf)
> Uses Q-learning (RL) to choose the next layer (conv/pool); a toy sketch follows below.
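
A toy tabular sketch of that idea, under loud assumptions: a three-action layer vocabulary, a stubbed accuracy as the only (terminal) reward, and a Monte-Carlo-style value update rather than the paper's exact Q-learning rule with experience replay.

```python
import random
from collections import defaultdict

random.seed(0)

ACTIONS = ["conv", "pool", "term"]   # made-up layer vocabulary
MAX_DEPTH = 4
Q = defaultdict(float)               # Q[(depth, action)] -> value estimate

def evaluate(layers):
    """Stub for 'train the sampled network, return validation accuracy'."""
    return 0.5 + 0.1 * layers.count("conv") - 0.05 * layers.count("pool")

def rollout(eps):
    """Build one architecture layer by layer with an epsilon-greedy policy."""
    depth, layers, trajectory = 0, [], []
    while depth < MAX_DEPTH:
        if random.random() < eps:
            act = random.choice(ACTIONS)
        else:
            act = max(ACTIONS, key=lambda a: Q[(depth, a)])
        trajectory.append((depth, act))
        if act == "term":
            break
        layers.append(act)
        depth += 1
    return layers, trajectory

alpha, eps = 0.1, 1.0
for episode in range(500):
    layers, traj = rollout(eps)
    reward = evaluate(layers)                    # reward only at termination
    for depth, act in traj:
        Q[(depth, act)] += alpha * (reward - Q[(depth, act)])
    eps = max(0.1, eps * 0.99)                   # anneal exploration

print(rollout(eps=0.0)[0])                       # greedy final architecture
```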
## State-of-the-art Papers
**NASNet-A (Google)**
Zoph B, Vasudevan V, Shlens J, et al. Learning transferable architectures for scalable image recognition[J]. arXiv preprint arXiv:1707.07012, 2017. [[pdf]](https://arxiv.org/pdf/1707.07012.pdf)
> A modified version of NASNet: an RNN controller trained with PPO (RL) searches on CIFAR, and the same cell is transferred to ImageNet.
**Hierarchical Representations (Google)**
Liu H, Simonyan K, Vinyals O, et al. Hierarchical representations for efficient architecture search[J]. arXiv preprint arXiv:1711.00436, 2017. [[pdf]](https://arxiv.org/pdf/1711.00436.pdf)
> An interesting hierarchical search space + an evolutionary algorithm.
>
> Slightly worse than the other two.

**PNASNet (Google)**
Liu C, Zoph B, Shlens J, et al. Progressive neural architecture search[J]. arXiv preprint arXiv:1712.00559, 2017. [[pdf]](https://arxiv.org/pdf/1712.00559.pdf)
> Uses sequential model-based optimization (SMBO), similar in spirit to the A* algorithm. Searches from simple to complex cells, training a surrogate function to predict the performance of candidates at the next step.
>
> Same accuracy as NASNet-A, with roughly 2x faster search.

**AmoebaNets (Google)**
Real E, Aggarwal A, Huang Y, et al. Regularized evolution for image classifier architecture search[J]. arXiv preprint arXiv:1802.01548, 2018. [[pdf]](https://arxiv.org/pdf/1802.01548.pdf)
> Same search space as NASNet-A, with a more detailed comparison of search methods. Claims that evolution outperforms RL.
>
> Introduced regularized evolution, a simple modification of the traditional evolutionary algorithm: the oldest individual in the population is removed instead of the worst. A minimal sketch follows below.
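
A minimal sketch of regularized (aging) evolution with a made-up genome and a stubbed `evaluate`; the only change from plain tournament evolution is the `popleft()` that retires the oldest individual rather than the worst one.

```python
import random
from collections import deque

random.seed(0)

WIDTHS = [16, 32, 64]

def random_architecture():
    """Made-up genome: a list of four layer widths (not the NASNet space)."""
    return [random.choice(WIDTHS) for _ in range(4)]

def mutate(arch):
    child = list(arch)
    child[random.randrange(len(child))] = random.choice(WIDTHS)
    return child

def evaluate(arch):
    """Stub for 'train the child, return validation accuracy'."""
    return sum(arch) / (max(WIDTHS) * len(arch))

POP_SIZE, SAMPLE_SIZE, CYCLES = 20, 5, 200
population = deque((a, evaluate(a)) for a in
                   (random_architecture() for _ in range(POP_SIZE)))

for _ in range(CYCLES):
    sample = random.sample(list(population), SAMPLE_SIZE)
    parent = max(sample, key=lambda ind: ind[1])     # best of the sample
    child = mutate(parent[0])
    population.append((child, evaluate(child)))      # newest on the right
    population.popleft()   # the regularization: drop the OLDEST, not the worst

print(max(population, key=lambda ind: ind[1]))
```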
## Other papers
Real E, Moore S, Selle A, et al. Large-scale evolution of image classifiers[J]. arXiv preprint arXiv:1703.01041, 2017. [[pdf]](https://arxiv.org/pdf/1703.01041.pdf)
> Evolutionary algorithm; children inherit most weights from their parent after every mutation.
Sun Y, Xue B, Zhang M. Evolving deep convolutional neural networks for image classification[J]. arXiv preprint arXiv:1710.10741, 2017. [[pdf]](https://arxiv.org/pdf/1710.10741.pdf)
> Genetic algorithm (networks encoded as chromosomes).
Cai H, Chen T, Zhang W, et al. Reinforcement learning for architecture search by network transformation[J]. arXiv preprint arXiv:1707.04873, 2017. [[pdf]](https://arxiv.org/pdf/1707.04873.pdf)
> The Net2Net idea + RL: a trained network is grown with function-preserving transformations chosen by an RL agent.
Elsken T, Metzen J H, Hutter F. Simple and efficient architecture search for convolutional neural networks[J]. arXiv preprint arXiv:1711.04528, 2017. [[pdf]](https://arxiv.org/pdf/1711.04528.pdf)
> Network morphisms (a generalization of Net2Net); a sketch of function-preserving widening follows below.
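
A sketch of the core Net2WiderNet morphism for one dense layer in numpy; the names and shapes are illustrative, not from either paper's code. Replicated units get their outgoing weights divided by their replication count, so the widened network computes exactly the same function.

```python
import numpy as np

rng = np.random.default_rng(0)

def net2wider(W1, b1, W2, new_width):
    """Function-preserving widening of a hidden layer.
    W1: (d_in, h) incoming weights, b1: (h,) bias, W2: (h, d_out) outgoing."""
    h = W1.shape[1]
    # Keep all original units, then copy randomly chosen ones.
    mapping = np.concatenate([np.arange(h),
                              rng.integers(0, h, size=new_width - h)])
    counts = np.bincount(mapping, minlength=h)   # replication count per unit
    W1_new = W1[:, mapping]
    b1_new = b1[mapping]
    W2_new = W2[mapping, :] / counts[mapping][:, None]
    return W1_new, b1_new, W2_new

# Sanity check: widening a ReLU layer from 4 to 7 units changes nothing.
x = rng.normal(size=(8, 10))
W1, b1, W2 = rng.normal(size=(10, 4)), rng.normal(size=4), rng.normal(size=(4, 3))
W1w, b1w, W2w = net2wider(W1, b1, W2, new_width=7)
y_old = np.maximum(x @ W1 + b1, 0) @ W2
y_new = np.maximum(x @ W1w + b1w, 0) @ W2w
print(np.allclose(y_old, y_new))  # True
```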
Zhong Z, Yan J, Liu C L. Practical network blocks design with Q-learning[J]. arXiv preprint arXiv:1708.05552, 2017. [[pdf]](https://arxiv.org/pdf/1708.05552.pdf)
> Q-learning + performance prediction.
Baker B, Gupta O, Raskar R, et al. Accelerating neural architecture search using performance prediction[J]. arXiv preprint arXiv:1705.10823, 2017. [[pdf]](https://arxiv.org/pdf/1705.10823.pdf)
> Predicts final performance from early training statistics and the network architecture; a minimal sketch follows below.
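
A minimal least-squares sketch of the idea on synthetic data; the paper studies richer learning-curve and architecture features with several regressors, so everything below is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(curve):
    """A few simple features of a partial learning curve: last value, mean,
    slope, plus a bias term. Illustrative only."""
    slope = np.polyfit(np.arange(len(curve)), curve, 1)[0]
    return np.array([curve[-1], curve.mean(), slope, 1.0])

# Synthetic 'experiments': 50 runs, each a 5-epoch accuracy curve whose
# final accuracy is correlated with its early epochs.
curves = np.clip(rng.uniform(0.3, 0.6, (50, 1))
                 + 0.05 * np.arange(5)
                 + rng.normal(0, 0.01, (50, 5)), 0, 1)
final_acc = np.clip(curves[:, -1] + 0.1 + rng.normal(0, 0.02, 50), 0, 1)

X = np.stack([features(c) for c in curves])
w, *_ = np.linalg.lstsq(X, final_acc, rcond=None)   # ordinary least squares

# Predict a new run's final accuracy from its early epochs only, so that
# unpromising architectures can be stopped early.
new_curve = np.array([0.42, 0.47, 0.51, 0.55, 0.58])
print("predicted final accuracy:", features(new_curve) @ w)
```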
Brock A, Lim T, Ritchie J M, et al. SMASH: one-shot model architecture search through hypernetworks[J]. arXiv preprint arXiv:1708.05344, 2017. [[pdf]](https://arxiv.org/pdf/1708.05344.pdf)
> A fancy idea: train a hypernetwork to predict the weights of every candidate architecture.
>
> However, I was not able to replicate their results using their released code. A heavily simplified sketch of the one-shot idea follows below.
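
An all-linear caricature of the one-shot mechanism, with every name and dimension made up: a hypernetwork maps an architecture encoding to model weights, is trained across randomly sampled architectures, and then ranks architectures without training any of them individually. SMASH itself uses a deep hypernetwork over CNN encodings; only the train-then-rank structure is kept here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 4 'architectures' as one-hot codes; each defines its own
# linear regression task (a stand-in for 'train this network on the data').
n_arch, d_in, n = 4, 6, 256
X = rng.normal(size=(n, d_in))
true_W = rng.normal(size=(n_arch, d_in))
def data_for(a):
    return X, X @ true_W[a]

H = 0.1 * rng.normal(size=(d_in, n_arch))   # linear hypernetwork params
lr = 0.01
for step in range(2000):
    a = rng.integers(n_arch)                # sample a random architecture
    e = np.eye(n_arch)[a]                   # its one-hot encoding
    w = H @ e                               # hypernet predicts its weights
    Xb, yb = data_for(a)
    err = Xb @ w - yb
    grad_w = 2 * Xb.T @ err / len(yb)       # d(mse)/dw
    H -= lr * np.outer(grad_w, e)           # chain rule back into H

# One-shot ranking: score every architecture with predicted weights only.
for a in range(n_arch):
    Xb, yb = data_for(a)
    w = H @ np.eye(n_arch)[a]
    print(a, np.mean((Xb @ w - yb) ** 2))
```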
## Some other objectives
#### Searching for LSTM cell structure
Jozefowicz R, Zaremba W, Sutskever I. An empirical exploration of recurrent network architectures[C]. ICML, 2015.
#### Searching for activation functions
Ramachandran P, Zoph B, Le Q V. Searching for activation functions[J]. arXiv preprint arXiv:1710.05941, 2017. [[pdf]](https://arxiv.org/pdf/1710.05941.pdf)
> Discovered the Swish activation function, f(x) = x * sigmoid(x), through brute-force search. In my experiments it boosts ImageNet classification accuracy by 1~2%. A one-line implementation follows below.
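
For reference, Swish in numpy; the paper also considers a trainable scale beta, with beta = 1 giving the variant quoted above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def swish(x, beta=1.0):
    """Swish: f(x) = x * sigmoid(beta * x). beta=1 gives x * sigmoid(x)."""
    return x * sigmoid(beta * x)

x = np.linspace(-5.0, 5.0, 5)
print(swish(x))   # smooth and non-monotonic; approaches ReLU as beta grows
```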