Neural Architecture Search with Reinforcement Learning by Zoph, B., & Le, Q. V. (2017)

(Zoph and Le 2017)


This paper introduces the idea of using an RNN controller to generate the operations of a neural network. In the first setting, the authors use this method to construct CNNs: the controller samples an architecture, the architecture is built and trained, and the controller is rewarded with the maximum validation accuracy of the last 5 epochs, cubed (the reason for the cube is unclear).
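
A minimal sketch of the sample, train, reward loop described above, in Python. It is not the authors' implementation: the RNN controller is replaced by independent softmax distributions (one per decision), updated with a REINFORCE-style policy gradient and a moving-average baseline, and `train_child` is a hypothetical stand-in that fakes a validation curve instead of training a network. The search space, learning rate and baseline decay are also assumptions. Only the reward definition (maximum validation accuracy of the last 5 epochs, cubed) is taken from the notes.

```python
import math
import random

# Hypothetical search space: one categorical decision per entry (assumption).
SEARCH_SPACE = {
    "filter_height": [1, 3, 5, 7],
    "filter_width": [1, 3, 5, 7],
    "num_filters": [24, 36, 48, 64],
}


def reward_from_history(val_accuracies):
    """Reward from the notes: max validation accuracy of the last 5 epochs, cubed."""
    return max(val_accuracies[-5:]) ** 3


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_architecture(logits, rng):
    """Sample one option index per decision from the current policy."""
    arch = {}
    for name, options in SEARCH_SPACE.items():
        probs = softmax(logits[name])
        arch[name] = rng.choices(range(len(options)), weights=probs)[0]
    return arch


def train_child(arch, rng):
    """Stand-in for building and training the sampled network (assumption).

    Returns a fake 10-epoch validation accuracy curve in which larger
    option indices happen to score higher.
    """
    max_index_sum = sum(len(opts) - 1 for opts in SEARCH_SPACE.values())
    quality = sum(arch.values()) / max_index_sum
    return [
        0.5 + 0.4 * quality * (epoch + 1) / 10 + rng.uniform(-0.01, 0.01)
        for epoch in range(10)
    ]


def reinforce_step(logits, arch, advantage, lr):
    """Policy-gradient update: d log softmax / d logit_i = one_hot_i - prob_i."""
    for name, choice in arch.items():
        probs = softmax(logits[name])
        for i, p in enumerate(probs):
            grad = (1.0 if i == choice else 0.0) - p
            logits[name][i] += lr * advantage * grad


def search(num_samples=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    logits = {name: [0.0] * len(opts) for name, opts in SEARCH_SPACE.items()}
    baseline = None
    best = (float("-inf"), None)
    for _ in range(num_samples):
        arch = sample_architecture(logits, rng)
        reward = reward_from_history(train_child(arch, rng))
        if baseline is None:
            baseline = reward  # initialise the baseline with the first reward
        reinforce_step(logits, arch, reward - baseline, lr)
        baseline = 0.95 * baseline + 0.05 * reward  # moving-average baseline
        if reward > best[0]:
            best = (reward, arch)
    return best, logits
```

Calling `search()` returns the best (reward, architecture) pair seen and the final logits. In the real setting each call to `train_child` is a full training run, which is where the cost of the search comes from.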

Another experiment uses this search method to produce a recurrent cell, through a complicated model based on a tree of units: for each unit, the controller samples an operation.
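
To make the tree idea concrete, here is a heavily simplified Python sketch. It assumes the smallest possible tree (two leaf units and one root unit), treats the inputs as scalars, and has each unit's "operation" consist of a combination method followed by an activation. The op sets, the function names and the uniform sampling (in place of the RNN controller) are illustrative assumptions; learned weight matrices and the handling of the memory cell are left out entirely.

```python
import math
import random

# Illustrative op sets (assumptions, not an exhaustive list from the paper).
COMBINE = {
    "add": lambda a, b: a + b,
    "elem_mult": lambda a, b: a * b,
}
ACTIVATE = {
    "identity": lambda x: x,
    "tanh": math.tanh,
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "relu": lambda x: max(0.0, x),
}


def sample_cell(rng):
    """Sample a (combine, activation) pair for each of the 3 units.

    Uniform sampling stands in for the controller here.
    Units 0 and 1 are the leaves, unit 2 is the root.
    """
    return [
        (rng.choice(sorted(COMBINE)), rng.choice(sorted(ACTIVATE)))
        for _ in range(3)
    ]


def apply_unit(spec, a, b):
    """Combine two inputs, then apply the unit's activation."""
    combine, activation = spec
    return ACTIVATE[activation](COMBINE[combine](a, b))


def run_cell(cell, x_t, h_prev):
    """Compute the next hidden value from the input and previous hidden value."""
    left = apply_unit(cell[0], x_t, h_prev)
    right = apply_unit(cell[1], x_t, h_prev)
    return apply_unit(cell[2], left, right)
```

For example, `run_cell(sample_cell(random.Random(0)), 0.5, -0.5)` evaluates one randomly sampled cell on a single time step.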


There are a lot of hyperparameters everywhere, given without any justification: learning rates, weights, scores and random operations. Overall, the paper shows that the method can produce architectures that perform comparably to human-designed ones. However, nothing is said about whether a random search could do the same or better. The lesson may be that all those tasks are pretty much solved once you have the right operations (convolutions for images and recurrent cells for Penn Treebank and language modeling).

Search is really expensive (22,400 GPU-days: 800 GPUs for 28 days).
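
Multiplying the quoted figures out shows which unit applies: 800 GPUs for 28 days is 22,400 GPU-days, and the GPU-hour count is 24 times larger. A quick check in Python, using only the two numbers quoted in these notes:

```python
gpus = 800
days = 28

gpu_days = gpus * days      # GPU-days of search
gpu_hours = gpu_days * 24   # the same budget expressed in GPU-hours

print(gpu_days, gpu_hours)
```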


Zoph, Barret, and Quoc V. Le. 2017. “Neural Architecture Search with Reinforcement Learning.” arXiv:1611.01578 [Cs], February.
