LG ML | Mar 31, 2019

Understanding Neural Architecture Search Techniques

arXiv:1904.00438v2 | 16.346 citations

Originality: Incremental advance

AI Analysis

This work identifies a critical failure mode in neural architecture search that matters to researchers and practitioners, but the advance is incremental because it builds on existing ENAS methods.

The paper reports that Efficient Neural Architecture Search (ENAS) does not perform significantly better than random search with weight sharing, contradicting prior claims. It traces this to the RNN controller failing to condition on past architecture choices, and proposes a fix that improves hidden state interpretability by increasing the correlation between controller hidden states and graph similarity metrics.

Automatic methods for generating state-of-the-art neural network architectures without human experts have generated significant attention recently. This is because of the potential to remove human experts from the design loop, which can reduce costs and decrease time to model deployment. Neural architecture search (NAS) techniques have improved significantly in their computational efficiency since the original NAS was proposed. This reduction in computation is enabled via weight sharing, as in Efficient Neural Architecture Search (ENAS). However, a recent body of work confirms our discovery that ENAS does not do significantly better than random search with weight sharing, contradicting the initial claims of the authors. We provide an explanation for this phenomenon by investigating the interpretability of the ENAS controller's hidden state. We find models sampled from identical controller hidden states have no correlation with various graph similarity metrics, so no notion of structural similarity is learned. This failure mode implies the RNN controller does not condition on past architecture choices. Lastly, we propose a solution to this failure mode by forcing the controller's hidden state to encode past decisions by training it with a memory buffer of previously sampled architectures. Doing this improves hidden state interpretability by increasing the correlation between controller hidden states and graph similarity metrics.
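The abstract's diagnostic, correlating controller hidden states with graph similarity metrics, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: architectures are represented as adjacency matrices, graph similarity is the Jaccard index over edge sets, hidden state similarity is cosine similarity, and the correlation is Pearson's r over all pairs. The function names are hypothetical.

```python
import itertools
import numpy as np


def edge_jaccard(a, b):
    """Jaccard similarity between the edge sets of two adjacency matrices."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty graphs are treated as identical
    return float(np.logical_and(a, b).sum() / union)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def hidden_graph_correlation(hidden_states, adjacencies):
    """Pearson correlation between pairwise hidden-state similarity and
    pairwise graph similarity (both choices of metric are assumptions)."""
    pairs = list(itertools.combinations(range(len(hidden_states)), 2))
    h_sim = [cosine(hidden_states[i], hidden_states[j]) for i, j in pairs]
    g_sim = [edge_jaccard(adjacencies[i], adjacencies[j]) for i, j in pairs]
    return float(np.corrcoef(h_sim, g_sim)[0, 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 20 random directed graphs on 5 nodes.
    graphs = [(rng.random((5, 5)) < 0.5).astype(int) for _ in range(20)]
    # "Informative" hidden states that literally encode the graph structure.
    informative = [g.flatten().astype(float) for g in graphs]
    # "Uninformative" hidden states drawn independently of the graphs.
    uninformative = [rng.normal(size=25) for _ in graphs]
    print("informative:  ", hidden_graph_correlation(informative, graphs))
    print("uninformative:", hidden_graph_correlation(uninformative, graphs))
```

In this toy setup, hidden states that encode the sampled graph should yield a clearly positive correlation, while hidden states drawn independently of the graphs should yield a correlation near zero, which is the kind of signature the abstract attributes to the ENAS controller.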
