CV · Nov 30, 2020

Inter-layer Transition in Neural Architecture Search

arXiv:2011.14525v1 · 5 citations
AI Analysis

This work offers an incremental improvement for researchers and practitioners working on differentiable Neural Architecture Search: it models the dependencies between the architecture weights of connected edges instead of treating them as independent.

This paper addresses inter-layer dependency in differentiable Neural Architecture Search (NAS), where existing approaches treat the architecture weights on each edge as independent. The proposed Inter-layer Transition NAS method casts architecture optimization as a sequential decision process that explicitly models the dependency between connected edges, and it reports results that outperform state-of-the-art methods on five benchmarks.

Differentiable Neural Architecture Search (NAS) methods represent the network architecture as a repetitive proxy directed acyclic graph (DAG) and optimize the network weights and architecture weights alternately in a differentiable manner. However, existing methods model the architecture weights on each edge (i.e., a layer in the network) as statistically independent variables, ignoring the dependency between edges in the DAG induced by their directed topological connections. In this paper, we make the first attempt to investigate such dependency by proposing a novel Inter-layer Transition NAS method. It casts the architecture optimization into a sequential decision process where the dependency between the architecture weights of connected edges is explicitly modeled. Specifically, edges are divided into inner and outer groups according to whether or not their predecessor edges are in the same cell. While the architecture weights of outer edges are optimized independently, those of inner edges are derived sequentially from the architecture weights of their predecessor edges and learnable transition matrices, in an attentive probability transition manner. Experiments on five benchmarks confirm the value of modeling inter-layer dependency and demonstrate that the proposed method outperforms state-of-the-art methods.
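The abstract does not give the exact update rule, so the following is only a minimal sketch of one plausible reading of "attentive probability transition". It assumes that each outer edge holds a softmax distribution over candidate operations, that each predecessor passes its distribution through a row-stochastic transition matrix, and that an inner edge mixes those results with softmax attention weights. The function names, the row-softmax parameterization, and the attention logits are all hypothetical and are not taken from the paper or its code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def transition(pred_probs, transition_logits):
    """Push a predecessor's operation distribution through a transition matrix.

    Assumption: the learnable matrix is stored as logits and made
    row-stochastic with a row-wise softmax, so the output is a distribution.
    """
    rows = [softmax(row) for row in transition_logits]
    num_ops = len(rows[0])
    return [
        sum(pred_probs[i] * rows[i][j] for i in range(len(pred_probs)))
        for j in range(num_ops)
    ]


def inner_edge_probs(pred_probs_list, transition_logits_list, attention_logits):
    """Derive an inner edge's distribution from its predecessor edges.

    Assumption: predecessors are combined by a softmax-attention-weighted sum.
    """
    attention = softmax(attention_logits)
    num_ops = len(transition_logits_list[0][0])
    mixed = [0.0] * num_ops
    for weight, probs, logits in zip(attention, pred_probs_list, transition_logits_list):
        propagated = transition(probs, logits)
        for j in range(num_ops):
            mixed[j] += weight * propagated[j]
    return mixed


# Two outer edges, each with its own independent architecture logits
# over three hypothetical candidate operations.
outer_a = softmax([0.2, -0.1, 0.5])
outer_b = softmax([1.0, 0.0, -1.0])

# One hypothetical 3x3 transition-logit matrix per predecessor edge.
t_a = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
t_b = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

inner = inner_edge_probs([outer_a, outer_b], [t_a, t_b], [0.3, -0.3])
print(inner)
```

Under these assumptions the inner edge carries no free architecture logits of its own: its distribution is fully determined by its predecessors, the transition matrices, and the attention weights, which is what lets the dependency between connected edges be learned.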

Code Implementations: 1 repo
