LG · May 11, 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis

arXiv:2205.05662v2 · 9 citations · h-index: 81 · Has Code
Originality: Incremental advance
AI Analysis

This work provides a theoretical analysis for improving neural architecture search efficiency, but it is incremental as it builds on existing NNGP methods.

The authors tackled the problem of understanding how connectivity patterns in deep neural networks affect convergence, showing that filtering unpromising patterns can significantly accelerate neural architecture search.

Advanced deep neural networks (DNNs), designed either by humans or by AutoML algorithms, are growing increasingly complex. Diverse operations are connected by complicated connectivity patterns, e.g., various types of skip connections. These topological compositions are empirically effective and are generally observed to smooth the loss landscape and facilitate gradient flow. However, a principled understanding of their effects on DNN capacity or trainability remains elusive, as does an explanation of why, or in which respect, one specific connectivity pattern is better than another. In this work, we theoretically characterize, at fine granularity, the impact of connectivity patterns on the convergence of DNNs under gradient descent training. By analyzing a wide network's Neural Network Gaussian Process (NNGP), we are able to depict how the spectrum of an NNGP kernel propagates through a particular connectivity pattern, and how that affects the bound on the convergence rate. As one practical implication of our results, we show that by a simple filtration of "unpromising" connectivity patterns, we can trim down the number of models to evaluate and significantly accelerate large-scale neural architecture search without any overhead. Code is available at: https://github.com/VITA-Group/architecture_convergence.
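The idea of propagating an NNGP kernel through a connectivity pattern and then filtering patterns by a spectral score can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes, for illustration only: a DAG whose edges are either a skip (identity) or a linear layer followed by ReLU with He-scaled weights and no bias, a node kernel equal to the sum of its incoming edge kernels, and the condition number of the final kernel as a stand-in score where lower is treated as more promising. The names `relu_nngp`, `propagate`, `condition_number`, and `filter_patterns` are hypothetical.

```python
import numpy as np

def relu_nngp(K):
    """NNGP kernel after one linear + ReLU layer (weight variance 2, no bias).

    Uses the closed-form arc-cosine kernel for ReLU.
    """
    d = np.sqrt(np.diag(K))
    norm = np.outer(d, d)
    cos = np.clip(K / norm, -1.0, 1.0)
    theta = np.arccos(cos)
    return norm * (np.sin(theta) + (np.pi - theta) * cos) / np.pi

# Edge operations assumed for this sketch.
OPS = {"skip": lambda K: K, "linear": relu_nngp}

def propagate(K0, edges, num_nodes):
    """Propagate the input kernel K0 through a DAG.

    edges maps (src, dst) with src < dst to an op name; node 0 holds K0.
    Assumption: a node's kernel is the sum of its incoming edge kernels.
    """
    kernels = [K0] + [None] * (num_nodes - 1)
    for dst in range(1, num_nodes):
        incoming = [(s, op) for (s, t), op in edges.items() if t == dst]
        if not incoming:
            raise ValueError(f"node {dst} has no incoming edge")
        kernels[dst] = sum(OPS[op](kernels[s]) for s, op in incoming)
    return kernels[-1]

def condition_number(K):
    """Ratio of largest to smallest eigenvalue of a symmetric PSD kernel."""
    eig = np.linalg.eigvalsh(K)  # ascending order
    return eig[-1] / eig[0]

def filter_patterns(K0, patterns, num_nodes, keep):
    """Keep the `keep` patterns with the lowest condition number (assumed proxy)."""
    scores = {name: condition_number(propagate(K0, e, num_nodes))
              for name, e in patterns.items()}
    return sorted(scores, key=scores.get)[:keep], scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((8, 16))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
    K0 = X @ X.T

    patterns = {
        "chain": {(0, 1): "linear", (1, 2): "linear", (2, 3): "linear"},
        "chain+skips": {(0, 1): "linear", (1, 2): "linear", (2, 3): "linear",
                        (0, 2): "skip", (1, 3): "skip"},
        "all-skip": {(0, 1): "skip", (1, 2): "skip", (2, 3): "skip"},
    }
    kept, scores = filter_patterns(K0, patterns, num_nodes=4, keep=2)
    for name, score in scores.items():
        print(f"{name}: condition number {score:.3f}")
    print("kept:", kept)
```

Scoring a pattern here needs only a few small matrix operations and no training, which is the sense in which such a filter can run before any candidate model is evaluated. Which spectral quantity the paper actually bounds, and in which direction it filters, should be taken from the paper itself rather than from this sketch.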

