ML · CV · LG · Oct 2, 2019

How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections?

arXiv:1910.00780v3 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently designing and optimizing deep neural networks for computer vision tasks. It offers a method to identify high-performing models without extensive training, though the advance is incremental because it builds on existing skip-connection architectures.

The paper studies how skip-connection topology affects gradient propagation and model performance in deep networks. It introduces a metric, NN-Mass, that predicts test performance and supports designing compressed models directly at initialization: models with similar NN-Mass reach similar accuracy despite significantly different size/compute requirements. Compressed DenseNets are demonstrated on CIFAR-10 and compressed MobileNets on ImageNet.

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of the concatenation-type skip connections is closely related to the gradient propagation which, in turn, enables a predictable behavior of DNNs' test performance. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy, despite having significantly different size/compute requirements. Detailed experiments on both synthetic and real datasets (e.g., MNIST, CIFAR-10, CIFAR-100, ImageNet) provide extensive evidence for our insights. Finally, the closed-form equation of our NN-Mass enables us to design significantly compressed DenseNets (for CIFAR-10) and MobileNets (for ImageNet) directly at initialization without time-consuming training and/or searching.
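The abstract's distinction between concatenation-type and addition-type skip connections can be illustrated with a small sketch. This is not the paper's code and does not compute NN-Mass; the helper names (`dense_layer`, `concat_skip`, `add_skip`) and the toy fully connected layer are hypothetical, chosen only to show how each skip type changes the feature width.

```python
import random

def dense_layer(x, out_width, rng):
    """Toy fully connected layer with ReLU (hypothetical helper, random weights)."""
    weights = [[rng.gauss(0.0, 1.0) for _ in x] for _ in range(out_width)]
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def concat_skip(x, growth, rng):
    """Concatenation-type skip (DenseNet-style): keep the input, append new features."""
    return x + dense_layer(x, growth, rng)  # list concatenation: width grows by `growth`

def add_skip(x, rng):
    """Addition-type skip (ResNet-style): add the layer output to the input elementwise."""
    fx = dense_layer(x, len(x), rng)  # output width must match the input width
    return [a + b for a, b in zip(x, fx)]

rng = random.Random(0)
x = [1.0, -2.0, 0.5, 3.0]

dense_out = concat_skip(x, growth=2, rng=rng)
res_out = add_skip(x, rng=rng)

print(len(dense_out), len(res_out))  # prints "6 4"
```

With concatenation the input features pass through unchanged and the width grows at every layer, whereas addition keeps the width fixed and mixes the input into the output.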

Code Implementations: 2 repos
