LG ML | Feb 19, 2020

NeuroFabric: Identifying Ideal Topologies for Training A Priori Sparse Networks

arXiv:2002.08339v1
AI Analysis

This work addresses long training times in deep neural networks by providing a theoretical foundation for choosing the topology of sparse networks. It is an incremental but practical contribution for machine learning researchers focused on efficiency.

The paper tackles the problem of selecting optimal sparse topologies for training a priori sparse neural networks, which aim to reduce memory and compute requirements while maintaining high information bandwidth. The authors develop a data-free heuristic to evaluate topologies, derive requirements for good topologies, and identify a single topology that satisfies all criteria, showing that seemingly similar topologies can have large differences in attainable accuracy.

Long training times of deep neural networks are a bottleneck in machine learning research. The major impediment to fast training is the quadratic growth of both memory and compute requirements of dense and convolutional layers with respect to their information bandwidth. Recently, training 'a priori' sparse networks has been proposed as a method for allowing layers to retain high information bandwidth, while keeping memory and compute low. However, the choice of which sparse topology should be used in these networks is unclear. In this work, we provide a theoretical foundation for the choice of intra-layer topology. First, we derive a new sparse neural network initialization scheme that allows us to explore the space of very deep sparse networks. Next, we evaluate several topologies and show that seemingly similar topologies can often have a large difference in attainable accuracy. To explain these differences, we develop a data-free heuristic that can evaluate a topology independently from the dataset the network will be trained on. We then derive a set of requirements that make a good topology, and arrive at a single topology that satisfies all of them.
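The scaling argument in the abstract can be illustrated with a minimal sketch. It assumes layer width as a stand-in for information bandwidth and uses a simple cyclic connectivity pattern that is purely hypothetical: it is not one of the topologies evaluated in the paper, and the names `fan_in`, `cyclic_mask`, and `masked_forward` are introduced here for illustration only. The point is that a dense layer's weight count grows quadratically with width, while a layer whose connectivity is fixed before training at a constant fan-in grows linearly.

```python
def dense_param_count(width):
    """Weights in a dense layer mapping `width` inputs to `width` outputs."""
    return width * width


def sparse_param_count(width, fan_in):
    """Weights in a layer where each output reads only `fan_in` inputs."""
    return width * fan_in


def cyclic_mask(width, fan_in):
    """Fixed 0/1 connectivity mask chosen before training.

    Hypothetical topology (an assumption, not from the paper):
    output i is connected to inputs i, i+1, ..., i+fan_in-1 (mod width).
    """
    mask = [[0] * width for _ in range(width)]
    for out in range(width):
        for offset in range(fan_in):
            mask[out][(out + offset) % width] = 1
    return mask


def masked_forward(weights, mask, inputs):
    """Compute y = (W * M) x, summing only over connections the mask keeps."""
    return [
        sum(w * x for w, m, x in zip(w_row, m_row, inputs) if m)
        for w_row, m_row in zip(weights, mask)
    ]


if __name__ == "__main__":
    # Doubling the width quadruples the dense count but only doubles the sparse one.
    for width in (256, 512, 1024):
        print(width, dense_param_count(width), sparse_param_count(width, fan_in=16))
```

Because the mask is fixed up front, only the kept connections ever need to be stored or updated. Which mask to pick is exactly the question the paper studies; this sketch does not attempt to answer it.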

