LG, AI · Nov 9, 2021

Leveraging the Graph Structure of Neural Network Training Dynamics

Fatemeh Vahedian, Ruiyu Li, Puja Trivedi, Di Jin, Danai Koutra

arXiv:2111.05410v25.53 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more efficient training and performance prediction in deep learning, though it appears incremental as it builds on existing graph-based methods.

The authors address the problem of understanding neural network training dynamics by proposing a temporal graph framework that captures how a network changes over training epochs. They show that a summary extracted from early epochs (<5) accurately predicts final task performance across four architectures and two image datasets.

Understanding the training dynamics of deep neural networks (DNNs) is important as it can lead to improved training efficiency and task performance. Recent works have demonstrated that representing the wiring of a DNN as a static graph cannot capture how DNNs change over the course of training. Thus, in this work, we propose a compact, expressive temporal graph framework that effectively captures the dynamics of many workhorse architectures in computer vision. Specifically, it extracts an informative summary of graph properties (e.g., eigenvector centrality) over a sequence of DNN graphs obtained during training. We demonstrate that our framework captures useful dynamics by accurately predicting trained task performance when using a summary over early training epochs (<5) across four different architectures and two image datasets. Moreover, by using a novel, highly scalable DNN graph representation, we also show that the proposed framework captures generalizable dynamics, as summaries extracted from smaller-width networks are effective when evaluated on larger widths.
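To make the idea of "a summary of graph properties over a sequence of DNN graphs" concrete, here is a minimal Python sketch. It is not the paper's graph representation or code. It assumes a toy fully connected network, treats each neuron as a node, uses the absolute weight as the edge weight, and summarizes each epoch by the mean eigenvector centrality per layer. The function names (`mlp_to_adjacency`, `eigenvector_centrality`, `summarize_training`) and the weight values are hypothetical.

```python
import math


def mlp_to_adjacency(weights):
    """Build a symmetric weighted adjacency matrix from MLP weight matrices.

    weights[l][j][i] is the weight from unit i in layer l to unit j in layer l+1.
    Assumption for this sketch: nodes are neurons, edge weight is |w|.
    Returns (adjacency, layers), where layers[k] is the node index range of layer k.
    """
    sizes = [len(weights[0][0])] + [len(w) for w in weights]
    offsets = [0]
    for size in sizes:
        offsets.append(offsets[-1] + size)
    n = offsets[-1]
    adj = [[0.0] * n for _ in range(n)]
    for l, w in enumerate(weights):
        for j, row in enumerate(w):
            for i, value in enumerate(row):
                u, v = offsets[l] + i, offsets[l + 1] + j
                adj[u][v] = adj[v][u] = abs(value)
    layers = [range(offsets[k], offsets[k + 1]) for k in range(len(sizes))]
    return adj, layers


def eigenvector_centrality(adj, iters=500, tol=1e-12):
    """Power iteration for the dominant eigenvector, normalized to unit L2 norm."""
    n = len(adj)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        # Iterate with (A + I): it has the same eigenvectors as A, but avoids
        # the oscillation that plain power iteration shows on layered
        # (bipartite) graphs.
        y = [x[u] + sum(adj[u][v] * x[v] for v in range(n)) for u in range(n)]
        norm = math.sqrt(sum(c * c for c in y))
        y = [c / norm for c in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


def summarize_training(snapshots):
    """Concatenate the per-layer mean centrality of every epoch snapshot."""
    features = []
    for weights in snapshots:
        adj, layers = mlp_to_adjacency(weights)
        cent = eigenvector_centrality(adj)
        features.extend(sum(cent[u] for u in layer) / len(layer) for layer in layers)
    return features


# Toy 2-3-1 MLP at two "epochs" (made-up weights, for illustration only).
epoch_0 = [[[0.1, -0.2], [0.3, 0.1], [-0.1, 0.2]], [[0.2, -0.3, 0.1]]]
epoch_1 = [[[0.5, -0.4], [0.6, 0.2], [-0.3, 0.7]], [[0.9, -0.8, 0.4]]]
summary = summarize_training([epoch_0, epoch_1])
print(len(summary))  # 2 epochs x 3 layers = 6 features
```

A feature vector like `summary`, computed from the first few epochs of many training runs, could then be passed to any standard regressor to predict final accuracy. Which regressor and which graph properties work best is not something this sketch addresses.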
