LG | OC | Mar 31, 2023

Analysis and Comparison of Two-Level KFAC Methods for Training Deep Neural Networks

arXiv:2303.18083v2 | h-index: 12
Originality: Synthesis-oriented
AI Analysis

This work targets the computational efficiency of training deep neural networks. Its contribution is incremental: it validates an existing approximation rather than introducing a method that outperforms it.

The study investigated whether restoring low-frequency interactions between layers through two-level corrections improves KFAC, a second-order optimization method for deep neural networks. It found no significant gain in performance, which supports the robustness of the block-diagonal approximation.

As a second-order method, Natural Gradient Descent (NGD) can accelerate the training of neural networks. However, because computing and inverting the Fisher Information Matrix (FIM) is prohibitively expensive in both computation and memory, efficient approximations are necessary to make NGD scalable to Deep Neural Networks (DNNs). Many such approximations have been attempted. The most sophisticated of these is KFAC, which approximates the FIM as a block-diagonal matrix, where each block corresponds to a layer of the neural network. By doing so, KFAC ignores the interactions between different layers. In this work, we investigate whether it is worthwhile to restore some low-frequency interactions between the layers by means of two-level methods. Inspired by domain decomposition, several two-level corrections to KFAC using different coarse spaces are proposed and assessed. The results show that incorporating the layer interactions in this fashion does not significantly improve the performance of KFAC. This suggests that it is safe to discard the off-diagonal blocks of the FIM, since the block-diagonal approach is sufficiently robust, accurate and economical in computation time.
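The structure described in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation, and it rests on several assumptions: a random symmetric positive-definite matrix stands in for a damped FIM, the per-layer blocks are inverted directly (KFAC itself further factorizes each block as a Kronecker product, which is omitted here), and the coarse space uses one constant vector per layer, which is only one possible choice among the several coarse spaces the abstract mentions. The correction is added in the classical additive two-level form from domain decomposition. All function names are hypothetical.

```python
import numpy as np

def random_spd(n, rng):
    """Random symmetric positive-definite matrix (assumed stand-in for a damped FIM)."""
    m = rng.standard_normal((n, n))
    return m @ m.T / n + 0.1 * np.eye(n)

def block_slices(layer_sizes):
    """Index range of each layer's parameters in the flattened parameter vector."""
    offsets = np.cumsum([0] + list(layer_sizes))
    return [slice(int(a), int(b)) for a, b in zip(offsets[:-1], offsets[1:])]

def block_diag_solve(F, g, slices):
    """One-level step: apply the inverse of the block-diagonal part of F to g.

    Off-diagonal blocks (interactions between layers) are ignored.
    """
    out = np.zeros_like(g)
    for s in slices:
        out[s] = np.linalg.solve(F[s, s], g[s])
    return out

def coarse_correction(F, g, slices):
    """Additive coarse term R0^T (R0 F R0^T)^{-1} R0 g.

    R0 has one row per layer, equal to 1 on that layer's parameters
    (an assumed coarse space, chosen for simplicity).
    """
    R0 = np.zeros((len(slices), F.shape[0]))
    for i, s in enumerate(slices):
        R0[i, s] = 1.0
    coarse_F = R0 @ F @ R0.T  # small matrix that couples the layers
    return R0.T @ np.linalg.solve(coarse_F, R0 @ g)

rng = np.random.default_rng(0)
layer_sizes = [4, 3, 2]                    # toy network with three layers
slices = block_slices(layer_sizes)
F = random_spd(sum(layer_sizes), rng)
g = rng.standard_normal(F.shape[0])        # stand-in for the loss gradient

one_level = block_diag_solve(F, g, slices)                # block-diagonal direction
two_level = one_level + coarse_correction(F, g, slices)   # with layer coupling
exact = np.linalg.solve(F, g)                             # full natural-gradient direction
```

Comparing `one_level` and `two_level` against `exact` on such a toy problem only shows how the two directions are assembled; it says nothing about training performance, which is what the paper evaluates.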

Code Implementations: 1 repo
