LG OC May 15, 2021

Bilevel Programs Meet Deep Learning: A Unifying View on Inference Learning Methods

arXiv:2105.07231v2, 6 citations
AI Analysis

This work provides a theoretical framework for understanding alternative training algorithms in deep learning. Its contribution is incremental: it unifies existing methods.

The paper unifies various inference learning methods by reformulating them as bilevel optimization programs, shows that they all include error back-propagation as a special case and therefore at least approximate it in typical settings, and introduces Fenchel back-propagation, a new method that uses finite targets as the learning signal.

In this work we unify a number of inference learning methods that have been proposed in the literature as alternatives to training algorithms based on regular error back-propagation. These inference learning methods were developed with very diverse motivations, mainly aiming to enhance the biological plausibility of deep neural networks and to improve the intrinsic parallelism of training methods. We show that these superficially very different methods can all be obtained by successively applying a particular reformulation of bilevel optimization programs. As a by-product, it also becomes evident that all considered inference learning methods include back-propagation as a special case and therefore at least approximate error back-propagation in typical settings. Finally, we propose Fenchel back-propagation, which replaces the propagation of infinitesimal corrections performed in standard back-propagation with finite targets as the learning signal. Fenchel back-propagation can therefore be seen as an instance of learning via explicit target propagation.
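
For orientation, a generic bilevel program and one way to write a feed-forward network in that form might look as follows. This is a minimal sketch and not necessarily the paper's exact formulation: the symbols $\theta$ (weights), $\ell$ (training loss), $E$ (lower-level objective), $f_k$ (layer maps), $z_k$ (layer activations), $L$ (number of layers), and $x$ (network input) are assumptions introduced here for illustration, as is the quadratic inner objective.

```latex
% Generic bilevel program: the outer problem depends on the solution of an inner one.
\min_{\theta} \; \ell\bigl(z^*(\theta)\bigr)
\quad \text{s.t.} \quad
z^*(\theta) = \arg\min_{z} E(z; \theta)

% A feed-forward network as nested minimization (illustrative inner objective):
\min_{\theta} \; \ell(z_L)
\quad \text{s.t.} \quad
z_k = \arg\min_{z} \tfrac{1}{2} \bigl\| z - f_k(z_{k-1}; \theta_k) \bigr\|^2,
\quad k = 1, \dots, L, \quad z_0 = x
```

Each inner problem in the second program has the unique minimizer $z_k = f_k(z_{k-1}; \theta_k)$, so the constraints simply restate the forward pass. The abstract's claim is that the inference learning methods it considers arise from successively applying a particular reformulation to bilevel programs.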
