LG, AI, NE, ML | May 18, 2018

Dynamic learning rate using Mutual Information

arXiv:1805.07249v2
AI Analysis

This paper addresses hyper-parameter tuning for deep learning practitioners, but the contribution is incremental: it applies an existing concept (mutual information) to a known bottleneck (learning rate setting).

The paper tackles the problem of setting learning rates in deep neural network training. It uses the mutual information between the output layer and the true outcomes to adjust the learning rate dynamically, and reports competitive or better outcomes in competitive or better training time.

This paper demonstrates dynamic hyper-parameter setting for deep neural network training using Mutual Information (MI). The specific hyper-parameter studied is the learning rate. MI between the output layer and the true outcomes is used to dynamically set the learning rate of the network through the training cycle; the idea is also extended to layer-wise setting of the learning rate. Two approaches are demonstrated: tracking the relative change in mutual information and, additionally, tracking its value relative to a reference measure. The paper does not attempt to recommend a specific learning rate policy. Experiments demonstrate that mutual information may be effectively used to dynamically set the learning rate and achieve competitive or better outcomes in competitive or better time.
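To make the first approach concrete, the following is a minimal sketch of tracking the relative change in MI between predicted and true labels and using it to adjust the learning rate. It is not the paper's implementation. The function names (`mutual_information`, `adjust_learning_rate`), the plug-in MI estimate over discrete class labels, and the `threshold`, `decay`, and `lr_min` parameters are all assumptions introduced here for illustration; the abstract states that the paper itself does not recommend a specific policy.

```python
import math
from collections import Counter


def mutual_information(predicted, actual):
    """Plug-in estimate of mutual information (in nats) between two
    equal-length sequences of discrete labels."""
    n = len(predicted)
    joint = Counter(zip(predicted, actual))
    count_pred = Counter(predicted)
    count_true = Counter(actual)
    mi = 0.0
    for (p, t), count in joint.items():
        # p(p, t) * log( p(p, t) / (p(p) * p(t)) ), written with raw counts
        mi += (count / n) * math.log(count * n / (count_pred[p] * count_true[t]))
    return mi


def adjust_learning_rate(lr, mi_prev, mi_curr,
                         threshold=0.01, decay=0.5, lr_min=1e-5):
    """Hypothetical policy (an assumption, not taken from the paper):
    shrink the learning rate when the relative gain in MI between two
    consecutive checks falls below `threshold`."""
    if mi_prev <= 0.0:
        # No meaningful baseline yet, so leave the learning rate unchanged.
        return lr
    relative_change = (mi_curr - mi_prev) / mi_prev
    if relative_change < threshold:
        return max(lr * decay, lr_min)
    return lr


# Usage: compare MI at two consecutive checkpoints of a toy classifier.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
preds_epoch_1 = [0, 1, 0, 0, 0, 1, 1, 1]   # two mistakes
preds_epoch_2 = [0, 1, 0, 1, 0, 1, 1, 1]   # one mistake
mi_1 = mutual_information(preds_epoch_1, labels)
mi_2 = mutual_information(preds_epoch_2, labels)
new_lr = adjust_learning_rate(0.1, mi_1, mi_2)
```

In a real training loop the predictions would come from the network's output layer on a held-out or training batch, and the returned value would be written back to the optimizer. A layer-wise variant would keep one learning rate and one MI history per layer.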

