OC | LG | ML | Dec 17, 2016

Mutual information for fitting deep nonlinear models

arXiv:1612.05708v1 | 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses a fundamental problem in deep learning that matters to both researchers and practitioners, but it appears incremental: it builds on existing information-theoretic concepts without a major breakthrough.

The authors tackle the challenge of fitting deep nonlinear models without knowledge of the hidden layer by using mutual information and KL divergence as objective functions. They find mutual information largely successful, depending on the parameters, and KL divergence similarly successful when some statistics of the hidden layer are known.

Deep nonlinear models pose a challenge for fitting parameters due to lack of knowledge of the hidden layer and the potentially non-affine relation of the initial and observed layers. In the present work we investigate the use of information-theoretic measures such as mutual information and Kullback-Leibler (KL) divergence as objective functions for fitting such models without knowledge of the hidden layer. We investigate one model as a proof of concept and one application to cognitive performance. We further investigate the use of optimizers with these methods. Mutual information is largely successful as an objective, depending on the parameters. KL divergence is found to be similarly successful, given some knowledge of the statistics of the hidden layer.
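
As a rough illustration of the idea in the abstract, the sketch below uses a histogram estimate of mutual information as the objective for fitting one parameter of a toy nonlinear model. It is not the authors' implementation. The toy model (two Gaussian inputs mixed by a weight `w`, then passed through `tanh` with additive noise), the helper names `simulate`, `mutual_information`, and `fit_mixing_weight`, the bin count, and the grid search are all assumptions chosen for illustration. The fitting step never evaluates the hidden nonlinearity: it only scores how much information the candidate hidden value `x1 + w * x2` carries about the observed output.

```python
import numpy as np


def simulate(n=20000, w_true=1.5, noise=0.05, seed=0):
    """Toy data (assumed model): hidden = x1 + w_true * x2, observed = tanh(hidden) + noise."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)
    hidden = x1 + w_true * x2
    y = np.tanh(hidden) + noise * rng.normal(size=n)
    return x1, x2, y


def mutual_information(a, b, bins=32):
    """Plug-in histogram estimate of I(A; B) in nats."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pab = joint / joint.sum()                 # joint probabilities
    pa = pab.sum(axis=1, keepdims=True)       # marginal of A, shape (bins, 1)
    pb = pab.sum(axis=0, keepdims=True)       # marginal of B, shape (1, bins)
    mask = pab > 0                            # skip empty cells to avoid log(0)
    return float(np.sum(pab[mask] * np.log(pab[mask] / (pa @ pb)[mask])))


def fit_mixing_weight(x1, x2, y, candidates, bins=32):
    """Grid search: pick the weight whose candidate hidden value is most informative about y."""
    scores = [mutual_information(x1 + w * x2, y, bins) for w in candidates]
    best = int(np.argmax(scores))
    return float(candidates[best]), scores


if __name__ == "__main__":
    x1, x2, y = simulate()
    grid = np.linspace(0.0, 3.0, 31)
    w_hat, _ = fit_mixing_weight(x1, x2, y, grid)
    print("estimated mixing weight:", w_hat)
```

Mutual information is unchanged by invertible transformations of either variable, so the objective can reward the correct mixing weight without modelling the `tanh` stage. The histogram estimator is biased and sensitive to the bin count, which is one simple way in which success can depend on the chosen parameters.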

