Maximally Informative Hierarchical Representations of High-Dimensional Data
The paper provides a principled and practical approach to unsupervised deep representation learning. The contribution is incremental in that it builds on existing bounds and optimization techniques.
The paper tackles the problem of quantifying and optimizing the informativeness of hierarchical representations in unsupervised learning. The resulting method has linear computational complexity and constant sample complexity in the number of variables.
We consider a set of probabilistic functions of some input variables as a representation of the inputs. We present bounds on how informative a representation is about input data. We extend these bounds to hierarchical representations so that we can quantify the contribution of each layer towards capturing the information in the original data. The special form of these bounds leads to a simple, bottom-up optimization procedure to construct hierarchical representations that are also maximally informative about the data. This optimization has linear computational complexity and constant sample complexity in the number of variables. These results establish a new approach to unsupervised learning of deep representations that is both principled and practical. We demonstrate the usefulness of the approach on both synthetic and real-world data.
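To make the notion of "informativeness" concrete, here is a minimal sketch that assumes informativeness is measured with total correlation, TC(X) = sum_i H(X_i) - H(X), and that the amount a representation Y explains is TC(X) - TC(X|Y). The abstract does not name this measure, so this choice, the function names, and the toy data are all illustrative assumptions rather than the paper's implementation. The estimates use plain empirical entropies of discrete samples.

```python
import numpy as np
from collections import Counter

def entropy(columns):
    """Empirical Shannon entropy (bits) of the joint distribution of the rows."""
    rows = [tuple(r) for r in columns]
    counts = np.array(list(Counter(rows).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def total_correlation(x):
    """TC(X) = sum_i H(X_i) - H(X) for a 2-D array of discrete samples."""
    marginals = sum(entropy(x[:, [i]]) for i in range(x.shape[1]))
    return marginals - entropy(x)

def conditional_total_correlation(x, y):
    """TC(X|Y) = sum_i H(X_i|Y) - H(X|Y), using H(A|Y) = H(A,Y) - H(Y)."""
    y = y.reshape(len(y), -1)
    h_y = entropy(y)
    cond_marginals = sum(
        entropy(np.hstack([x[:, [i]], y])) - h_y for i in range(x.shape[1])
    )
    cond_joint = entropy(np.hstack([x, y])) - h_y
    return cond_marginals - cond_joint

# Toy data (hypothetical): three noiseless copies of one hidden binary factor z.
z = np.array([0, 1, 0, 1, 1, 0, 0, 1])
x = np.column_stack([z, z, z])

tc = total_correlation(x)                                # dependence among the inputs
tc_explained = tc - conditional_total_correlation(x, z)  # part accounted for by z
```

In this toy case the single factor z accounts for all of the dependence among the three inputs, so the explained amount equals the full total correlation. Empirical entropy estimates like these are only reliable for small discrete alphabets with enough samples.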