LG, ML · Dec 12, 2017

Transportation analysis of denoising autoencoders: a novel method for analyzing deep neural networks

arXiv:1712.04145v1 · 2.63 citations

Originality: Incremental advance

AI Analysis

The paper provides a theoretical framework for understanding deep neural networks, which often lack an analytical explanation. The advance is incremental because it builds on existing concepts such as transport maps.

The paper analyzes deep neural networks by studying denoising autoencoders (DAEs) through their transportation dynamics. It shows that infinitely deep DAEs transport mass so as to decrease quantities such as the entropy of the data distribution, which links them to Wasserstein gradient flows.

We investigate the feature map obtained from the denoising autoencoder (DAE), a cornerstone of deep learning, by determining the transportation dynamics of the DAE. Despite the rapid development of their applications, deep neural networks remain analytically unexplained, because the feature maps are nested and the parameters are not faithful. In this paper, we address the problem of formulating this nested complex of parameters by regarding the feature map as a transport map. Even when a feature map has different input and output dimensions, we can regard it as a transport map by considering both the input and output spaces to be embedded in a common high-dimensional space. In addition, the trajectory is a geometric object and is thus independent of the parameterization. In this sense, transportation can be regarded as a universal characteristic of deep neural networks. By determining and analyzing the transportation dynamics, we can understand the behavior of a deep neural network. We investigate a fundamental case of deep neural networks: the DAE. We derive the transport map of the DAE and reveal that the infinitely deep DAE transports mass so as to decrease a certain quantity, such as the entropy, of the data distribution. These results, though analytically simple, shed light on the correspondence between deep neural networks and Wasserstein gradient flows.
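The entropy-decreasing behavior described in the abstract can be illustrated with a minimal sketch that is not taken from the paper. It assumes one-dimensional data distributed as N(0, var) and Gaussian corruption noise of variance t. Under those assumptions the optimal denoiser is the well-known map g(x) = x + t * d/dx log p_t(x), where p_t = N(0, var + t) is the noisy density, and it reduces to the linear map g(x) = x * var / (var + t). The function names `dae_step`, `gaussian_entropy`, and `compose` are hypothetical, and the assumption that each layer is refit to the distribution produced by the previous layer is made here for illustration.

```python
import math


def dae_step(var, t):
    """Variance of the data after one optimal Gaussian DAE step.

    Assumption: data ~ N(0, var) in 1-D, noise variance t.
    The optimal denoiser g(x) = x * var / (var + t) is linear,
    so pushing N(0, var) through it gives N(0, var * scale**2).
    """
    scale = var / (var + t)
    return var * scale ** 2


def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with variance var."""
    return 0.5 * math.log(2 * math.pi * math.e * var)


def compose(var, t, depth):
    """Apply `depth` DAE steps, refitting each step to the current variance."""
    history = [var]
    for _ in range(depth):
        var = dae_step(var, t)
        history.append(var)
    return history


variances = compose(var=1.0, t=0.1, depth=5)
entropies = [gaussian_entropy(v) for v in variances]
for layer, (v, h) in enumerate(zip(variances, entropies)):
    print(f"layer {layer}: variance={v:.4f}, entropy={h:.4f}")
```

In this toy setting every layer contracts the distribution toward its mode, so both the variance and the entropy shrink monotonically with depth. This is only a Gaussian special case of the mass transport that the abstract describes.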
