ML LG CA Jun 25, 2015

Diffusion Nets

Gal Mishne, Uri Shaham, Alexander Cloninger, Israel Cohen

arXiv:1506.07840v1 · 18.463 citations · h-index: 50

Originality: Incremental advance

AI Analysis

The paper addresses the need for efficient manifold learning methods in high-dimensional data analysis and offers improvements for tasks such as outlier detection. It appears incremental because it builds on existing autoencoder and manifold learning concepts.

The paper tackles the problem of out-of-sample-extension in non-linear manifold learning by proposing a deep learning-based autoencoder called diffusion net, which efficiently maps high-dimensional data to low-dimensional embeddings and back, with proven convergence rates and reduced computational and memory requirements compared to previous methods.

Non-linear manifold learning enables high-dimensional data analysis, but requires out-of-sample-extension methods to process new data points. In this paper, we propose a manifold learning algorithm based on deep learning to create an encoder, which maps a high-dimensional dataset to its low-dimensional embedding, and a decoder, which takes the embedded data back to the high-dimensional space. Stacking the encoder and decoder together constructs an autoencoder, which we term a diffusion net, that performs out-of-sample-extension as well as outlier detection. We introduce new neural net constraints for the encoder, which preserve the local geometry of the points, and we prove rates of convergence for the encoder. Also, our approach is efficient in both computational complexity and memory requirements, as opposed to previous methods that require storage of all training points in both the high-dimensional and the low-dimensional spaces to calculate the out-of-sample-extension and the pre-image.
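
The encoder/decoder data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the tanh activation, the random (untrained) weights, and the helper names `init_mlp`, `mlp`, `encode`, `decode`, `diffusion_net`, and `outlier_score` are all assumptions made for illustration. Training, which would fit the encoder to a precomputed low-dimensional embedding under the paper's geometry-preserving constraints, is omitted.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a small fully connected net (hypothetical helper)."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh hidden layers, linear output layer (assumed design)."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def encode(enc, x):
    """Out-of-sample extension: map new high-dimensional points to the embedding."""
    return mlp(enc, x)

def decode(dec, z):
    """Pre-image: map embedded points back to the high-dimensional space."""
    return mlp(dec, z)

def diffusion_net(enc, dec, x):
    """Stacked encoder and decoder, i.e. the autoencoder."""
    return decode(dec, encode(enc, x))

def outlier_score(enc, dec, x):
    """Reconstruction error per point; large values would flag outliers
    once the networks are trained (here the weights are untrained)."""
    return np.linalg.norm(x - diffusion_net(enc, dec, x), axis=1)

rng = np.random.default_rng(0)
enc = init_mlp([10, 16, 2], rng)    # 10-D input, 2-D embedding (assumed sizes)
dec = init_mlp([2, 16, 10], rng)
x_new = rng.normal(size=(5, 10))    # five new points, not seen in training
z_new = encode(enc, x_new)          # embedding of the new points
scores = outlier_score(enc, dec, x_new)
```

Note that processing `x_new` touches only the network weights. No training points are stored or consulted, which is the memory advantage the abstract contrasts with previous out-of-sample-extension and pre-image methods.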
