Scalable Out-of-Sample Extension of Graph Embeddings Using Deep Neural Networks
The work addresses a scalability bottleneck for practitioners who use graph embeddings in machine learning, though the contribution is incremental because it builds on existing embedding techniques.
The paper tackles the problem of computationally expensive out-of-sample extension for graph embeddings by using deep neural networks to approximate the nonlinear embedding maps. Compared with traditional nonparametric out-of-sample extension methods, the networks achieve equal or better fidelity and require orders of magnitude less computation at test time.
Several popular graph embedding techniques for representation learning and dimensionality reduction rely on performing computationally expensive eigendecompositions to derive a nonlinear transformation of the input data space. The resulting eigenvectors encode the embedding coordinates for the training samples only, and so the embedding of novel data samples requires further costly computation. In this paper, we present a method for the out-of-sample extension of graph embeddings using deep neural networks (DNNs) to parametrically approximate these nonlinear maps. Compared with traditional nonparametric out-of-sample extension methods, we demonstrate that the DNNs can generalize with equal or better fidelity and require orders of magnitude less computation at test time. Moreover, we find that unsupervised pretraining of the DNNs improves optimization for larger network sizes, thus removing sensitivity to model selection.
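The general idea of a parametric out-of-sample extension can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Laplacian-eigenmaps-style embedding on a Gaussian affinity graph, a single-hidden-layer network trained by plain full-batch gradient descent, and no unsupervised pretraining. The function names (`spectral_embedding`, `fit_mlp`, `embed`) and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def spectral_embedding(X, sigma=1.0, dim=2):
    """Embed the training samples via an eigendecomposition (illustrative choice)."""
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.exp(-sq / (2.0 * sigma ** 2))                 # Gaussian affinity graph
    d = W.sum(axis=1)                                    # node degrees
    L = np.eye(n) - W / np.sqrt(np.outer(d, d))          # normalized graph Laplacian
    _, vecs = np.linalg.eigh(L)                          # costly step, training set only
    # Skip the trivial first eigenvector; rescale so targets are O(1)
    # (the rescaling is an assumption made here to ease training).
    return vecs[:, 1:dim + 1] * np.sqrt(n)

def fit_mlp(X, Y, hidden=16, lr=0.01, epochs=500, seed=0):
    """Fit a one-hidden-layer tanh network to regress inputs onto embedding coordinates."""
    rng = np.random.default_rng(seed)
    n, d_in = X.shape
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        P = H @ W2 + b2                     # predicted embedding coordinates
        G = 2.0 * (P - Y) / n               # gradient of mean squared error w.r.t. P
        GH = (G @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
        W2 -= lr * (H.T @ G)
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * (X.T @ GH)
        b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def embed(params, X_new):
    """Out-of-sample extension: a single forward pass, no eigendecomposition."""
    W1, b1, W2, b2 = params
    return np.tanh(X_new @ W1 + b1) @ W2 + b2

# Usage: embed training data once, fit the network, then embed unseen samples.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 3))
Y_train = spectral_embedding(X_train)
params = fit_mlp(X_train, Y_train)
Y_new = embed(params, rng.normal(size=(5, 3)))  # array of shape (5, 2)
```

In this sketch the cost of embedding a new sample depends only on the network size, not on the number of training samples. A nonparametric extension such as the Nyström method would instead need affinity evaluations against the training set.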