Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings
This addresses the computational bottleneck for researchers and practitioners using embedding algorithms, offering a scalable and efficient solution, though it is incremental as it builds on existing methods.
The paper tackles the slow training of stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms by proposing partial-Hessian optimization strategies, achieving up to two orders of magnitude speedup with a spectral direction method that adds minimal overhead.
Stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms achieve high-quality low-dimensional representations of similarity data, but are notoriously slow to train. We propose a generic formulation of embedding algorithms that includes SNE and other existing algorithms, and study their relation with spectral methods and graph Laplacians. This allows us to define several partial-Hessian optimization strategies, characterize their global and local convergence, and evaluate them empirically. We achieve up to two orders of magnitude speedup over existing training methods with a strategy (which we call the spectral direction) that adds nearly no overhead to the gradient and yet is simple, scalable and applicable to several existing and future embedding algorithms.