LG OC ML

Dec 1, 2019

Fast Stochastic Ordinal Embedding with Variance Reduction and Adaptive Step Size

Ke Ma, Jinshan Zeng, Qianqian Xu, Xiaochun Cao, Wei Liu, Yuan Yao

arXiv:1912.00362v1

1 citation

Originality: Incremental advance

AI Analysis

This work addresses efficient large-scale ordinal embedding for machine learning practitioners, offering a faster alternative to SDP-based methods. Its contribution is an incremental improvement in computational efficiency.

The paper tackles the scalability issue in ordinal embedding by proposing a stochastic algorithm (SVRG-SBB) that drops PSD constraints and uses adaptive step sizes, achieving an O(1/T) convergence rate and much lower computational cost with good performance compared to state-of-the-art methods.

Learning representations from relative similarity comparisons, often called ordinal embedding, has gained increasing attention in recent years. Most existing methods are based on semi-definite programming (\textit{SDP}), which is generally time-consuming and scales poorly, especially on large-scale data. To overcome this challenge, we propose a stochastic algorithm called \textit{SVRG-SBB}, which has the following features: i) good scalability, achieved by dropping the positive semi-definite (\textit{PSD}) constraints and using a fast algorithm, namely the stochastic variance reduced gradient (\textit{SVRG}) method, and ii) adaptive learning, achieved by introducing a new adaptive step size called the stabilized Barzilai-Borwein (\textit{SBB}) step size. Theoretically, under some natural assumptions, we show that the proposed algorithm converges to a stationary point at a rate of $\boldsymbol{O}(\frac{1}{T})$, where $T$ is the total number of iterations. Under the further Polyak-Łojasiewicz assumption, we show global linear convergence (i.e., exponentially fast convergence to a global optimum) of the proposed algorithm. Numerous simulations and real-world data experiments demonstrate the effectiveness of the proposed algorithm in comparison with state-of-the-art methods; notably, it achieves much lower computational cost with good prediction performance.
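To make the two ingredients in the abstract concrete, here is a minimal Python sketch of SVRG applied directly to the embedding matrix (so no PSD constraint is ever enforced), with a Barzilai-Borwein-style step size recomputed at each outer epoch. Several pieces are assumptions rather than details taken from the abstract: the logistic triplet loss, the function names `loss_and_grad` and `svrg_sbb_sketch`, the parameters `eta0` and `eps`, and in particular the step-size formula, which uses an absolute value and an `eps`-weighted term in the denominator as one plausible way to "stabilize" the BB ratio. Treat it as an illustration of the idea, not as the authors' implementation.

```python
import numpy as np


def loss_and_grad(X, triplets):
    """Mean logistic loss and its gradient w.r.t. the embedding X.

    Each row (i, j, k) of `triplets` encodes "i is closer to j than to k".
    The logistic loss is an illustrative choice, not taken from the abstract.
    """
    i, j, k = triplets[:, 0], triplets[:, 1], triplets[:, 2]
    d_ij = X[i] - X[j]
    d_ik = X[i] - X[k]
    # z > 0 means the triplet is violated (i is farther from j than from k).
    z = (d_ij ** 2).sum(axis=1) - (d_ik ** 2).sum(axis=1)
    loss = np.logaddexp(0.0, z).mean()           # log(1 + exp(z)), overflow-safe
    w = 0.5 * (1.0 + np.tanh(0.5 * z))           # sigmoid(z), overflow-safe
    w = (w / len(triplets))[:, None]
    grad = np.zeros_like(X)
    # np.add.at accumulates correctly when an index appears in several triplets.
    np.add.at(grad, i, 2.0 * w * (d_ij - d_ik))
    np.add.at(grad, j, -2.0 * w * d_ij)
    np.add.at(grad, k, 2.0 * w * d_ik)
    return loss, grad


def svrg_sbb_sketch(X0, triplets, epochs=20, m=None, eta0=0.01, eps=0.1, seed=0):
    """SVRG on the embedding with an assumed stabilized BB step size."""
    rng = np.random.default_rng(seed)
    m = m or len(triplets)                       # inner-loop length (assumption)
    X_snap = np.array(X0, dtype=float)
    eta = eta0
    prev_snap = prev_grad = None
    history = []
    for _ in range(epochs):
        loss, full_grad = loss_and_grad(X_snap, triplets)
        history.append(loss)
        if prev_snap is not None:
            dX = X_snap - prev_snap
            dG = full_grad - prev_grad
            sq = np.sum(dX * dX)
            # Assumed stabilized BB rule: the absolute value keeps the step
            # positive, and eps * ||dX||^2 bounds it above by 1 / (m * eps).
            denom = m * (abs(np.sum(dX * dG)) + eps * sq)
            if denom > 0:
                eta = sq / denom
        prev_snap, prev_grad = X_snap.copy(), full_grad
        X = X_snap.copy()
        for _ in range(m):
            t = triplets[rng.integers(len(triplets))][None, :]
            _, g_new = loss_and_grad(X, t)
            _, g_old = loss_and_grad(X_snap, t)
            # Variance-reduced stochastic gradient step.
            X -= eta * (g_new - g_old + full_grad)
        X_snap = X
    return X_snap, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(30, 2))             # hypothetical ground-truth points
    cand = rng.integers(30, size=(300, 3))
    d = lambda a, b: ((truth[a] - truth[b]) ** 2).sum(axis=1)
    # Swap j and k where needed so every triplet agrees with the ground truth.
    swap = d(cand[:, 0], cand[:, 1]) > d(cand[:, 0], cand[:, 2])
    cand[swap, 1], cand[swap, 2] = cand[swap, 2], cand[swap, 1]
    X, hist = svrg_sbb_sketch(0.1 * rng.normal(size=(30, 2)), cand)
    print("loss per epoch:", np.round(hist, 4))
```

Working on the embedding `X` itself rather than on a Gram matrix is what removes the PSD constraint in this sketch; the price is a nonconvex objective, which is consistent with the abstract stating convergence to a stationary point unless the Polyak-Łojasiewicz condition holds.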
