NA MLOct 1, 2016

Convergence of a Grassmannian Gradient Descent Algorithm for Subspace Estimation From Undersampled Data

arXiv:1610.00199v35.13 citations

Originality Incremental advance

AI Analysis

This work addresses efficient subspace learning for large datasets, but it is incremental as it builds on existing non-convex optimization methods with specific convergence guarantees.

The paper tackles the problem of subspace estimation from streaming data, proposing a Grassmannian gradient descent algorithm with an adaptive stepsize scheme, and proves convergence to the true subspace for fully sampled data and monotonic expected improvement for undersampled data.

Subspace learning and matrix factorization problems have great many applications in science and engineering, and efficient algorithms are critical as dataset sizes continue to grow. Many relevant problem formulations are non-convex, and in a variety of contexts it has been observed that solving the non-convex problem directly is not only efficient but reliably accurate. We discuss convergence theory for a particular method: first order incremental gradient descent constrained to the Grassmannian. The output of the algorithm is an orthonormal basis for a $d$-dimensional subspace spanned by an input streaming data matrix. We study two sampling cases: where each data vector of the streaming matrix is fully sampled, or where it is undersampled by a sampling matrix $A_t\in \mathbb{R}^{m\times n}$ with $m\ll n$. Our results cover two cases, where $A_t$ is Gaussian or a subset of rows of the identity matrix. We propose an adaptive stepsize scheme that depends only on the sampled data and algorithm outputs. We prove that with fully sampled data, the stepsize scheme maximizes the improvement of our convergence metric at each iteration, and this method converges from any random initialization to the true subspace, despite the non-convex formulation and orthogonality constraints. For the case of undersampled data, we establish monotonic expected improvement on the defined convergence metric for each iteration with high probability.

View on arXiv PDF

Similar