LG OC MLFeb 1, 2025

Efficient Over-parameterized Matrix Sensing from Noisy Measurements via Alternating Preconditioned Gradient Descent

Zhiyu Liu, Zhi Han, Yandong Tang, Shaojie Tang, Yao Wang

arXiv:2502.00463v313.05 citationsh-index: 6

Originality Incremental advance

AI Analysis

This work addresses the challenge of tuning sensitivity in preconditioning methods for low-rank matrix estimation, offering a more efficient solution for researchers and practitioners in machine learning and signal processing.

The paper tackles the noisy matrix sensing problem in an over-parameterized setting by proposing the alternating preconditioned gradient descent (APGD) algorithm, which eliminates the need for a damping parameter and allows larger step sizes, achieving near-optimal error with linear convergence and faster computation times in experiments.

We consider the noisy matrix sensing problem in the over-parameterization setting, where the estimated rank $r$ is larger than the true rank $r_\star$ of the target matrix $X_\star$. Specifically, our main objective is to recover a matrix $ X_\star \in \mathbb{R}^{n_1 \times n_2} $ with rank $ r_\star $ from noisy measurements using an over-parameterized factorization $ LR^\top $, where $ L \in \mathbb{R}^{n_1 \times r}, \, R \in \mathbb{R}^{n_2 \times r} $ and $ \min\{n_1, n_2\} \ge r > r_\star $, with $ r_\star $ being unknown. Recently, preconditioning methods have been proposed to accelerate the convergence of matrix sensing problem compared to vanilla gradient descent, incorporating preconditioning terms $ (L^\top L + λI)^{-1} $ and $ (R^\top R + λI)^{-1} $ into the original gradient. However, these methods require careful tuning of the damping parameter $λ$ and are sensitive to step size. To address these limitations, we propose the alternating preconditioned gradient descent (APGD) algorithm, which alternately updates the two factor matrices, eliminating the need for the damping parameter $λ$ and enabling faster convergence with larger step sizes. We theoretically prove that APGD convergences to a near-optimal error at a linear rate. We further show that APGD can be extended to deal with other low-rank matrix estimation tasks, also with a theoretical guarantee of linear convergence. To validate the effectiveness and scalability of the proposed APGD, we conduct simulated and real-world experiments on a wide range of low-rank estimation problems, including noisy matrix sensing, weighted PCA, 1-bit matrix completion, and matrix completion. The extensive results demonstrate that APGD consistently achieves the fastest convergence and the lowest computation time compared to the existing alternatives.

View on arXiv PDF

Similar