Median Matrix Completion: from Embarrassment to Optimality
This work addresses a computational bottleneck in matrix completion for large-scale data, offering an incremental improvement over existing methods.
The paper tackles the computational challenge of median matrix completion for large-scale datasets by proposing a novel refinement step that converts inefficient parallel estimators into a rate (near-)optimal procedure, with empirical results confirming its effectiveness.
In this paper, we consider matrix completion with absolute deviation loss and obtain an estimator of the median matrix. Despite several appealing properties of median, the non-smooth absolute deviation loss leads to computational challenge for large-scale data sets which are increasingly common among matrix completion problems. A simple solution to large-scale problems is parallel computing. However, embarrassingly parallel fashion often leads to inefficient estimators. Based on the idea of pseudo data, we propose a novel refinement step, which turns such inefficient estimators into a rate (near-)optimal matrix completion procedure. The refined estimator is an approximation of a regularized least median estimator, and therefore not an ordinary regularized empirical risk estimator. This leads to a non-standard analysis of asymptotic behaviors. Empirical results are also provided to confirm the effectiveness of the proposed method.