Collective Matrix Completion
This work addresses the challenge of handling multi-source data in matrix completion. The contribution is incremental in that it extends single-matrix methods to the collective setting.
The paper tackles the problem of collective matrix completion for multiple heterogeneous data matrices, proposing estimators that minimize a goodness-of-fit term plus nuclear norm penalization, and proves they achieve fast convergence rates in both exponential family and distribution-free settings, supported by numerical experiments.
Matrix completion aims to reconstruct a data matrix based on observations of a small number of its entries. Usually in matrix completion a single matrix is considered, which can be, for example, a rating matrix in a recommendation system. However, in practical situations, data is often obtained from multiple sources, which results in a collection of matrices rather than a single one. In this work, we consider the problem of collective matrix completion with multiple and heterogeneous matrices, which can be count, binary, continuous, etc. We first investigate the setting where, for each source, the matrix entries are sampled from an exponential family distribution. Then, we relax the exponential family assumption on the noise and investigate the distribution-free case, in which we do not assume any specific model for the observations. The estimation procedures are based on minimizing the sum of a goodness-of-fit term and a nuclear norm penalty on the whole collective matrix. We prove that the proposed estimators achieve fast rates of convergence under the two considered settings, and we corroborate our results with numerical experiments.
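To make the estimation principle concrete, the following is a minimal sketch of nuclear-norm-penalized collective completion, not the procedure used in the paper. It assumes two sources concatenated column-wise, one Gaussian and one Bernoulli, a masked negative log-likelihood as the goodness-of-fit term, and a plain proximal gradient solver with unit step size. The names `svt`, `collective_complete`, `is_binary`, and `lam` are hypothetical and chosen for illustration only.

```python
import numpy as np


def svt(M, tau):
    """Soft-threshold the singular values of M by tau.

    This is the proximal operator of tau * (nuclear norm).
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def collective_complete(Y, mask, is_binary, lam, n_iter=200):
    """Proximal gradient on masked negative log-likelihood + lam * nuclear norm.

    Y         : (n, p) collective matrix, sources concatenated column-wise
    mask      : (n, p) boolean array, True where an entry is observed
    is_binary : (p,) boolean array, True for columns of the Bernoulli source,
                False for columns of the Gaussian source (an assumption of
                this sketch; other exponential families are not handled)
    lam       : nuclear norm penalty level
    Returns the estimated matrix of natural parameters.
    """
    X = np.zeros(Y.shape, dtype=float)
    for _ in range(n_iter):
        # Mean of each entry given its natural parameter:
        # sigmoid for Bernoulli columns, identity for Gaussian columns.
        mean = np.where(is_binary, 1.0 / (1.0 + np.exp(-X)), X)
        # Gradient of the negative log-likelihood over observed entries only.
        grad = mask * (mean - Y)
        # Unit step is valid here because the gradient is 1-Lipschitz
        # (curvature 1 for Gaussian, at most 1/4 for Bernoulli).
        X = svt(X - grad, lam)
    return X


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p1, p2, rank = 30, 10, 10, 2
    # Shared low-rank parameter matrix across both sources.
    theta = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, p1 + p2))
    is_binary = np.arange(p1 + p2) >= p1
    gauss = theta + 0.1 * rng.normal(size=theta.shape)
    bern = (rng.random(theta.shape) < 1.0 / (1.0 + np.exp(-theta))).astype(float)
    Y = np.where(is_binary, bern, gauss)
    mask = rng.random(Y.shape) < 0.5
    X_hat = collective_complete(Y, mask, is_binary, lam=1.0)
    print("estimated rank:", np.linalg.matrix_rank(X_hat, tol=1e-8))
```

Penalizing the nuclear norm of the whole concatenated matrix, rather than of each source separately, is what lets the low-rank structure be shared across sources in this sketch. The penalty level `lam=1.0` is an arbitrary illustrative value, not a tuned or theoretically derived one.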