Matrix Co-completion for Multi-label Classification with Missing Features and Labels
This work addresses a challenging multi-label classification scenario with missing data. The approach could benefit applications such as text or image tagging, though the contribution appears incremental because it builds on existing matrix completion techniques.
The paper tackles the problem of multi-label classification with missing features and labels by proposing a co-completion algorithm that assumes a latent matrix with low-rank structure, addressing limitations of existing linear methods and achieving improved performance in experiments.
We consider a challenging multi-label classification problem where both the feature matrix $\X$ and the label matrix $\Y$ have missing entries. An existing method concatenated $\X$ and $\Y$ as $[\X; \Y]$ and applied a matrix completion (MC) method to fill the missing entries, under the assumption that $[\X; \Y]$ is low-rank. However, since entries of $\Y$ take binary values in the multi-label setting, it is unlikely that $\Y$ is low-rank. Moreover, such an assumption implies a linear relationship between $\X$ and $\Y$, which may not hold in practice. In this paper, we consider a latent matrix $\Z$ that produces the probability $\sigma(Z_{ij})$ of generating label $Y_{ij}$, where $\sigma(\cdot)$ is nonlinear. Considering label correlation, we assume $[\X; \Z]$ is low-rank, and propose an MC algorithm based on subgradient descent named co-completion (COCO), motivated by elastic net and one-bit MC. We give a theoretical bound on the recovery effect of COCO and demonstrate its practical usefulness through experiments.
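The idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the objective, so the code assumes one plausible form, namely a squared loss on observed features, a logistic (one-bit MC style) loss on observed labels with $\sigma$ taken to be the sigmoid, and an elastic-net-like penalty combining the nuclear norm and the squared Frobenius norm of the stacked matrix. The function name `coco_sketch`, the parameters `lam_nuc`, `lam_fro`, `step`, and `iters`, and the constant step size are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(t):
    """Logistic link; assumed here as the nonlinear sigma."""
    return 1.0 / (1.0 + np.exp(-t))


def coco_sketch(X, Y, mask_x, mask_y, lam_nuc=0.1, lam_fro=0.01,
                step=0.01, iters=200):
    """Subgradient descent on a stacked matrix M = [X_hat; Z].

    X: (d, n) features, Y: (k, n) labels in {0, 1}.
    mask_x, mask_y: boolean arrays, True where an entry is observed.
    The objective below is an assumption, not taken from the paper.
    """
    d, n = X.shape
    k = Y.shape[0]
    M = np.zeros((d + k, n))
    eps = 1e-12  # guards the logarithms
    history = []
    for _ in range(iters):
        X_hat, Z = M[:d], M[d:]
        P = sigmoid(Z)
        U, s, Vt = np.linalg.svd(M, full_matrices=False)

        # Objective value: observed-entry losses plus both penalties.
        sq_loss = 0.5 * np.sum((X_hat - X)[mask_x] ** 2)
        log_lik = Y * np.log(P + eps) + (1 - Y) * np.log(1 - P + eps)
        log_loss = -np.sum(log_lik[mask_y])
        penalty = lam_nuc * s.sum() + 0.5 * lam_fro * np.sum(M ** 2)
        history.append(sq_loss + log_loss + penalty)

        # Gradient of the losses; unobserved entries contribute nothing.
        G = np.zeros_like(M)
        G[:d] = np.where(mask_x, X_hat - X, 0.0)
        G[d:] = np.where(mask_y, P - Y, 0.0)

        # U_r V_r^T is a subgradient of the nuclear norm at M.
        r = int(np.sum(s > 1e-10))
        G += lam_nuc * (U[:, :r] @ Vt[:r]) + lam_fro * M

        M = M - step * G
    return M[:d], sigmoid(M[d:]), history
```

Calling `coco_sketch(X, Y, mask_x, mask_y)` returns the completed feature block, the estimated label probabilities $\sigma(Z_{ij})$, and the objective value per iteration. Thresholding the probabilities (for example at 0.5) would give label predictions for the missing entries of $\Y$; that threshold is likewise an illustrative choice.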