Deep Collective Matrix Factorization for Augmented Multi-View Learning
The work addresses the need for better multi-view learning in domains such as bioinformatics and recommendation systems, though the contribution is incremental in that it extends CMF with deep learning.
The paper tackles the problem of integrating multiple heterogeneous data sources by developing dCMF, a deep-learning-based method for unsupervised learning of shared representations from arbitrary collections of matrices. In the reported experiments, dCMF significantly outperforms previous CMF algorithms and state-of-the-art matrix completion methods on tasks such as recommendation and gene-disease association prediction.
Learning by integrating multiple heterogeneous data sources is a common requirement in many tasks. Collective Matrix Factorization (CMF) is a technique for learning shared latent representations from arbitrary collections of matrices. It can be used to complete one or more matrices simultaneously, that is, to predict their unknown entries. Classical CMF methods assume that latent factors interact linearly, an assumption that can be restrictive and that fails to capture complex non-linear interactions. In this paper, we develop the first deep-learning-based method, called dCMF, for unsupervised learning of multiple shared representations from an arbitrary collection of matrices; unlike classical CMF, it can model such non-linear interactions. We address the optimization challenges that arise from dependencies between shared representations through Multi-Task Bayesian Optimization, and we design an acquisition function adapted for collective learning of hyperparameters. Our experiments show that dCMF significantly outperforms previous CMF algorithms in integrating heterogeneous data for predictive modeling. Further, on two tasks, recommendation and prediction of gene-disease association, dCMF outperforms state-of-the-art matrix completion algorithms that can utilize auxiliary sources of information.
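To make the classical, linear form of CMF concrete, the following is a minimal sketch, not the dCMF method of this paper. It assumes a toy setting with two matrices that share one entity type: a partially observed matrix X (say, users by items) and a fully observed side matrix Y (items by features). Both are factorized with a common item factor V, so information in Y helps complete X. The function name `linear_cmf`, the squared-error loss, the plain gradient-descent updates, and all hyperparameter values are illustrative assumptions.

```python
import numpy as np

def linear_cmf(X, M, Y, k=2, lr=0.01, reg=0.01, epochs=3000, seed=0):
    """Toy linear CMF (illustrative only, not dCMF).

    X : (n_rows, n_items) matrix, observed where mask M == 1
    Y : (n_items, n_feats) fully observed side matrix
    The item factor V is shared: X ~ U V^T and Y ~ V W^T.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_items = X.shape
    n_feats = Y.shape[1]
    U = 0.1 * rng.standard_normal((n_rows, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    W = 0.1 * rng.standard_normal((n_feats, k))
    for _ in range(epochs):
        Ex = M * (U @ V.T - X)   # residual on observed entries of X only
        Ey = V @ W.T - Y         # residual on all entries of Y
        gU = Ex @ V + reg * U
        gV = Ex.T @ U + Ey @ W + reg * V   # V receives gradient from both matrices
        gW = Ey.T @ V + reg * W
        U -= lr * gU
        V -= lr * gV
        W -= lr * gW
    return U, V, W

# Synthetic rank-2 data (assumed purely for illustration).
rng = np.random.default_rng(1)
U_true = rng.standard_normal((6, 2))
V_true = rng.standard_normal((5, 2))
W_true = rng.standard_normal((4, 2))
X = U_true @ V_true.T
Y = V_true @ W_true.T
M = (rng.random(X.shape) < 0.7).astype(float)  # roughly 70% of X observed

U, V, W = linear_cmf(X, M, Y)
X_hat = U @ V.T  # completed matrix, including the unobserved entries
```

In this sketch every prediction is an inner product of two latent vectors, which is the linearity assumption the abstract describes as restrictive. As described above, dCMF instead learns the shared representations with deep networks so that non-linear interactions can be modeled.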