LG · ML · Jan 13, 2016

Online Prediction of Dyadic Data with Heterogeneous Matrix Factorization

arXiv:1601.03124v1
Originality: Highly original
AI Analysis

This paper addresses the large-scale dyadic data prediction problem for applications such as recommendation systems, reporting incremental improvements in accuracy and robustness over existing methods.

The paper tackles the problem of predicting unobserved dyadic data by developing a Heterogeneous Matrix Factorization (HeMF) model that integrates discrete and continuous latent factor approaches. It reports better results than state-of-the-art methods on the EachMovie, MovieLens, and Netflix Prize collaborative filtering datasets.

Dyadic Data Prediction (DDP) is an important problem in many research areas. This paper develops a novel fully Bayesian nonparametric framework that integrates two popular and complementary approaches, discrete mixed membership modeling and continuous latent factor modeling, into a unified Heterogeneous Matrix Factorization (HeMF) model, which can accurately predict unobserved dyads. HeMF determines the number of communities automatically and efficiently exploits the latent linear structure of each bicluster. We propose a Variational Bayesian method to estimate the parameters and missing data. We further develop a novel online learning approach for variational inference and use it for the online learning of HeMF, which lets it cope efficiently with the important large-scale DDP problem. We evaluate the performance of our method on the EachMovie, MovieLens, and Netflix Prize collaborative filtering datasets. The experiments show that our model outperforms state-of-the-art methods on all benchmarks. Compared with the Stochastic Gradient Method (SGD), our online learning approach achieves significant improvements in estimation accuracy and robustness.
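For readers unfamiliar with the setting, the following is a minimal sketch of the kind of baseline the abstract compares against: plain low-rank matrix factorization fitted with stochastic gradient updates on observed (row, column, value) triples. It is not HeMF and not the authors' implementation. The function names, hyperparameters (`rank`, `lr`, `reg`, `epochs`), and the squared-error objective with L2 regularization are illustrative assumptions.

```python
import numpy as np


def sgd_matrix_factorization(triples, n_rows, n_cols, rank=2,
                             lr=0.02, reg=0.01, epochs=500, seed=0):
    """Fit value[i, j] ~= U[i] @ V[j] from observed (i, j, value) triples.

    Hypothetical baseline for illustration only: one stochastic gradient
    step per observed dyad, on a squared-error loss with L2 regularization.
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_rows, rank))  # row (e.g. user) factors
    V = 0.1 * rng.standard_normal((n_cols, rank))  # column (e.g. item) factors
    for _ in range(epochs):
        # Visit the observed dyads in a fresh random order each epoch.
        for idx in rng.permutation(len(triples)):
            i, j, value = triples[idx]
            err = value - U[i] @ V[j]
            u_old = U[i].copy()  # keep the pre-update row factor for V's step
            U[i] += lr * (err * V[j] - reg * U[i])
            V[j] += lr * (err * u_old - reg * V[j])
    return U, V


def rmse(triples, U, V):
    """Root-mean-square error of the factorization on the given triples."""
    errors = [value - U[i] @ V[j] for i, j, value in triples]
    return float(np.sqrt(np.mean(np.square(errors))))
```

Predicting an unobserved dyad `(i, j)` then amounts to evaluating `U[i] @ V[j]`. Per the abstract, HeMF replaces this single global factorization with a Bayesian nonparametric model that also assigns dyads to biclusters, and it is trained with online variational inference rather than SGD.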

