ML | LG | Apr 17, 2017

Bayesian Hybrid Matrix Factorisation for Data Integration

arXiv:1704.04962v1 | 10 citations
Originality: Incremental advance
AI Analysis

This work addresses data integration in biological applications such as drug sensitivity and genomics, with the aim of improving prediction accuracy. It appears incremental because it builds on existing matrix factorisation methods.

The authors tackle missing-value prediction in data integration by introducing a Bayesian hybrid matrix factorisation model. For in-matrix prediction on drug sensitivity datasets, the model consistently outperforms existing methods, especially as sparsity increases. For out-of-matrix prediction on methylation and gene expression datasets, it obtains the best results on two of the three datasets.

We introduce a novel Bayesian hybrid matrix factorisation model (HMF) for data integration, based on combining multiple matrix factorisation methods, that can be used for in- and out-of-matrix prediction of missing values. The model is very general and can be used to integrate many datasets across different entity types, including repeated experiments, similarity matrices, and very sparse datasets. We apply our method to two biological applications, and extensively compare it to state-of-the-art machine learning and matrix factorisation models. For in-matrix predictions on drug sensitivity datasets we obtain consistently better performance than existing methods. This is especially the case when we increase the sparsity of the datasets. Furthermore, we perform out-of-matrix predictions on methylation and gene expression datasets, and obtain the best results on two of the three datasets, especially when the predictivity of datasets is high.
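As a rough illustration of what in-matrix prediction of missing values involves, the sketch below factorises a single partially observed matrix and scores the held-out entries. It is not the HMF model from the abstract: it is a plain, non-Bayesian matrix factorisation fitted by ridge-regularised alternating least squares on synthetic data, and it integrates no additional datasets. The function name `masked_mf` and the parameters `rank`, `reg`, and `iters` are hypothetical choices made for this example.

```python
import numpy as np

def masked_mf(R, mask, rank=2, reg=0.1, iters=100, seed=0):
    """Fit R ~ U @ V.T using only the entries where mask is True.

    Illustrative ridge-regularised alternating least squares,
    not the Bayesian hybrid model described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, rank))
    V = rng.normal(scale=0.1, size=(m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Update each row factor from the columns observed in that row.
        for i in range(n):
            obs = mask[i]
            Vo = V[obs]
            U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ R[i, obs])
        # Update each column factor from the rows observed in that column.
        for j in range(m):
            obs = mask[:, j]
            Uo = U[obs]
            V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ R[obs, j])
    return U, V

# Synthetic rank-2 matrix (assumed data, for illustration only).
rng = np.random.default_rng(1)
R = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))

# Hide roughly 30% of the entries; these play the role of missing values.
mask = rng.random(R.shape) < 0.7

U, V = masked_mf(R, mask, rank=2, reg=0.1, iters=100)
pred = U @ V.T

train_mse = np.mean((pred[mask] - R[mask]) ** 2)
heldout_mse = np.mean((pred[~mask] - R[~mask]) ** 2)
print(f"train MSE {train_mse:.4f}, held-out MSE {heldout_mse:.4f}")
```

Lowering the observed fraction in `mask` makes the matrix sparser, which is the regime the abstract highlights. Out-of-matrix prediction is not covered by this sketch: a row with no observed entries gives a single-matrix factorisation nothing to fit, which is one reason a model might draw on additional datasets about the same entities.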

Code Implementations: 2 repos
