ML · LG · NA · GN · QM · ME · Feb 20, 2017

Bayesian Boolean Matrix Factorisation

arXiv:1702.06166v2 · 36 citations
Originality: Incremental advance
AI Analysis

This paper provides a novel method for Boolean matrix factorisation with full posterior inference, which improves the interpretability of the inferred patterns and helps control false positive rates in applications such as collaborative filtering. It is an incremental advance relative to existing factorisation techniques.

The authors address Boolean matrix factorisation by introducing the OrMachine, a probabilistic generative model with a Metropolised Gibbs sampler for efficient parallel posterior inference. They report that the method outperforms all currently existing approaches for Boolean matrix factorisation and completion on real-world and simulated data, and that it scales to single-cell gene expression data from 1.3 million mouse brain cells across 11,000 genes on commodity hardware.

Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low-rank binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for Boolean matrix factorisation, and derive a Metropolised Gibbs sampler that facilitates efficient parallel posterior inference. On real-world and simulated data, our method outperforms all currently existing approaches for Boolean matrix factorisation and completion. This is the first method to provide full posterior inference for Boolean matrix factorisation, which is relevant in applications, e.g. for controlling false positive rates in collaborative filtering, and, crucially, improves the interpretability of the inferred patterns. The proposed algorithm scales to large datasets, as we demonstrate by analysing single-cell gene expression data in 1.3 million mouse brain cells across 11 thousand genes on commodity hardware.
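
To make the "Boolean product" in the abstract concrete, here is a minimal NumPy sketch. It only illustrates the deterministic product that the factorisation approximates, not the OrMachine model or its sampler. The names `boolean_product`, `Z` (observation-by-pattern assignments) and `U` (pattern-by-feature codes), and the toy values, are assumptions chosen for illustration and do not come from the paper.

```python
import numpy as np

def boolean_product(Z, U):
    """Boolean matrix product: X[n, d] = OR over l of (Z[n, l] AND U[l, d])."""
    # The integer product counts how many patterns switch an entry on;
    # any count above zero means the logical OR is true.
    return (Z.astype(int) @ U.astype(int)) > 0

# Hypothetical toy data: 4 observations, 2 patterns, 5 features.
U = np.array([[1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1]], dtype=bool)
Z = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [0, 0]], dtype=bool)

X = boolean_product(Z, U)

# Flip one entry to mimic noise, then count the entries where the
# noisy data disagrees with the Boolean reconstruction.
X_noisy = X.copy()
X_noisy[0, 4] = ~X_noisy[0, 4]
n_errors = int(np.sum(X_noisy != boolean_product(Z, U)))
```

The third observation uses both patterns, so its row is the element-wise OR of the two rows of `U` rather than their sum. This is the point where the Boolean product differs from an ordinary matrix product.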

