Efficiently Factorizing Boolean Matrices using Proximal Gradient Descent
This work addresses the interpretability and efficiency of matrix factorization for Boolean data, which is particularly useful in domains such as medicine. The contribution is incremental in that it builds on existing Boolean Matrix Factorization (BMF) methods.
The paper tackles the high computational cost of Boolean Matrix Factorization (BMF) by proposing a continuous relaxation with an elastic-binary regularizer and a proximal gradient algorithm, resulting in improved recall, loss, and runtime on real-world data compared to state-of-the-art methods.
Addressing the interpretability problem of nonnegative matrix factorization (NMF) on Boolean data, Boolean Matrix Factorization (BMF) uses Boolean algebra to decompose the input into low-rank Boolean factor matrices. These matrices are highly interpretable and very useful in practice, but they come at the high computational cost of solving an NP-hard combinatorial optimization problem. To reduce the computational burden, we propose to relax BMF continuously using a novel elastic-binary regularizer, from which we derive a proximal gradient algorithm. Through an extensive set of experiments, we demonstrate that our method works well in practice: On synthetic data, we show that it converges quickly, recovers the ground truth precisely, and estimates the simulated rank exactly. On real-world data, we improve upon the state of the art in recall, loss, and runtime, and a case study from the medical domain confirms that our results are easily interpretable and semantically meaningful.
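To make the decomposition in the abstract concrete, the following minimal sketch shows the Boolean matrix product that BMF builds on, where 1 + 1 = 1, together with a simple reconstruction error. It is illustrative only and is not the paper's algorithm: the names `boolean_product` and `reconstruction_error` and the toy matrices are assumptions introduced here, and neither the elastic-binary regularizer nor the proximal gradient step is implemented.

```python
import numpy as np

def boolean_product(U, V):
    """Boolean matrix product: entry (i, j) is OR over k of (U[i, k] AND V[k, j])."""
    U = np.asarray(U, dtype=bool)
    V = np.asarray(V, dtype=bool)
    # The integer product counts the k that match; any count above zero means the OR is true.
    return (U.astype(int) @ V.astype(int)) > 0

def reconstruction_error(A, U, V):
    """Number of cells where the Boolean reconstruction disagrees with A."""
    return int(np.sum(np.asarray(A, dtype=bool) ^ boolean_product(U, V)))

# Hypothetical toy factors: two overlapping rank-1 patterns.
U = np.array([[1, 0],
              [1, 1],
              [0, 1]])
V = np.array([[1, 1, 0],
              [0, 1, 1]])

A = boolean_product(U, V)           # Boolean reconstruction
overlap_count = (U @ V)[1, 1]       # ordinary algebra counts the overlap twice
error = reconstruction_error(A, U, V)
```

In the cell where the two patterns overlap, the ordinary product yields 2 while the Boolean product stays at 1. This is why Boolean factors can be read directly as overlapping groups of rows and columns.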