Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing
This addresses the problem of non-uniqueness in NMF for researchers in machine learning, offering a novel approach to improve interpretability, though it is incremental in the context of existing NMF methods.
The paper tackles the ill-posedness of nonnegative matrix factorization (NMF) by introducing a data preprocessing technique based on M-matrices and geometric interpretation, which provably yields sparser and more unique solutions under separability assumptions, with finite exact factorizations for rank-three matrices.
Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way to obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on the preprocessing of the nonnegative input data matrix, and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (NIPS, 2003), and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image datasets.