Orthogonal Nonnegative Matrix Factorization with Sparsity Constraints
This work addresses a specific optimization challenge in matrix factorization for data analysis, offering incremental improvements in constraint handling and accuracy for applications like feature extraction.
The paper tackles the sparsity-constrained Orthogonal Nonnegative Matrix Factorization (SCONMF) problem by reformulating it as a capacity-constrained facility-location problem, integrating control-barrier functions and maximum-entropy principles to enforce constraints, and introduces a method to determine the true rank of matrices. Simulations show significantly improved factorizations with reconstruction errors reduced by up to 150 times compared to existing methods.
This article presents a novel approach to solving the sparsity-constrained Orthogonal Nonnegative Matrix Factorization (SCONMF) problem, which requires decomposing a non-negative data matrix into the product of two lower-rank non-negative matrices, X=WH, where the mixing matrix H has orthogonal rows HH^T=I, while also satisfying an upper bound on the number of nonzero elements in each row. By reformulating SCONMF as a capacity-constrained facility-location problem (CCFLP), the proposed method naturally integrates non-negativity, orthogonality, and sparsity constraints. Specifically, our approach integrates control-barrier function (CBF) based framework used for dynamic optimal control design problems with maximum-entropy-principle-based framework used for facility location problems to enforce these constraints while ensuring robust factorization. Additionally, this work introduces a quantitative approach for determining the ``true" rank of W or H, equivalent to the number of ``true" features - a critical aspect in ONMF applications where the number of features is unknown. Simulations on various datasets demonstrate significantly improved factorizations with low reconstruction errors (as small as by 150 times) while strictly satisfying all constraints, outperforming existing methods that struggle with balancing accuracy and constraint adherence.