ML · Dec 23, 2015

k-Means Clustering Is Matrix Factorization

arXiv:1512.07548v1 · 63 citations
Originality: Synthesis-oriented
AI Analysis

The paper clarifies a foundational connection in machine learning for researchers and practitioners, but its contribution is incremental: it makes an existing implicit result explicit without introducing new methods or data.

The paper demonstrates that the objective function of k-means clustering can be reformulated as a matrix factorization problem, specifically as the Frobenius norm of the difference between a data matrix and a low-rank approximation of it. It provides an explicit derivation of a result that prior literature often references but rarely spells out.

We show that the objective function of conventional k-means clustering can be expressed as the Frobenius norm of the difference of a data matrix and a low rank approximation of that data matrix. In short, we show that k-means clustering is a matrix factorization problem. These notes are meant as a reference and intended to provide a guided tour towards a result that is often mentioned but seldom made explicit in the literature.
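The claimed identity can be checked numerically on a toy example. The sketch below is not taken from the paper: the data, the labels, and all function and variable names (`points`, `labels`, `X`, `M`, `Z`) are illustrative assumptions. It assumes the usual convention that X holds the data points as columns, M holds the cluster centroids as columns, and Z is a binary indicator matrix with exactly one 1 per column. Under that convention the sum of squared distances to the centroids should equal the squared Frobenius norm of X - MZ.

```python
# Hypothetical toy data: n = 6 points in m = 2 dimensions, k = 2 clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels = [0, 0, 0, 1, 1, 1]  # assumed hard cluster assignment
k = 2
m = len(points[0])
n = len(points)


def centroids(points, labels, k):
    """Mean of the points assigned to each cluster."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    return [[s / counts[c] for s in sums[c]] for c in range(k)]


def kmeans_objective(points, labels, cents):
    """Sum of squared Euclidean distances from each point to its centroid."""
    return sum(
        sum((x - mu) ** 2 for x, mu in zip(p, cents[c]))
        for p, c in zip(points, labels)
    )


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def frobenius_sq(a, b):
    """Squared Frobenius norm of the difference a - b."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


cents = centroids(points, labels, k)

# X is m x n: data points as columns.
X = [[p[d] for p in points] for d in range(m)]
# M is m x k: centroids as columns.
M = [[cents[c][d] for c in range(k)] for d in range(m)]
# Z is k x n: Z[c][j] = 1 if point j belongs to cluster c, else 0.
Z = [[1.0 if labels[j] == c else 0.0 for j in range(n)] for c in range(k)]

objective = kmeans_objective(points, labels, cents)
factorized = frobenius_sq(X, matmul(M, Z))

print(objective, factorized)  # the two values should agree up to rounding
```

The product MZ has rank at most k, which is the sense in which it is a low-rank approximation of X: column j of MZ is simply the centroid of the cluster that point j is assigned to.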

