MLDSLGAug 27, 2019

Statistical and Computational Trade-Offs in Kernel K-Means

arXiv:1908.10284v134 citations
AI Analysis

This addresses the computational bottleneck in unsupervised learning for practitioners needing scalable kernel methods, though it is incremental as it builds on existing Nyström techniques.

The paper tackles the computational inefficiency of exact kernel k-means by proposing a Nyström-based method that reduces computational costs without loss of accuracy, achieving the same accuracy with only a fraction of computations, specifically by sampling √n landmarks.

We investigate the efficiency of k-means in terms of both statistical and computational requirements. More precisely, we study a Nyström approach to kernel k-means. We analyze the statistical properties of the proposed method and show that it achieves the same accuracy of exact kernel k-means with only a fraction of computations. Indeed, we prove under basic assumptions that sampling $\sqrt{n}$ Nyström landmarks allows to greatly reduce computational costs without incurring in any loss of accuracy. To the best of our knowledge this is the first result of this kind for unsupervised learning.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes