MLJun 6, 2016

On Robustness of Kernel Clustering

arXiv:1606.01869v39.126 citations

Originality Incremental advance

AI Analysis

This addresses the lack of robustness analysis in kernel clustering methods, which is an incremental theoretical contribution for researchers in unsupervised learning.

The paper tackles the problem of robustness in kernel clustering by introducing a semidefinite programming relaxation and proving that, under a suitable model, both K-SVD and SDP approaches are consistent, with SDP achieving exact recovery and K-SVD having vanishing misclassification errors.

Clustering is one of the most important unsupervised problems in machine learning and statistics. Among many existing algorithms, kernel k-means has drawn much research attention due to its ability to find non-linear cluster boundaries and its inherent simplicity. There are two main approaches for kernel k-means: SVD of the kernel matrix and convex relaxations. Despite the attention kernel clustering has received both from theoretical and applied quarters, not much is known about robustness of the methods. In this paper we first introduce a semidefinite programming relaxation for the kernel clustering problem, then prove that under a suitable model specification, both the K-SVD and SDP approaches are consistent in the limit, albeit SDP is strongly consistent, i.e. achieves exact recovery, whereas K-SVD is weakly consistent, i.e. the fraction of misclassified nodes vanish.

View on arXiv PDF

Similar