LG, ST, ML, Mar 4

K-Means as a Radial Basis Function Network: a Variational and Gradient-based Equivalence

arXiv:2603.04625v1
Originality: Highly original
AI Analysis

This work gives machine learning researchers and practitioners a theoretical link between discrete clustering and continuous optimization, which allows K-Means to be integrated into deep learning architectures.

This paper establishes a theoretical equivalence between the K-Means algorithm and Radial Basis Function (RBF) neural networks, showing that the RBF objective converges to the K-Means solution as a temperature parameter approaches zero. It demonstrates that gradient updates for RBF centers recover K-Means centroid updates in the limit, and proposes Entmax-1.5 for numerical stability.

This work establishes a rigorous variational and gradient-based equivalence between the classical K-Means algorithm and differentiable Radial Basis Function (RBF) neural networks with smooth responsibilities. By reparameterizing the K-Means objective and embedding its distortion functional into a smooth weighted loss, we prove that the RBF objective $\Gamma$-converges to the K-Means solution as the temperature parameter $\sigma$ vanishes. We further demonstrate that the gradient-based updates of the RBF centers recover the exact K-Means centroid update rule and induce identical training trajectories in the limit. To address the numerical instability of the Softmax transformation in the low-temperature regime, we propose the integration of Entmax-1.5, which ensures stable polynomial convergence while preserving the underlying Voronoi partition structure. These results bridge the conceptual gap between discrete partitioning and continuous optimization, enabling K-Means to be embedded directly into deep learning architectures for the joint optimization of representations and clusters. Empirical validation across diverse synthetic geometries confirms a monotone collapse of soft RBF centroids toward K-Means fixed points, providing a unified framework for end-to-end differentiable clustering.
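The low-temperature limit described in the abstract can be illustrated with a small numerical sketch. It is not the paper's implementation: it assumes Gaussian RBF responsibilities of the form softmax(-||x - c_j||^2 / (2 sigma^2)), assumes the smooth loss is the standard log-sum-exp soft minimum (whose stationarity condition in c_j is the responsibility-weighted mean), and uses hypothetical names (`soft_responsibilities`, `soft_centroid_update`, `kmeans_update`) and made-up toy data. Under those assumptions, the soft center update should approach the hard K-Means centroid update as sigma shrinks.

```python
import numpy as np

def squared_distances(X, C):
    # Pairwise squared Euclidean distances, shape (n_points, n_centers).
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)

def soft_responsibilities(X, C, sigma):
    # Softmax over negative scaled distances (assumed form of the
    # "smooth responsibilities"); sigma plays the role of the temperature.
    logits = -squared_distances(X, C) / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # stabilise the exponentials
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def soft_centroid_update(X, C, sigma):
    # Stationary point of the assumed soft-min loss in each center:
    # the responsibility-weighted mean of the data.
    R = soft_responsibilities(X, C, sigma)
    return (R.T @ X) / R.sum(axis=0)[:, None]

def kmeans_update(X, C):
    # One hard K-Means step: assign to the nearest center, then average.
    labels = squared_distances(X, C).argmin(axis=1)
    return np.stack([X[labels == j].mean(axis=0) for j in range(len(C))])

# Toy data (illustrative only): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-3.0, 0.5, (50, 2)),
                    rng.normal(3.0, 0.5, (50, 2))])
C = np.array([[-1.0, 0.0], [1.0, 0.0]])

hard = kmeans_update(X, C)
sigmas = (2.0, 0.5, 0.05)
gaps = [np.abs(soft_centroid_update(X, C, s) - hard).max() for s in sigmas]
for s, g in zip(sigmas, gaps):
    print(f"sigma={s:<5} max |soft - hard| = {g:.3e}")
```

The printed gap should shrink as sigma decreases. Subtracting the row-wise maximum from the logits keeps this plain softmax from overflowing, but for very small sigma the responsibilities still saturate to exact zeros and ones, which is the regime where the abstract proposes Entmax-1.5 instead.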

