Clustering above Exponential Families with Tempered Exponential Measures
This work addresses a limitation of clustering theory: the $k$-means framework is tied to exponential families, whose axiomatization yields population minimizers that can lack robustness.
The paper extends $k$-means clustering beyond exponential families through a new generalization, tempered exponential measures (TEMs). The resulting population minimizers have improved and controllable robustness while keeping a simple analytic form.
The link with exponential families has allowed $k$-means clustering to be generalized to a wide variety of data-generating distributions in exponential families and to clustering distortions among Bregman divergences. Getting the framework to work above exponential families is important to lift roadblocks such as the lack of robustness of some population minimizers, which is carved into their axiomatization. Current generalizations of exponential families, such as $q$-exponential families or even deformed exponential families, fail to achieve this goal. In this paper, we provide a new attempt at getting the complete framework, grounded in a new generalization of exponential families that we introduce: tempered exponential measures (TEMs). TEMs keep the maximum entropy axiomatization framework of $q$-exponential families but, instead of normalizing the measure, normalize a dual called a co-distribution. Numerous interesting properties arise for clustering, such as improved and controllable robustness for population minimizers that keep a simple analytic form.
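For context, the classical setting that the abstract starts from can be sketched in a few lines: Lloyd-style $k$-means in which the distortion is an arbitrary Bregman divergence and the population minimizer of each cluster is its arithmetic mean. This is only an illustration of that baseline, not the TEM-based method introduced in the paper. The function names (`bregman_kmeans`, `squared_euclidean`, `generalized_kl`) and the explicit `init_centers` argument are assumptions made for this example.

```python
import numpy as np


def squared_euclidean(x, c):
    # Bregman divergence generated by phi(x) = ||x||^2.
    return np.sum((x - c) ** 2, axis=-1)


def generalized_kl(x, c):
    # Bregman divergence generated by phi(x) = sum(x log x - x);
    # assumes all entries of x and c are strictly positive.
    return np.sum(x * np.log(x / c) - x + c, axis=-1)


def bregman_kmeans(X, init_centers, divergence, n_iter=100):
    """Lloyd-style hard clustering with a Bregman divergence D(x, c).

    Assignment step: each point goes to the center with smallest D(x, c).
    Update step: each center becomes the arithmetic mean of its cluster,
    which minimizes sum_i D(x_i, c) over c for any Bregman divergence.
    """
    X = np.asarray(X, dtype=float)
    centers = np.array(init_centers, dtype=float)
    k = len(centers)
    for _ in range(n_iter):
        # Pairwise distortions, shape (n_points, k).
        d = divergence(X[:, None, :], centers[None, :, :])
        labels = np.argmin(d, axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the returned centers.
    d = divergence(X[:, None, :], centers[None, :, :])
    return np.argmin(d, axis=1), centers


if __name__ == "__main__":
    X = np.array([[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
                  [10.0, 10.0], [10.5, 9.5], [9.5, 10.2]])
    labels, centers = bregman_kmeans(X, X[[0, 3]], generalized_kl)
    print(labels)
    print(centers)
```

Because the update step is always the arithmetic mean, a single outlier can move a center arbitrarily far. This is the kind of robustness issue for population minimizers that the abstract refers to.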