LG | AI | DM | Jul 17, 2025

Soft-ECM: An extension of Evidential C-Means for complex data

arXiv:2507.13417v1 | h-index: 3 | FUZZ-IEEE
Originality: Incremental advance
AI Analysis

This work addresses a gap in clustering for non-Euclidean data, offering a method for researchers and practitioners dealing with mixed or time series data, though it is incremental as it extends an existing algorithm.

The authors address the clustering of complex data, such as mixed numerical-categorical data and time series, which existing belief-function-based clustering algorithms cannot handle. They propose Soft-ECM, an extension of Evidential C-Means that requires only a semi-metric. Their experiments show that Soft-ECM achieves results comparable to fuzzy clustering on numerical data and handles both mixed data and time series, the latter with semi-metrics such as DTW.
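To make "mixed numerical-categorical data" concrete, here is a minimal sketch of a Gower-style dissimilarity in Python. It is an illustration only: the summary does not say which dissimilarity the authors use for mixed data, and the function name `mixed_dissimilarity` and the `numeric_ranges` parameter are hypothetical.

```python
def mixed_dissimilarity(x, y, numeric_ranges):
    """Gower-style dissimilarity between two mixed records (illustrative assumption).

    x, y           -- sequences of equal length holding numbers and category labels
    numeric_ranges -- dict mapping the index of each numeric feature to its value
                      range (max - min over the data set); all other features are
                      treated as categorical
    """
    total = 0.0
    for k, (xk, yk) in enumerate(zip(x, y)):
        if k in numeric_ranges:
            # Numeric feature: absolute difference scaled by the feature's range.
            total += abs(xk - yk) / numeric_ranges[k]
        else:
            # Categorical feature: 0 if the labels match, 1 otherwise.
            total += 0.0 if xk == yk else 1.0
    # Average the per-feature contributions.
    return total / len(x)


# Hypothetical records: (age, favourite colour); age spans a range of 20.
a = (30, "red")
b = (40, "blue")
c = (30, "blue")
ranges = {0: 20.0}

print(mixed_dissimilarity(a, b, ranges))
print(mixed_dissimilarity(a, c, ranges))
```

A function like this gives pairwise dissimilarities but no natural way to average two records, which is the situation a method requiring only a semi-metric is meant to cover.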

Clustering based on belief functions has been gaining increasing attention in the machine learning community due to its ability to effectively represent uncertainty and/or imprecision. However, none of the existing algorithms can be applied to complex data, such as mixed data (numerical and categorical) or non-tabular data like time series. Indeed, these types of data are, in general, not represented in a Euclidean space, and the aforementioned algorithms make use of the properties of such spaces, in particular for the construction of barycenters. In this paper, we reformulate the Evidential C-Means (ECM) problem for clustering complex data. We propose a new algorithm, Soft-ECM, which consistently positions the centroids of imprecise clusters requiring only a semi-metric. Our experiments show that Soft-ECM presents results comparable to conventional fuzzy clustering approaches on numerical data, and we demonstrate its ability to handle mixed data and its benefits when combining fuzzy clustering with semi-metrics such as DTW for time series data.
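The following sketch shows why DTW is described as a semi-metric and not a metric. It uses the textbook dynamic-programming form of DTW with an absolute-difference local cost. The function name `dtw_distance` and the three toy series are assumptions chosen for illustration, and nothing here reproduces the authors' implementation of Soft-ECM.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW between two numeric sequences, with |a_i - b_j| as local cost."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Toy series (hypothetical) of different lengths.
x = [0, 0, 0]
y = [0, 1]
z = [1, 1, 1]

d_xy = dtw_distance(x, y)  # 1.0
d_yz = dtw_distance(y, z)  # 1.0
d_xz = dtw_distance(x, z)  # 3.0

# Symmetric, and zero from a series to itself ...
print(d_xy == dtw_distance(y, x), dtw_distance(x, x) == 0.0)
# ... but the triangle inequality fails: 3.0 > 1.0 + 1.0.
print(d_xz > d_xy + d_yz)
```

Because the triangle inequality can fail and the series may have different lengths, there is no Euclidean mean to fall back on, which is why an algorithm that needs only a semi-metric is relevant for time series.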

