Constructing Clustering Transformations
This work addresses the need for gradual transformations between clusterings in data analytics, contributing a graph-based model of such transformations and new bounds on the circuit diameter of partition polytopes.
The paper tackles the problem of transforming one clustering into another, for example when a gradual transition from an old solution to a new one is required. It develops methods based on linear programming and network theory that decompose the transformation into a sequence of elementary moves, introduces a new metric for the distance between clusterings, and provides new bounds on the circuit diameter of partition polytopes.
Clustering is one of the fundamental tasks in data analytics and machine learning. In many situations, different clusterings of the same data set become relevant. For example, different algorithms for the same clustering task may return dramatically different solutions. We are interested in applications in which one clustering has to be transformed into another; e.g., when a gradual transition from an old solution to a new one is required. In this paper, we devise methods for constructing such a transition based on linear programming and network theory. We use a so-called clustering-difference graph to model the desired transformation and provide methods for decomposing the graph into a sequence of elementary moves that accomplishes the transformation. These moves are equivalent to the edge directions, or circuits, of the underlying partition polytopes. Therefore, in addition to a conceptually new metric for measuring the distance between clusterings, we provide new bounds on the circuit diameter of these partition polytopes.
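To make the idea of a clustering-difference graph concrete, the following is a minimal Python sketch, not the authors' construction. It assumes that both clusterings use the same cluster labels, that each clustering is given as a dict from item to cluster label, and that an elementary move shifts one item along each edge of a simple path or cycle in the graph. The function names (difference_graph, decompose_into_moves, apply_moves) are hypothetical, and the greedy walk shown here makes no claim to produce a shortest sequence of moves.

```python
from collections import defaultdict


def difference_graph(old, new):
    """Map each source cluster to a list of (target cluster, item) edges.

    One edge is created for every item whose cluster differs between
    the two clusterings (assumed to share the same cluster labels).
    """
    edges = defaultdict(list)
    for item, src in old.items():
        dst = new[item]
        if src != dst:
            edges[src].append((dst, item))
    return edges


def decompose_into_moves(old, new):
    """Greedily split the difference graph into simple paths and cycles.

    Each move is a list of (item, source cluster, target cluster) triples.
    This is an illustrative heuristic, not an optimal decomposition.
    """
    edges = difference_graph(old, new)
    moves = []
    while True:
        # Pick any cluster that still has an unprocessed outgoing edge.
        start = next((c for c, out in edges.items() if out), None)
        if start is None:
            return moves
        walk, position = [], {start: 0}
        node = start
        while edges[node]:
            dst, item = edges[node].pop()
            walk.append((item, node, dst))
            if dst in position:
                # The walk closed a cycle: keep the cycle as the move and
                # return the edges leading into it to the graph.
                cut = position[dst]
                for it, s, d in walk[:cut]:
                    edges[s].append((d, it))
                walk = walk[cut:]
                break
            position[dst] = len(walk)
            node = dst
        moves.append(walk)


def apply_moves(old, moves):
    """Apply the moves in order and return the resulting clustering."""
    current = dict(old)
    for move in moves:
        for item, _src, dst in move:
            current[item] = dst
    return current


old = {"a": 0, "b": 0, "c": 1, "d": 2}
new = {"a": 1, "b": 2, "c": 2, "d": 0}
moves = decompose_into_moves(old, new)
result = apply_moves(old, moves)
```

In this toy instance the four reassigned items are covered by one cyclical exchange between clusters 0 and 2 and one sequential move along the path 0, 1, 2, so the transformation takes two moves rather than four single-item transfers.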