Semi-Supervised Information-Maximization Clustering
This work addresses the need for more effective semi-supervised clustering methods, although the contribution appears incremental because it builds directly on an existing unsupervised approach.
To incorporate prior knowledge into clustering, the authors propose a semi-supervised algorithm based on information maximization. It extends an existing unsupervised method to handle must-links and cannot-links, and its usefulness is demonstrated through experiments.
Semi-supervised clustering aims to introduce prior knowledge into the decision process of a clustering algorithm. In this paper, we propose a novel semi-supervised clustering algorithm based on the information-maximization principle. The proposed method extends a previous unsupervised information-maximization clustering algorithm, which is based on squared-loss mutual information, so that it can effectively incorporate must-links and cannot-links. The proposed method is computationally efficient because the clustering solution can be obtained analytically via eigendecomposition. Furthermore, the proposed method allows systematic optimization of tuning parameters such as the kernel width, given the degree of belief in the must-links and cannot-links. The usefulness of the proposed method is demonstrated through experiments.
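To make the idea of an eigendecomposition-based solution with pairwise constraints more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a Gaussian kernel, assumes that must-links and cannot-links are encoded by simply adding or subtracting a belief weight on the corresponding kernel entries, and assigns clusters by an argmax over the leading eigenvectors. The function names and the parameters `width` and `belief` are hypothetical, and the squared-loss mutual information estimator itself is not reproduced here.

```python
import numpy as np


def gaussian_kernel(X, width):
    """Gaussian kernel matrix for the rows of X."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * width ** 2))


def constrained_eigen_clustering(X, n_clusters, must_links=(), cannot_links=(),
                                 width=1.0, belief=1.0):
    """Illustrative constrained spectral assignment (an assumption, not the paper's method)."""
    K = gaussian_kernel(X, width)

    # Assumed encoding: raise similarity for must-links, lower it for cannot-links.
    for i, j in must_links:
        K[i, j] += belief
        K[j, i] += belief
    for i, j in cannot_links:
        K[i, j] -= belief
        K[j, i] -= belief

    # K is symmetric, so eigh applies; eigenvalues are returned in ascending order.
    _, eigvecs = np.linalg.eigh(K)
    top = eigvecs[:, -n_clusters:]

    # Eigenvectors are defined up to sign; flip each so that its entries sum to >= 0.
    signs = np.sign(top.sum(axis=0))
    signs[signs == 0] = 1.0

    # Each sample goes to the eigenvector on which it has the largest entry.
    return np.argmax(top * signs, axis=1)


# Two well-separated groups: samples 0-2 and samples 3-6.
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.2, 5.1], [5.1, 4.8], [4.9, 5.2]])
labels = constrained_eigen_clustering(
    X, n_clusters=2, must_links=[(0, 1)], cannot_links=[(0, 3)],
    width=1.0, belief=0.5)
print(labels)
```

In this sketch `belief` plays the role of the degree of belief in the constraints and `width` is the kernel width. Both are fixed by hand here, whereas the abstract states that the proposed method can tune such parameters systematically.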