CV · LG · Mar 21, 2021

Deep Distribution-preserving Incomplete Clustering with Optimal Transport

arXiv:2103.11424v1
Originality: Highly original
AI Analysis

This paper addresses the challenge of clustering incomplete data in real-world applications such as computer vision, offering a robust approach for data with missing features.

The paper tackles the problem of clustering incomplete high-dimensional data, where existing methods perform poorly. It proposes DDIC-OT, a deep incomplete clustering method that uses optimal transport to evaluate reconstruction and a clustering loss to regularize the embedding. The authors report superior and stable clustering performance compared with state-of-the-art methods across different missing ratios.

Clustering is a fundamental task in the computer vision and machine learning communities. Although various methods have been proposed, the performance of existing approaches drops dramatically on incomplete high-dimensional data, which is common in real-world applications. To solve this problem, we propose a novel deep incomplete clustering method, named Deep Distribution-preserving Incomplete Clustering with Optimal Transport (DDIC-OT). Existing methods rely on the few fully observed samples and therefore underuse the available data; to avoid this, we evaluate reconstruction by measuring distribution distance with optimal transport instead of a traditional pixel-wise loss function. Moreover, a clustering loss on the latent features is introduced to regularize the embedding and make it more discriminative. As a consequence, the network becomes more robust to missing features, and the unified framework, which combines clustering and sample imputation, lets the two procedures reinforce each other. Extensive experiments demonstrate that the proposed network achieves superior and stable clustering performance improvements over existing state-of-the-art incomplete clustering methods across different missing ratios.
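
To make the contrast with a pixel-wise loss concrete, the following is a minimal sketch of a distribution-level distance between two batches of samples. It assumes entropy-regularized optimal transport solved with log-domain Sinkhorn iterations, uniform sample weights, and a squared Euclidean cost. None of these choices is stated in the abstract, and the function name `sinkhorn_distance` and the parameters `eps` and `n_iters` are hypothetical, so this should not be read as the DDIC-OT loss itself.

```python
import numpy as np


def _logsumexp(z, axis):
    # Numerically stable log(sum(exp(z))) along one axis.
    zmax = z.max(axis=axis, keepdims=True)
    return (zmax + np.log(np.exp(z - zmax).sum(axis=axis, keepdims=True))).squeeze(axis)


def sinkhorn_distance(x, y, eps=0.1, n_iters=200):
    """Entropy-regularized OT cost between two sample sets (illustrative only).

    x: (n, d) array, y: (m, d) array. Uniform weights and a squared
    Euclidean cost are assumptions made for this sketch.
    Returns the transport cost and the (n, m) transport plan.
    """
    n, m = x.shape[0], y.shape[0]
    log_a = np.full(n, -np.log(n))  # log of uniform weights on x
    log_b = np.full(m, -np.log(m))  # log of uniform weights on y

    # Pairwise squared Euclidean cost matrix, shape (n, m).
    cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)

    f = np.zeros(n)  # dual potential for x
    g = np.zeros(m)  # dual potential for y
    for _ in range(n_iters):
        # Alternate updates so the plan matches each marginal in turn.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)

    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    plan = np.exp(log_plan)
    return float((plan * cost).sum()), plan
```

Unlike a pixel-wise loss, this quantity compares two sets of samples as a whole and needs no one-to-one pairing between an input and its reconstruction. A shifted copy of a batch should score a larger distance than the batch compared with itself.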

