LG · ML · Apr 17, 2020

Incorporating Multiple Cluster Centers for Multi-Label Learning

arXiv:2004.08113v3 · 8 citations
AI Analysis

This work addresses multi-label classification, where each instance carries several labels at once, and offers an incremental improvement by combining data augmentation with clustering.

The paper proposes a data augmentation approach for multi-label learning: it clusters the real examples and uses the cluster centers as virtual examples that capture local label correlations. It also introduces a regularization term that aligns real and virtual examples. The authors report improved performance over state-of-the-art methods on real-world datasets.

Multi-label learning deals with the problem that each instance is associated with multiple labels simultaneously. Most of the existing approaches aim to improve the performance of multi-label learning by exploiting label correlations. Although the data augmentation technique is widely used in many machine learning tasks, it is still unclear whether data augmentation is helpful to multi-label learning. In this article, we propose to leverage the data augmentation technique to improve the performance of multi-label learning. Specifically, we first propose a novel data augmentation approach that performs clustering on the real examples and treats the cluster centers as virtual examples, and these virtual examples naturally embody the local label correlations and label importances. Then, motivated by the cluster assumption that examples in the same cluster should have the same label, we propose a novel regularization term to bridge the gap between the real examples and virtual examples, which can promote the local smoothness of the learning function. Extensive experimental results on a number of real-world multi-label datasets clearly demonstrate that our proposed approach outperforms the state-of-the-art counterparts.
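The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes plain k-means in feature space, takes each virtual example's soft label vector as the mean of its members' label vectors, and uses a squared-difference penalty between the prediction on a real example and the prediction on its cluster center. The abstract does not specify any of these choices, and the names `make_virtual_examples` and `smoothness_penalty` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_rows(rows):
    """Column-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest(x, centers):
    """Index of the center closest to x (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))


def make_virtual_examples(X, Y, k, iters=20, seed=0):
    """Cluster X with k-means (an assumed choice) and build one virtual
    example per non-empty cluster: (mean features, mean label vector)."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(X, k)]
    for _ in range(iters):
        assign = [nearest(x, centers) for x in X]
        for j in range(k):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean_rows(members)
    assign = [nearest(x, centers) for x in X]
    virtual = {}
    for j in sorted(set(assign)):
        xs = [x for x, a in zip(X, assign) if a == j]
        ys = [y for y, a in zip(Y, assign) if a == j]
        # Soft labels: the fraction of cluster members carrying each label.
        virtual[j] = (mean_rows(xs), mean_rows(ys))
    return virtual, assign


def smoothness_penalty(predict, X, virtual, assign):
    """Assumed regularizer: mean squared gap between the prediction on a
    real example and the prediction on its cluster's virtual example."""
    total = sum(
        sq_dist(predict(x), predict(virtual[a][0])) for x, a in zip(X, assign)
    )
    return total / len(X)
```

In a training loop, the virtual examples would be appended to the training set and the penalty added to the loss with some weight. The soft label vector of a virtual example records how often each label occurs within its cluster, which is one way to read the abstract's remark that virtual examples embody local label correlations and label importances.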

