LG · ML · Nov 4, 2018

Block-wise Partitioning for Extreme Multi-label Classification

Yuefeng Liang, Cho-Jui Hsieh, Thomas C. M. Lee

arXiv:1811.01305v1 · 2.92 citations

Originality: Incremental advance

AI Analysis

This work addresses the high computational cost in extreme multi-label classification, offering a practical solution for applications with large label sets, though it is incremental as it builds on existing clustering and classification methods.

The paper tackles the computational inefficiency of extreme multi-label classification by proposing a Block-wise Partitioning pretreatment that clusters instances and labels, reducing prediction time while maintaining accuracy on benchmark datasets.

Extreme multi-label classification aims to learn a classifier that annotates an instance with a relevant subset of labels from an extremely large label set. Many existing solutions embed the label matrix to a low-dimensional linear subspace, or examine the relevance of a test instance to every label via a linear scan. In practice, however, those approaches can be computationally exorbitant. To alleviate this drawback, we propose a Block-wise Partitioning (BP) pretreatment that divides all instances into disjoint clusters, to each of which the most frequently tagged label subset is attached. One multi-label classifier is trained on one pair of instance and label clusters, and the label set of a test instance is predicted by first delivering it to the most appropriate instance cluster. Experiments on benchmark multi-label data sets reveal that BP pretreatment significantly reduces prediction time, and retains almost the same level of prediction accuracy.
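The pipeline described in the abstract (cluster the instances, attach each cluster's most frequently tagged labels, train one classifier per block, route a test instance to its cluster) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means stands in for the instance clustering, a 1-nearest-neighbour lookup stands in for the per-block multi-label classifier, and the names `fit_bp`, `predict`, and `labels_per_block` are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(x, points):
    """Index of the point closest to x."""
    return min(range(len(points)), key=lambda i: sq_dist(x, points[i]))


def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in for the paper's clustering)."""
    centers = random.Random(seed).sample(X, k)
    for _ in range(iters):
        assign = [nearest(x, centers) for x in X]
        for c in range(k):
            members = [x for x, a in zip(X, assign) if a == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, [nearest(x, centers) for x in X]


def fit_bp(X, Y, k, labels_per_block):
    """Partition instances, then attach each cluster's most frequent labels.

    X is a list of feature vectors, Y a list of label sets (hypothetical API).
    """
    centers, assign = kmeans(X, k)
    blocks = []
    for c in range(k):
        idx = [i for i, a in enumerate(assign) if a == c]
        counts = Counter(label for i in idx for label in Y[i])
        kept = {label for label, _ in counts.most_common(labels_per_block)}
        blocks.append({
            "X": [X[i] for i in idx],
            # Labels outside the block's subset are dropped from the targets.
            "Y": [Y[i] & kept for i in idx],
            "labels": kept,
        })
    return {"centers": centers, "blocks": blocks}


def predict(model, x):
    """Route x to its nearest cluster, then predict within that block only."""
    block = model["blocks"][nearest(x, model["centers"])]
    if not block["X"]:
        return set()
    # 1-NN is an assumed placeholder for the per-block multi-label classifier.
    return set(block["Y"][nearest(x, block["X"])])
```

The prediction cost in this sketch depends on the number of clusters plus the size of one block, rather than on the full label set, which is the source of the speed-up the abstract reports. The trade-off is also visible: a label that is rare inside a cluster is cut from that block and can never be predicted for instances routed there.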
