Optimizing Classification of Infrequent Labels by Reducing Variability in Label Distribution
The work addresses a specific bottleneck in Extreme Classification that matters to researchers and practitioners working with imbalanced datasets, though the contribution appears incremental because it builds on existing methods such as Siamese architectures.
The paper tackles poor classification performance on infrequent labels in Extreme Classification tasks by reducing label inconsistency. It reports substantial improvements on these categories and claims a new benchmark.
This paper presents a novel solution, LEVER, designed to address the challenges posed by underperforming infrequent categories in Extreme Classification (XC) tasks. Infrequent categories, often characterized by sparse samples, suffer from high label inconsistency, which undermines classification performance. LEVER mitigates this problem by adopting a robust Siamese-style architecture, leveraging knowledge transfer to reduce label inconsistency and enhance the performance of One-vs-All classifiers. Comprehensive testing across multiple XC datasets reveals substantial improvements in the handling of infrequent categories, setting a new benchmark for the field. Additionally, the paper introduces two newly created multi-intent datasets, offering essential resources for future XC research.
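The summary above does not spell out how the knowledge transfer works, so the following is only a minimal sketch of one plausible reading: similarity scores from a Siamese-style encoder serve as soft targets that smooth the sparse 0/1 labels before One-vs-All classifiers are trained. The function names, the `blend` weight, the cosine-to-[0, 1] mapping, and the random embeddings are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def soft_targets(query_emb, label_emb, observed, blend=0.5):
    """Blend observed 0/1 labels with teacher similarity scores.

    query_emb: (n_queries, dim) embeddings from a hypothetical Siamese encoder
    label_emb: (n_labels, dim) embeddings of label text from the same encoder
    observed:  (n_queries, n_labels) sparse ground-truth matrix of 0s and 1s
    blend:     weight on the observed labels (assumed hyperparameter)
    """
    sim = l2_normalize(query_emb) @ l2_normalize(label_emb).T  # cosine in [-1, 1]
    teacher = np.clip((sim + 1.0) / 2.0, 0.0, 1.0)             # rescale to [0, 1]
    return blend * observed + (1.0 - blend) * teacher

# Toy data standing in for encoder outputs (random, for illustration only).
rng = np.random.default_rng(0)
query_emb = rng.normal(size=(4, 8))
label_emb = rng.normal(size=(3, 8))

# Only one positive is observed, as is typical for an infrequent category.
observed = np.zeros((4, 3))
observed[0, 1] = 1.0

targets = soft_targets(query_emb, label_emb, observed)
print(targets.round(3))
```

In this sketch the One-vs-All classifier for each label would be trained against a column of `targets` rather than the raw 0/1 column, so a label with very few observed positives still receives graded supervision from queries the encoder considers similar.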