CL IR LGJul 31, 2021

ECLARE: Extreme Classification with Label Graph Correlations

Anshul Mittal, Noveen Sachdeva, Sheshansh Agrawal, Sumeet Agarwal, Purushottam Kar, Manik Varma

arXiv:2108.00261v171 citationsHas Code

Originality Incremental advance

AI Analysis

It improves personalized recommendations in large-scale systems like search engines, though it is incremental over existing methods.

The paper tackles the problem of predicting rare labels in deep extreme classification by incorporating label text and correlations, achieving 2 to 14% more accurate predictions on benchmark and proprietary datasets.

Deep extreme classification (XC) seeks to train deep architectures that can tag a data point with its most relevant subset of labels from an extremely large label set. The core utility of XC comes from predicting labels that are rarely seen during training. Such rare labels hold the key to personalized recommendations that can delight and surprise a user. However, the large number of rare labels and small amount of training data per rare label offer significant statistical and computational challenges. State-of-the-art deep XC methods attempt to remedy this by incorporating textual descriptions of labels but do not adequately address the problem. This paper presents ECLARE, a scalable deep learning architecture that incorporates not only label text, but also label correlations, to offer accurate real-time predictions within a few milliseconds. Core contributions of ECLARE include a frugal architecture and scalable techniques to train deep models along with label correlation graphs at the scale of millions of labels. In particular, ECLARE offers predictions that are 2 to 14% more accurate on both publicly available benchmark datasets as well as proprietary datasets for a related products recommendation task sourced from the Bing search engine. Code for ECLARE is available at https://github.com/Extreme-classification/ECLARE.

View on arXiv PDF Code

Similar