Zero-Shot Multi-Label Topic Inference with Sentence Encoders
This work addresses the need for real-time, user-defined topic classification in text-mining applications, though the contribution is incremental because it builds on existing encoder methods.
The paper tackles zero-shot topic inference with sentence encoders. In experiments on seven datasets, it finds that Sentence-BERT generalizes best, while Universal Sentence Encoder is the better choice when efficiency matters most.
Sentence encoders have been shown to achieve superior performance on many downstream text-mining tasks and are thus claimed to be fairly general. Inspired by this, we performed a detailed study on how to leverage these sentence encoders for the "zero-shot topic inference" task, where the topics are defined/provided by the users in real-time. Extensive experiments on seven different datasets show that Sentence-BERT offers superior generality compared to other encoders, while Universal Sentence Encoder can be preferred when efficiency is a top priority.
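To make the task concrete, the following is a minimal sketch of one common way to frame zero-shot multi-label topic inference: embed the document and each user-defined topic with the same encoder, then assign every topic whose cosine similarity to the document clears a threshold. This is an illustration under stated assumptions, not necessarily the paper's exact procedure. The bag-of-words `make_toy_encoder`, the keyword-style topic descriptions, and the `threshold` value of 0.3 are all hypothetical stand-ins; in practice the `encode` callable would wrap a real sentence encoder such as Sentence-BERT or Universal Sentence Encoder.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Encoder = Callable[[str], Vector]


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    cleaned = (tok.strip(".,!?;:\"'()").lower() for tok in text.split())
    return [tok for tok in cleaned if tok]


def make_toy_encoder(vocabulary: Sequence[str]) -> Encoder:
    """Hypothetical stand-in for a sentence encoder: word counts over a
    fixed vocabulary. A real encoder would return a learned dense embedding."""
    index = {word: i for i, word in enumerate(vocabulary)}

    def encode(text: str) -> List[float]:
        vec = [0.0] * len(index)
        for tok in tokenize(text):
            if tok in index:
                vec[index[tok]] += 1.0
        return vec

    return encode


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def infer_topics(
    document: str,
    topics: Dict[str, str],
    encode: Encoder,
    threshold: float = 0.3,  # assumed cut-off; would need tuning per encoder
) -> Dict[str, float]:
    """Return every topic whose similarity to the document meets the
    threshold, so a document may receive zero, one, or several labels."""
    doc_vec = encode(document)
    scores = {name: cosine(doc_vec, encode(desc)) for name, desc in topics.items()}
    return {name: score for name, score in scores.items() if score >= threshold}


# Topics are supplied at inference time; no topic-specific training occurs.
topics = {
    "sports": "football match team goal player",
    "finance": "stock market bank investor money",
    "health": "doctor hospital injury patient medicine",
}
vocabulary = sorted({tok for desc in topics.values() for tok in tokenize(desc)})
encode = make_toy_encoder(vocabulary)

document = "The player left the football match with an injury and saw a doctor."
assigned = infer_topics(document, topics, encode, threshold=0.3)
print(assigned)
```

Because `infer_topics` only depends on an `encode` callable that maps text to a vector, swapping encoders leaves the rest of the pipeline unchanged, which is the kind of comparison the study describes. The appropriate threshold would likely differ between encoders and datasets.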