CL | AI | Oct 13, 2022

LIME: Weakly-Supervised Text Classification Without Seeds

arXiv:2210.06720v131.0584 citations | h-index: 8 | Has Code

Originality: Highly original

AI Analysis

The paper addresses the brittle dependence on seed words in weakly-supervised text classification, offering NLP practitioners a pipeline that needs only label names and an off-the-shelf textual entailment model.

The paper tackles weakly-supervised text classification by replacing seed-word generation with entailment-based pseudo-classification, achieving state-of-the-art results on 4 benchmarks.

In weakly-supervised text classification, only label names act as sources of supervision. Predominant approaches to weakly-supervised text classification utilize a two-phase framework, where test samples are first assigned pseudo-labels and are then used to train a neural text classifier. In most previous work, the pseudo-labeling step is dependent on obtaining seed words that best capture the relevance of each class label. We present LIME, a framework for weakly-supervised text classification that entirely replaces the brittle seed-word generation process with entailment-based pseudo-classification. We find that combining weakly-supervised classification and textual entailment mitigates shortcomings of both, resulting in a more streamlined and effective classification pipeline. With just an off-the-shelf textual entailment model, LIME outperforms recent baselines in weakly-supervised text classification and achieves state-of-the-art in 4 benchmarks. We open source our code at https://github.com/seongminp/LIME.
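The pseudo-labeling phase described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hypothesis template, the confidence threshold, and the function names `toy_entailment_score` and `pseudo_label` are all assumptions made for illustration, and the toy word-overlap scorer merely stands in for a real textual entailment model. Each class label is turned into a hypothesis, each text is scored against every hypothesis, and the best-scoring label becomes the pseudo-label.

```python
from typing import Callable, List, Tuple

# Assumed interface: (premise, hypothesis) -> entailment score in [0, 1].
EntailmentScorer = Callable[[str, str], float]


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Stand-in for an NLI model: fraction of hypothesis words present in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    hits = sum(word in premise_words for word in hypothesis_words)
    return hits / len(hypothesis_words)


def pseudo_label(
    texts: List[str],
    label_names: List[str],
    scorer: EntailmentScorer,
    template: str = "this text is about {}",  # hypothetical hypothesis template
    threshold: float = 0.0,  # hypothetical confidence cut-off
) -> List[Tuple[str, str, float]]:
    """Assign each text the label whose hypothesis it entails most strongly."""
    labeled = []
    for text in texts:
        # Score the text (premise) against one hypothesis per label name.
        scores = {name: scorer(text, template.format(name)) for name in label_names}
        best = max(scores, key=scores.get)
        # Keep only pseudo-labels at or above the confidence threshold.
        if scores[best] >= threshold:
            labeled.append((text, best, scores[best]))
    return labeled


if __name__ == "__main__":
    texts = [
        "the sports team won the match",
        "new business deal boosts stock",
        "rain tomorrow",
    ]
    for text, label, score in pseudo_label(
        texts, ["sports", "business"], toy_entailment_score, threshold=0.1
    ):
        print(f"{label}: {text} ({score:.2f})")
```

In the two-phase framework the abstract describes, the resulting (text, pseudo-label) pairs would then be used to train a neural text classifier; in practice the toy scorer would be swapped for an off-the-shelf entailment model with the same call signature.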

