CL | LG | Dec 1, 2024

Lightweight Contenders: Navigating Semi-Supervised Text Mining through Peer Collaboration and Self Transcendence

arXiv:2412.00883v1 | 11 citations | h-index: 64 | NAACL
Originality: Incremental advance
AI Analysis

This work addresses the challenge of cost-effective text mining for applications with scarce annotations, though it appears incremental as it builds on existing SSL and lightweight model techniques.

The paper tackles semi-supervised learning with lightweight models for text mining, where limited labeled data hinders performance. It introduces PS-NET, which combines online distillation, peer collaboration, and adversarial perturbations, and reports notable performance gains over SOTA frameworks such as FLiText and DisCo in SSL text classification with extremely scarce labeled data.

The semi-supervised learning (SSL) strategy in lightweight models aims to reduce the need for annotated samples and to facilitate cost-effective inference. However, the constraint on model parameters, imposed by the scarcity of training labels, limits SSL performance. In this paper, we introduce PS-NET, a novel framework tailored for semi-supervised text mining with lightweight models. PS-NET incorporates online distillation to train lightweight student models by imitating the teacher model. It also integrates an ensemble of student peers that collaboratively instruct each other. Additionally, PS-NET implements a constant adversarial perturbation scheme that furthers self-augmentation through progressive generalization. Our PS-NET, equipped with a 2-layer distilled BERT, exhibits notable performance enhancements over the SOTA lightweight SSL frameworks FLiText and DisCo in SSL text classification with extremely scarce labeled data.
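The two distillation signals named in the abstract (students imitating a teacher, and student peers instructing each other) can be illustrated with a minimal sketch. This is not PS-NET's actual objective: the KL-divergence form, the temperature, the `peer_weight` mixing factor, and all function names are assumptions chosen for illustration, and the adversarial perturbation component is left out entirely.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Assumed teacher-to-student term: the student matches the teacher's soft labels."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return kl_divergence(p_teacher, p_student)

def peer_loss(student_logits_list, index, temperature=2.0):
    """Assumed peer term: student `index` matches the average pull of its peers."""
    own = softmax(student_logits_list[index], temperature)
    peers = [
        softmax(logits, temperature)
        for i, logits in enumerate(student_logits_list)
        if i != index
    ]
    if not peers:
        return 0.0
    return sum(kl_divergence(p, own) for p in peers) / len(peers)

def total_unlabeled_loss(teacher_logits, student_logits_list, index, peer_weight=0.5):
    """Hypothetical combination of both terms for one unlabeled example."""
    teacher_term = distillation_loss(teacher_logits, student_logits_list[index])
    return teacher_term + peer_weight * peer_loss(student_logits_list, index)

if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    students = [[1.5, 0.2, -0.5], [0.3, 1.0, -0.2]]
    for i in range(len(students)):
        print(i, total_unlabeled_loss(teacher, students, i))
```

In this sketch each student is pulled toward the teacher's soft labels and, with a smaller weight, toward its peers' predictions on the same unlabeled example. How PS-NET actually weights or schedules these terms is not stated in the abstract.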

Code Implementations: 1 repo
