Lightweight Contenders: Navigating Semi-Supervised Text Mining through Peer Collaboration and Self Transcendence
This work addresses the challenge of cost-effective text mining for applications with scarce annotations, though it appears incremental as it builds on existing SSL and lightweight model techniques.
The paper tackles semi-supervised learning with lightweight models for text mining, where limited labeled data hinders performance. It introduces PS-NET, which combines online distillation, peer collaboration, and adversarial perturbations, and reports notable performance gains over the SOTA frameworks FLiText and DisCo in SSL text classification with extremely scarce labeled data.
Semi-supervised learning (SSL) with lightweight models aims to reduce the number of annotated samples required while enabling cost-effective inference. However, the constraint on model parameters, compounded by the scarcity of training labels, limits SSL performance. In this paper, we introduce PS-NET, a novel framework tailored for semi-supervised text mining with lightweight models. PS-NET incorporates online distillation to train lightweight student models by imitating the teacher model. It also integrates an ensemble of student peers that collaboratively instruct each other. Additionally, PS-NET implements a constant adversarial perturbation scheme that furthers self-augmentation through progressive generalization. Our PS-NET, equipped with a 2-layer distilled BERT, exhibits notable performance enhancements over the SOTA lightweight SSL frameworks FLiText and DisCo in SSL text classification with extremely scarce labeled data.
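To make the three components named in the abstract concrete, the following is a minimal sketch of how a single student's training signal on one unlabeled example might be composed. It is not the authors' implementation: the abstract does not specify the loss functions, so the use of KL divergence for every term, the temperature, the weights `w_distill`, `w_peer`, and `w_adv`, and the function name `student_loss` are all assumptions made for illustration. The adversarial perturbation itself is not modeled; `perturbed_logits` simply stands for the student's output on a perturbed copy of the input.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def student_loss(teacher_logits, student_logits, peer_logits_list,
                 perturbed_logits, temperature=2.0,
                 w_distill=1.0, w_peer=1.0, w_adv=1.0):
    """Hypothetical combined objective for one student (all terms assumed).

    distill: the student imitates the teacher (online distillation).
    peer:    the student is pulled toward its peers (peer collaboration).
    adv:     the student's predictions stay consistent when its input is
             perturbed (stand-in for the adversarial perturbation scheme).
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)

    distill = kl_divergence(teacher, student)

    peers = [softmax(p, temperature) for p in peer_logits_list]
    peer = sum(kl_divergence(p, student) for p in peers) / len(peers) if peers else 0.0

    perturbed = softmax(perturbed_logits, temperature)
    adv = kl_divergence(student, perturbed)

    return w_distill * distill + w_peer * peer + w_adv * adv


if __name__ == "__main__":
    loss = student_loss(
        teacher_logits=[2.0, 0.5],
        student_logits=[1.0, 1.0],
        peer_logits_list=[[1.5, 0.5], [0.8, 1.2]],
        perturbed_logits=[0.9, 1.1],
    )
    print(f"combined loss: {loss:.4f}")
```

In this sketch each term vanishes when the student already agrees with the corresponding reference distribution, so the total is zero only when teacher, peers, and the perturbed prediction all match the student. A real training loop would also add a supervised loss on the few labeled examples, which is omitted here.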