A Visual Interpretation-Based Self-Improved Classification System Using Virtual Adversarial Training
This work addresses the problem of improving interpretability and robustness in text classification for researchers and practitioners, but it appears incremental as it builds on existing BERT and VAT methods.
The paper tackles the limited interpretability and robustness of BERT-based classification systems by proposing a model that combines virtual adversarial training with visual interpretation techniques; experiments and ablation studies on a Twitter tweet dataset support its effectiveness.
The successful application of large pre-trained models such as BERT in natural language processing has attracted growing attention from researchers. Because BERT typically acts as an end-to-end black box, classification systems built on it are usually difficult to interpret and lack robustness. This paper proposes a visual interpretation-based self-improving classification model that combines virtual adversarial training (VAT) with BERT models to address these problems. Specifically, a fine-tuned BERT model is first used as a classifier to predict the sentiment of the text. The predicted sentiment labels are then used as part of the input to a second BERT model for spam classification, which is trained in a semi-supervised manner using VAT. Additionally, visualization techniques, including visualizing word importance and normalizing the attention head matrix, are employed to analyze the relevance of each component to classification accuracy. The visual analysis is also used to identify new features that improve classification performance. Experimental results on a Twitter tweet dataset demonstrate the effectiveness of the proposed model on the classification task. Furthermore, ablation study results illustrate the effect of the different components of the proposed model on the classification results.
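To make the VAT component concrete, the following is a minimal sketch of the virtual adversarial loss for a single unlabeled input. It is not the paper's implementation. It assumes a toy linear softmax classifier standing in for the BERT spam classifier, and an input vector `x` standing in for an embedding. The function names (`predict`, `vat_loss`) and the hyperparameters `epsilon`, `xi`, and `power_iters` are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Class probabilities of a toy linear classifier (stand-in for BERT)."""
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def unit(v):
    """Scale v to unit L2 norm; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v] if norm > 0 else v


def vat_loss(W, b, x, epsilon=1.0, xi=1e-2, power_iters=1, rng=None):
    """Return (loss, r_adv): KL between clean and adversarially perturbed
    predictions, and the perturbation used. No label is needed."""
    rng = rng or random.Random(0)
    p = predict(W, b, x)  # clean prediction, treated as a fixed target
    d = unit([rng.gauss(0.0, 1.0) for _ in x])  # random starting direction
    for _ in range(power_iters):
        q = predict(W, b, [xc + xi * dc for xc, dc in zip(x, d)])
        # For a linear softmax model, the gradient of KL(p || q) with
        # respect to the perturbation is W^T (q - p).
        grad = [
            sum(W[k][j] * (q[k] - p[k]) for k in range(len(W)))
            for j in range(len(x))
        ]
        d = unit(grad)
    r_adv = [epsilon * dc for dc in d]  # perturbation of norm epsilon
    q_adv = predict(W, b, [xc + rc for xc, rc in zip(x, r_adv)])
    return kl_divergence(p, q_adv), r_adv


if __name__ == "__main__":
    W = [[1.0, -0.5], [-1.0, 0.5]]  # hypothetical weights: 2 classes, 2 features
    b = [0.0, 0.0]
    loss, r_adv = vat_loss(W, b, [0.2, 0.1])
    print("VAT loss:", loss)
    print("adversarial perturbation:", r_adv)
```

In semi-supervised training, a term of this kind would typically be added, with a weighting factor, to the supervised cross-entropy loss on the labeled examples. With a real BERT model, the gradient would come from automatic differentiation rather than the closed form used here.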