CL | AI | Jun 27, 2023

MAT: Mixed-Strategy Game of Adversarial Training in Fine-tuning

arXiv:2306.15826v1 | 7 citations | h-index: 11
Originality: Highly original
AI Analysis

This work addresses the problem of enhancing model robustness and generalization for NLP practitioners, representing an incremental advancement over existing adversarial training methods.

The paper tackles the limitation of pure-strategy adversarial training in fine-tuning pre-trained language models by proposing a mixed-strategy game approach, achieving state-of-the-art performance on GLUE and ANLI benchmarks with significant improvements in generalization and robustness.

Fine-tuning large-scale pre-trained language models has been shown to be effective for a variety of natural language processing (NLP) tasks. Previous studies have established that incorporating adversarial training during the fine-tuning stage can significantly enhance model generalization and robustness. However, from the perspective of game theory, these uses of adversarial training correspond to pure-strategy games, which are inherently limited in the scope of their strategies and therefore leave room for improvement. To push the performance boundaries, we propose a novel Mixed-strategy Adversarial Training algorithm (MAT). Methodologically, we derive the Nash equilibrium of a mixed-strategy game for adversarial training using Entropy Mirror Descent, and we establish MAT through a sampling method. To verify the effectiveness of MAT, we conducted extensive benchmark experiments on large-scale pre-trained models such as BERT and RoBERTa. MAT significantly outperforms the state-of-the-art methods on both the GLUE and ANLI benchmarks in terms of generalization and robustness.
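
To make the game-theoretic vocabulary in the abstract concrete, here is a minimal sketch of entropy mirror descent finding a mixed-strategy equilibrium. It is not the MAT algorithm: it assumes a toy 2x2 zero-sum matrix game instead of a language model, and the function names (`mirror_step`, `solve_matrix_game`, `exploitability`), the payoff matrix, the step size, and the iteration count are all illustrative choices that do not come from the paper. The row player minimizes the loss, the column player maximizes it, and each player reweights its strategy multiplicatively, which is the form entropy mirror descent takes on the probability simplex.

```python
import math


def mirror_step(strategy, payoffs, step_size):
    """One entropy mirror ascent step on the simplex: reweight by exp(step * payoff)."""
    top = max(payoffs)  # subtracting the max keeps exp() from overflowing
    weights = [p * math.exp(step_size * (u - top)) for p, u in zip(strategy, payoffs)]
    total = sum(weights)
    return [w / total for w in weights]


def solve_matrix_game(loss, steps=20000, step_size=0.01):
    """Approximate a mixed equilibrium of min_x max_y x^T A y by averaging iterates."""
    rows, cols = len(loss), len(loss[0])
    x = [1.0 / rows] * rows  # row player's mixed strategy (minimizes the loss)
    y = [1.0 / cols] * cols  # column player's mixed strategy (maximizes the loss)
    x_sum = [0.0] * rows
    y_sum = [0.0] * cols
    for _ in range(steps):
        x_sum = [s + p for s, p in zip(x_sum, x)]
        y_sum = [s + p for s, p in zip(y_sum, y)]
        # Expected loss of each pure row against y, and of each pure column against x.
        row_loss = [sum(loss[i][j] * y[j] for j in range(cols)) for i in range(rows)]
        col_gain = [sum(loss[i][j] * x[i] for i in range(rows)) for j in range(cols)]
        # Simultaneous updates: the row player descends, the column player ascends.
        x = mirror_step(x, [-value for value in row_loss], step_size)
        y = mirror_step(y, col_gain, step_size)
    return [s / steps for s in x_sum], [s / steps for s in y_sum]


def exploitability(loss, x, y):
    """Duality gap: what best responses could still gain; zero at an exact equilibrium."""
    rows, cols = len(loss), len(loss[0])
    worst_case = max(sum(loss[i][j] * x[i] for i in range(rows)) for j in range(cols))
    best_case = min(sum(loss[i][j] * y[j] for j in range(cols)) for i in range(rows))
    return worst_case - best_case


if __name__ == "__main__":
    toy_loss = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical payoffs, not from the paper
    x_avg, y_avg = solve_matrix_game(toy_loss)
    print("row strategy:", [round(p, 3) for p in x_avg])
    print("column strategy:", [round(p, 3) for p in y_avg])
    print("duality gap:", round(exploitability(toy_loss, x_avg, y_avg), 4))
```

This toy game has no pure-strategy saddle point: whichever single row the minimizer commits to, the maximizer has a column that beats the game's value. Solving the indifference conditions by hand gives the mixed equilibrium (0.4, 0.6) for both players, and the averaged iterates approach it. That gap between pure and mixed play is the limitation the abstract attributes to pure-strategy adversarial training; how MAT carries the idea over to model parameters and perturbations via sampling is described in the paper itself and is not reproduced here.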

