CL · Nov 14, 2022

Efficient Adversarial Training with Robust Early-Bird Tickets

arXiv:2211.07263
Originality: Incremental advance
AI Analysis

This work targets the cost of adversarial training for NLP models and reports a large training speedup with little loss in robustness. The contribution is incremental, since it combines existing adversarial training and pruning techniques.

The paper tackles the high computational cost of adversarial training for pre-trained language models by identifying robust subnetworks early in training, achieving up to 7x to 13x speedups while maintaining or improving robustness compared to state-of-the-art methods.

Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs). However, this approach is typically more expensive than traditional fine-tuning because of the necessity to generate adversarial examples via gradient descent. Delving into the optimization process of adversarial training, we find that robust connectivity patterns emerge in the early training phase (typically $0.15\sim0.3$ epochs), far before parameters converge. Inspired by this finding, we dig out robust early-bird tickets (i.e., subnetworks) to develop an efficient adversarial training method: (1) searching for robust tickets with structured sparsity in the early stage; (2) fine-tuning robust tickets in the remaining time. To extract the robust tickets as early as possible, we design a ticket convergence metric to automatically terminate the searching process. Experiments show that the proposed efficient adversarial training method can achieve up to $7\times \sim 13 \times$ training speedups while maintaining comparable or even better robustness compared to the most competitive state-of-the-art adversarial training methods.
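The abstract mentions a ticket convergence metric that stops the search automatically but does not define it here. The sketch below shows one plausible form of such a criterion, not necessarily the paper's: derive a binary pruning mask from per-unit importance scores at each checkpoint, and stop once the mask has stopped changing. The names `top_k_mask`, `mask_distance`, and `TicketConvergence`, along with the `threshold` and `window` defaults, are hypothetical and chosen only for illustration.

```python
from collections import deque


def top_k_mask(scores, keep_ratio):
    """Binary mask that keeps the top `keep_ratio` fraction of units by score."""
    n_keep = max(1, round(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(scores))]


def mask_distance(mask_a, mask_b):
    """Normalized Hamming distance between two equal-length binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    return sum(a != b for a, b in zip(mask_a, mask_b)) / len(mask_a)


class TicketConvergence:
    """Signals convergence when a new mask is close to each of the last `window` masks.

    Assumed criterion for illustration; the threshold and window are not
    taken from the paper.
    """

    def __init__(self, threshold=0.1, window=3):
        self.threshold = threshold
        self.history = deque(maxlen=window)

    def update(self, mask):
        """Record `mask` and return True if the ticket looks stable."""
        history_full = len(self.history) == self.history.maxlen
        converged = history_full and all(
            mask_distance(mask, old) <= self.threshold for old in self.history
        )
        self.history.append(mask)
        return converged
```

In a training loop, one would call `update(top_k_mask(scores, keep_ratio))` at each checkpoint during the early search stage and switch to fine-tuning the pruned subnetwork once it returns `True`. How the importance scores are computed (for example, per attention head or per feed-forward unit) is left open here.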

Code Implementations: 1 repo
