CL · Apr 28, 2022

Improving robustness of language models from a geometry-aware perspective

arXiv:2204.13309v1640 citations · h-index: 29
Originality: Incremental advance
AI Analysis

This work addresses robustness issues in language models for NLP applications, but it is incremental as it builds on existing adversarial training methods.

The paper tackles the trade-off between robustness and accuracy in adversarial training for language models by proposing geometry-aware adversarial training (GAT), which uses friendly adversarial data augmentation to achieve stronger robustness with fewer search steps, as demonstrated across two datasets and three models.

Recent studies have found that removing the norm-bounded projection and increasing the number of search steps in adversarial training can significantly improve robustness. However, we observe that too many search steps can hurt accuracy. We aim to obtain strong robustness efficiently, using fewer steps. Through a toy experiment, we find that perturbing clean data up to the decision boundary without crossing it does not degrade test accuracy. Inspired by this, we propose friendly adversarial data augmentation (FADA) to generate friendly adversarial data. On top of FADA, we propose geometry-aware adversarial training (GAT), which performs adversarial training on friendly adversarial data and thereby saves a large number of search steps. Comprehensive experiments across two widely used datasets and three pre-trained language models demonstrate that GAT can obtain stronger robustness with fewer steps. In addition, we provide extensive empirical results and in-depth analyses of robustness to facilitate future studies.
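
The "up to the boundary but not across it" idea from the toy experiment can be illustrated with a minimal sketch. This is not the paper's FADA procedure, which operates on text and pre-trained language models; it assumes a hypothetical 2-D linear classifier, and the names predict and friendly_perturb, the step size, and the step budget are all illustrative choices. The sketch moves a point toward the decision boundary in small steps and keeps the last position whose predicted label is unchanged.

```python
def predict(w, b, x):
    """Return 1 if the linear score w.x + b is positive, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0


def friendly_perturb(w, b, x, step=0.1, max_steps=50):
    """Move x toward the decision boundary, stopping before the label flips.

    Hypothetical helper for illustration only. For a linear model, the
    direction that shrinks the margin fastest is -w for label 1 and +w
    for label 0.
    """
    label = predict(w, b, x)
    direction = [-wi if label == 1 else wi for wi in w]
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        return list(x), 0  # degenerate model: no boundary to approach
    unit = [d / norm for d in direction]

    current = list(x)
    steps_taken = 0
    for _ in range(max_steps):
        candidate = [c + step * u for c, u in zip(current, unit)]
        if predict(w, b, candidate) != label:
            break  # the next step would cross the boundary; keep the last safe point
        current = candidate
        steps_taken += 1
    return current, steps_taken


if __name__ == "__main__":
    w, b = [1.0, 0.0], 0.0
    x = [1.05, 2.0]
    friendly, n = friendly_perturb(w, b, x)
    print("original label:", predict(w, b, x))
    print("friendly point:", friendly, "after", n, "steps")
    print("label preserved:", predict(w, b, friendly) == predict(w, b, x))
```

In this toy setting the perturbed point ends up close to the boundary while keeping its original label, which is the property the abstract attributes to friendly adversarial data.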

