CL · May 6, 2021

Learning to Perturb Word Embeddings for Out-of-distribution QA

arXiv:2105.02692v3713 citations
Originality: Incremental advance
AI Analysis

This paper addresses the poor generalization of QA models to unseen data. The advance is incremental because it builds on existing data augmentation techniques.

The paper tackles the failure of QA models to generalize to out-of-distribution data by proposing a data augmentation method that perturbs word embeddings without altering their semantics. Trained on a single source dataset, the resulting models significantly outperform baseline data augmentation methods across five target domains, and also outperform a model trained with more than 240K artificially generated QA pairs.

QA models based on pretrained language models have achieved remarkable performance on various benchmark datasets. However, QA models do not generalize well to unseen data that falls outside the training distribution, due to distributional shifts. Data augmentation (DA) techniques which drop/replace words have shown to be effective in regularizing the model from overfitting to the training data. Yet, they may adversely affect the QA tasks since they incur semantic changes that may lead to wrong answers for the QA task. To tackle this problem, we propose a simple yet effective DA method based on a stochastic noise generator, which learns to perturb the word embedding of the input questions and context without changing their semantics. We validate the performance of the QA models trained with our word embedding perturbation on a single source dataset, on five different target domains. The results show that our method significantly outperforms the baseline DA methods. Notably, the model trained with ours outperforms the model trained with more than 240K artificially generated QA pairs.
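To make the idea of a learned, stochastic embedding perturbation concrete, here is a minimal sketch in NumPy. It is not the authors' implementation: the abstract does not specify the form of the noise, so the multiplicative Gaussian noise, the linear maps `w_mu` and `w_logvar`, the bias `b_logvar`, and the function name `perturb_embeddings` are all assumptions made for illustration. In a real setup these parameters would be trained jointly with the QA model; here they are fixed.

```python
import numpy as np

def perturb_embeddings(emb, w_mu, w_logvar, b_logvar, rng):
    """Illustrative sketch (assumed design, not the paper's code).

    emb:      (seq_len, dim) word embeddings of question + context
    w_mu:     (dim, dim) hypothetical weights for the noise mean
    w_logvar: (dim, dim) hypothetical weights for the noise log-variance
    b_logvar: scalar bias for the log-variance (controls noise scale)
    Returns the perturbed embeddings and the sampled noise.
    """
    # Input-dependent noise mean, centred at 1 so the default is "no change".
    mu = 1.0 + emb @ w_mu
    # Input-dependent standard deviation; exp keeps it positive.
    std = np.exp(0.5 * (emb @ w_logvar + b_logvar))
    # Reparameterised Gaussian sample: noise = mu + std * eps.
    eps = rng.standard_normal(emb.shape)
    noise = mu + std * eps
    # Multiplicative perturbation applied element-wise to each embedding.
    return emb * noise, noise

rng = np.random.default_rng(0)
seq_len, dim = 4, 8
emb = rng.standard_normal((seq_len, dim))
w_mu = np.zeros((dim, dim))       # untrained placeholder weights
w_logvar = np.zeros((dim, dim))   # untrained placeholder weights

perturbed, noise = perturb_embeddings(emb, w_mu, w_logvar, -4.0, rng)
```

Because the words themselves are never dropped or replaced, the token sequence and answer span stay intact; only the continuous representation fed to the model changes. Keeping the noise centred at 1 with a small variance is one plausible way to keep the perturbed embeddings close to the originals.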

Code Implementations: 1 repo
