CL · AI · LG · Jan 3, 2024

Iterative Mask Filling: An Effective Text Augmentation Method Using Masked Language Modeling

arXiv:2401.01830v1 · 11 citations · h-index: 16
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity in NLP, a problem for both researchers and practitioners, though the contribution appears incremental because it builds on existing masked language modeling techniques.

The paper tackles the limited exploration of data augmentation in NLP by proposing an iterative mask filling method using BERT's Fill-Mask feature, which significantly improves performance, particularly on topic classification datasets.

Data augmentation is an effective technique for improving the performance of machine learning models. However, it has not been explored as extensively in natural language processing (NLP) as it has in computer vision. In this paper, we propose a novel text augmentation method that leverages the Fill-Mask feature of the transformer-based BERT model. Our method involves iteratively masking words in a sentence and replacing them with language model predictions. We have tested our proposed method on various NLP tasks and found it to be effective in many cases. Our results are presented along with a comparison to existing augmentation methods. Experimental results show that our proposed method significantly improves performance, especially on topic classification datasets.
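The iterative procedure described in the abstract (mask one word, replace it with a language model prediction, move on to the next word) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `iterative_mask_fill`, the whitespace tokenization, the left-to-right order, and the rule of always taking the top-ranked prediction are all assumptions, since this page does not specify them. The `toy_fill_mask` function is a hypothetical stand-in so the sketch runs without downloading a model.

```python
from typing import Callable, Dict, List

# Assumed interface: a callable that receives a sentence containing exactly
# one mask token and returns candidate fills, best first, as dicts with a
# "token_str" key.
FillMask = Callable[[str], List[Dict]]


def iterative_mask_fill(sentence: str, fill_mask: FillMask,
                        mask_token: str = "[MASK]") -> str:
    """Mask each word in turn and replace it with the top prediction.

    Assumptions (not taken from the paper): whitespace tokenization,
    left-to-right order, and always choosing the first candidate.
    """
    words = sentence.split()
    for i in range(len(words)):
        # Mask position i; earlier positions already hold their replacements,
        # so each prediction is conditioned on the edits made so far.
        masked = words[:i] + [mask_token] + words[i + 1:]
        candidates = fill_mask(" ".join(masked))
        if candidates:  # keep the original word if nothing is predicted
            words[i] = candidates[0]["token_str"].strip()
    return " ".join(words)


def toy_fill_mask(text: str) -> List[Dict]:
    """Hypothetical stand-in for a language model: predicts from the
    previous word only, using a tiny hard-coded table."""
    words = text.split()
    i = words.index("[MASK]")
    prev = words[i - 1] if i > 0 else "<s>"
    table = {"<s>": "the", "the": "film", "film": "was", "was": "good"}
    return [{"token_str": table.get(prev, "very"), "score": 1.0}]


if __name__ == "__main__":
    print(iterative_mask_fill("this movie is great", toy_fill_mask))
```

To try this with an actual BERT model, a Hugging Face `pipeline("fill-mask", model="bert-base-uncased")` object could be passed as `fill_mask`, since for a single-mask input it returns a list of dicts that include `token_str`. Sampling among several candidates instead of always taking the first would yield more varied augmentations, at the cost of more label drift.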

