CL · Aug 8, 2023

I-WAS: a Data Augmentation Method with GPT-2 for Simile Detection

arXiv:2308.04109v1 · 1 citation · h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for simile detection in literature and NLP applications, but the contribution is incremental because it builds on existing augmentation techniques.

The authors address the limited and unrepresentative corpora available for simile detection by proposing I-WAS, a data augmentation method that uses GPT-2 for word replacement and sentence completion. They report that it improves detection performance on a more diverse corpus compiled for the experiments.

Simile detection is a valuable task for many natural language processing (NLP)-based applications, particularly in the field of literature. However, existing research on simile detection often relies on corpora that are limited in size and do not adequately represent the full range of simile forms. To address this issue, we propose a simile data augmentation method based on Word replacement And Sentence completion using the GPT-2 language model. Our iterative process, called I-WAS, is designed to improve the quality of the augmented sentences. To better evaluate the performance of our method in real-world applications, we have compiled a corpus containing a more diverse set of simile forms for experimentation. Our experimental results demonstrate the effectiveness of our proposed data augmentation method for simile detection.
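The abstract names the ingredients (word replacement, sentence completion, an iterative loop) but not the exact procedure. The sketch below is one plausible reading, not the authors' implementation. It assumes that the comparator word in a seed simile is swapped for an alternative, that the sentence is cut after the comparator and completed by a language model, and that a detector filters the completions on each round. The names `augment_once`, `i_was_sketch`, `complete`, `train_scorer` and the `threshold` parameter are all hypothetical. The language model and the detector are passed in as plain callables so the example runs without any model download.

```python
from typing import Callable, Iterable, List, Tuple


def augment_once(
    sentence: str,
    comparator: str,
    replacements: Iterable[str],
    complete: Callable[[str], str],
    score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """One augmentation pass over a single seed sentence (hypothetical helper)."""
    idx = sentence.find(comparator)
    if idx == -1:
        return []  # comparator not present, nothing to augment
    prefix = sentence[:idx]  # text before the comparator is kept as-is
    kept: List[str] = []
    for word in replacements:
        # Word replacement: swap the comparator, dropping the original ending.
        prompt = prefix + word
        # Sentence completion: assumed to be done by a language model.
        candidate = complete(prompt)
        # Assumed quality filter: keep only candidates the detector accepts.
        if score(candidate) >= threshold and candidate not in kept:
            kept.append(candidate)
    return kept


def i_was_sketch(
    seeds: List[Tuple[str, str]],
    replacements: Iterable[str],
    complete: Callable[[str], str],
    train_scorer: Callable[[List[str]], Callable[[str], float]],
    rounds: int = 2,
    threshold: float = 0.5,
) -> List[str]:
    """Assumed iterative loop: refit the detector, then augment again."""
    replacements = list(replacements)
    corpus = [sentence for sentence, _ in seeds]
    for _ in range(rounds):
        score = train_scorer(corpus)  # detector refit on the current corpus
        for sentence, comparator in seeds:
            for cand in augment_once(
                sentence, comparator, replacements, complete, score, threshold
            ):
                if cand not in corpus:
                    corpus.append(cand)
    return corpus
```

In a real setting, `complete` would wrap a GPT-2 text-generation call and `train_scorer` would fit a simile classifier on the current corpus. Both are left abstract here because the abstract does not specify them.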

