On Evaluation Protocols for Data Augmentation in a Limited Data Scenario
This work addresses open questions in data augmentation (DA) research for NLP practitioners, revealing the limitations of classical methods and showing that DA based on conversational agents performs better.
The paper challenges the effectiveness of classical textual data augmentation in limited-data scenarios, showing that it merely aids fine-tuning and that its benefits vanish given sufficient fine-tuning time. It also demonstrates that zero- and few-shot DA using conversational agents such as ChatGPT or Llama 2 can improve performance.
Textual data augmentation (DA) is a prolific field of study in which novel techniques for creating artificial data are regularly proposed, and it has demonstrated great effectiveness in small-data settings, at least for text classification tasks. In this paper, we challenge those results, showing that classical data augmentation (which modifies sentences) is simply a way of performing better fine-tuning, and that spending more time on fine-tuning before applying data augmentation negates its effect. This is a significant contribution, as it answers several questions left open in recent years, namely: which DA technique performs best (all of them, as long as they generate data close enough to the training set so as not to impair training), and why DA showed positive results (it facilitates training of the network). We further show that zero- and few-shot DA via conversational agents such as ChatGPT or Llama 2 can improve performance, confirming that this form of data augmentation is preferable to classical methods.
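To make "classical data augmentation (which modifies sentences)" concrete, the following is a minimal sketch of two token-level perturbations, random swap and random deletion. These operations, the function names, and the parameter values are illustrative assumptions; the abstract does not state which classical techniques the paper evaluates.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Return a copy of tokens with n_swaps random pairs of positions exchanged."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def augment(sentence, n_aug=2, seed=0):
    """Produce n_aug perturbed variants of a sentence (hypothetical helper).

    Alternates between swap and deletion; the label of the original
    sentence would be reused for every variant.
    """
    rng = random.Random(seed)
    tokens = sentence.split()
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_tokens = random_swap(tokens, 1, rng)
        else:
            new_tokens = random_deletion(tokens, 0.1, rng)
        variants.append(" ".join(new_tokens))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was surprisingly good", n_aug=4):
        print(variant)
```

Small perturbations of this kind keep the generated sentences close to the training set, which is the condition the abstract gives for a classical technique not to impair training.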