CL | Dec 9, 2024

JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM

arXiv:2412.06738v1 | 2 citations | h-index: 21 | PACLIC
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient training data generation in non-English languages and offers a solution specific to Japanese NLP tasks. The advance is incremental because it extends existing English-focused methods.

The paper tackles the problem of generating training data for Japanese language tasks with Large Language Models (LLMs) in few-shot and zero-shot scenarios. The resulting method, JAPAGEN, is evaluated on six diverse Japanese downstream tasks and performs competitively with conventional LLM prompting strategies, most robustly on classification tasks with formal text inputs.

Recently, some studies have highlighted the potential of Large Language Models (LLMs) as effective generators of supervised training data, offering advantages such as enhanced inference efficiency and reduced data collection costs. However, these studies have predominantly focused on English language tasks. In this paper, we address the fundamental research question: Can LLMs serve as proficient training data generators for tasks in other languages? Specifically, we leverage LLMs to synthesize supervised training data under few-shot and zero-shot learning scenarios across six diverse Japanese downstream tasks. Subsequently, we use this synthesized data to train compact models (e.g., BERT). We term this methodology JAPAGEN. Our experimental findings show that JAPAGEN achieves robust performance in classification tasks that require formal text inputs, with results competitive with conventional LLM prompting strategies.
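
The pipeline described in the abstract (prompt an LLM per label, collect the generated texts as labeled examples, then train a compact model on them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the function names `build_prompt` and `synthesize_dataset`, and the `stub_generate` stand-in for an LLM call are all hypothetical, and the final step of fine-tuning a compact model such as BERT is omitted.

```python
from typing import Callable, Dict, List, Optional, Tuple


def build_prompt(label_desc: str, examples: Optional[List[str]] = None) -> str:
    """Build a label-conditioned prompt (hypothetical wording).

    With no examples this is a zero-shot prompt; with examples it is few-shot.
    """
    lines = [f"Write one Japanese sentence that is {label_desc}."]
    for i, example in enumerate(examples or [], start=1):
        lines.append(f"Example {i}: {example}")
    lines.append("Sentence:")
    return "\n".join(lines)


def synthesize_dataset(
    labels: Dict[str, str],
    generate: Callable[[str], str],
    n_per_label: int,
    few_shot: Optional[Dict[str, List[str]]] = None,
) -> List[Tuple[str, str]]:
    """Collect (text, label) pairs by querying a generator once per sample."""
    dataset: List[Tuple[str, str]] = []
    for label, desc in labels.items():
        prompt = build_prompt(desc, (few_shot or {}).get(label))
        for _ in range(n_per_label):
            text = generate(prompt).strip()
            if text:  # skip empty generations
                dataset.append((text, label))
    return dataset


def stub_generate(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: any text-in, text-out function)."""
    first_line = prompt.splitlines()[0]
    return f"[generated for: {first_line}]"


if __name__ == "__main__":
    labels = {
        "positive": "positive in sentiment",
        "negative": "negative in sentiment",
    }
    for text, label in synthesize_dataset(labels, stub_generate, n_per_label=2):
        print(label, text)
```

Replacing `stub_generate` with a call to an actual LLM would yield a synthetic training set; the (text, label) pairs could then be passed to any standard fine-tuning routine for a compact classifier.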

Code Implementations: 1 repo