CL, AI | Dec 29, 2024

Controlling Out-of-Domain Gaps in LLMs for Genre Classification and Generated Text Detection

arXiv:2412.20595v1 | 19 citations | h-index: 15 | COLING
Originality: Incremental advance
AI Analysis

The paper addresses domain-transfer issues in LLMs for non-topical classification tasks. The advance is incremental because it builds on prior OOD research.

The study tackles the out-of-domain performance gap in Large Language Models for genre classification and generated text detection. It introduces a method that controls which predictive indicators the model uses, steering it toward stylistic rather than topical features, and reduces the gap by up to 20 percentage points in a few-shot setup.

This study demonstrates that the modern generation of Large Language Models (LLMs, such as GPT-4) suffers from the same out-of-domain (OOD) performance gap observed in prior research on pre-trained Language Models (PLMs, such as BERT). We demonstrate this across two non-topical classification tasks: 1) genre classification and 2) generated text detection. Our results show that when demonstration examples for In-Context Learning (ICL) come from one domain (e.g., travel) and the system is tested on another domain (e.g., history), classification performance declines significantly. To address this, we introduce a method that controls which predictive indicators are used and which are excluded during classification. For the two tasks studied here, this ensures that topical features are omitted, while the model is guided to focus on stylistic rather than content-based attributes. This approach reduces the OOD gap by up to 20 percentage points in a few-shot setup. Straightforward Chain-of-Thought (CoT) methods, used as the baseline, prove insufficient, while our approach consistently enhances domain transfer performance.
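As an illustration of the idea in the abstract, the sketch below assembles a few-shot ICL prompt that names which indicators to use and which to exclude, and computes an OOD gap in percentage points. This is not the authors' prompt or code: the `Demo` class, the `build_prompt` and `ood_gap_pp` helpers, the instruction wording, and the example texts are all hypothetical, and the gap is assumed here to be in-domain accuracy minus out-of-domain accuracy.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One demonstration example (hypothetical structure, not from the paper)."""
    text: str
    label: str
    domain: str  # e.g. "travel"; kept for bookkeeping, never shown to the model


def build_prompt(demos, query, labels, use_features, ignore_features):
    """Assemble a few-shot prompt that states which indicators to use and which to ignore."""
    lines = [
        "Classify the text into one of: " + ", ".join(labels) + ".",
        "Base the decision only on: " + ", ".join(use_features) + ".",
        "Do not rely on: " + ", ".join(ignore_features) + ".",
        "",
    ]
    for demo in demos:
        lines += [f"Text: {demo.text}", f"Label: {demo.label}", ""]
    lines += [f"Text: {query}", "Label:"]
    return "\n".join(lines)


def ood_gap_pp(in_domain_acc, out_of_domain_acc):
    """OOD gap in percentage points, given accuracies in [0, 1] (assumed definition)."""
    return 100.0 * (in_domain_acc - out_of_domain_acc)


# Demonstrations come from one domain (travel); the query comes from another (history).
demos = [
    Demo("Book your dream beach holiday today!", "advertisement", "travel"),
    Demo("The ferry departs daily at 9 a.m. from the main pier.", "information", "travel"),
]
prompt = build_prompt(
    demos,
    query="The treaty was signed after months of negotiation.",
    labels=["advertisement", "information"],
    use_features=["tone", "sentence structure", "register"],
    ignore_features=["topic", "named entities"],
)
```

The resulting `prompt` string can be sent to any chat or completion model. Comparing `ood_gap_pp` for prompts with and without the two instruction lines would show whether the explicit feature control narrows the gap on a given dataset.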

Code Implementations: 1 repo
