Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation

Hanwen Shen, Ting Ying, Jiajie Lu, Shanshan Wang

arXiv:2603.13683

AI Analysis

This work addresses out-of-distribution bias in narrative generation for users of debiased LLMs. It is an incremental improvement, with its most specific gains in adaptation efficiency.

The paper tackles the problem of debiased LLMs failing to generalize to unfamiliar bias prompts, which leads to toxic outputs, by proposing CAP-TTA, a test-time adaptation framework that reduces bias and improves narrative fluency while achieving lower update latency than existing methods.

Although debiased LLMs perform well on known bias patterns, they often fail to generalize to unfamiliar bias prompts, producing toxic outputs. We first validate that such high-bias prompts constitute a distribution shift via OOD detection, and show that static models degrade under this shift. To adapt on-the-fly, we propose CAP-TTA, a test-time adaptation framework that performs context-aware LoRA updates only when the bias-risk trigger exceeds a threshold, using a precomputed diagonal preconditioner for fast and stable updates. Across toxic-prompt settings and benchmarks, CAP-TTA reduces bias (confirmed by human evaluation) while achieving much lower update latency than AdamW/SGD; it also mitigates catastrophic forgetting, significantly improving narrative fluency over the SOTA debiasing baseline while maintaining comparable debiasing effectiveness.
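The abstract's core mechanism, a gated update scaled by a precomputed diagonal preconditioner, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the risk score, the threshold value, and the choice to divide the gradient by the preconditioner entries are all assumptions made for illustration, and plain Python lists stand in for LoRA parameters.

```python
from typing import List, Tuple


def preconditioned_step(
    params: List[float],
    grads: List[float],
    precond_diag: List[float],
    lr: float = 0.1,
    eps: float = 1e-8,
) -> List[float]:
    """One diagonally preconditioned gradient step.

    Assumption: the preconditioner holds positive per-parameter scale
    estimates, and the step is p <- p - lr * g / (d + eps).
    """
    return [p - lr * g / (d + eps) for p, g, d in zip(params, grads, precond_diag)]


def maybe_adapt(
    params: List[float],
    grads: List[float],
    precond_diag: List[float],
    risk_score: float,
    threshold: float = 0.5,
    lr: float = 0.1,
) -> Tuple[List[float], bool]:
    """Apply the update only when the (hypothetical) bias-risk score
    exceeds the threshold; otherwise return the parameters unchanged."""
    if risk_score <= threshold:
        return params, False
    return preconditioned_step(params, grads, precond_diag, lr), True


if __name__ == "__main__":
    # Toy stand-ins for adapter parameters, their gradients, and a
    # preconditioner computed once ahead of time.
    adapter = [1.0, 2.0]
    grads = [0.5, -1.0]
    precond = [1.0, 4.0]

    low_risk, updated = maybe_adapt(adapter, grads, precond, risk_score=0.2)
    print("low-risk prompt:", low_risk, "updated:", updated)

    high_risk, updated = maybe_adapt(adapter, grads, precond, risk_score=0.9)
    print("high-risk prompt:", high_risk, "updated:", updated)
```

Because the preconditioner is fixed before test time, each triggered step costs a single elementwise divide and subtract and keeps no running optimizer state. That is one plausible reading of why such an update could be cheaper than AdamW, which maintains moment estimates at every step.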
