Addressing the Ecological Fallacy in Larger LMs with Human Context

arXiv:2603.05928v1 · h-index: 7
Predicted impact: top 26% in CL · last 90 days · Originality: Incremental advance
AI Analysis

This work asks whether incorporating author context improves language model performance. It is relevant to researchers and practitioners who work with large language models, particularly on human-generated text. The gains are incremental but demonstrate the importance of author context.

This paper investigates whether addressing the ecological fallacy, here the practice of treating texts written by the same person as independent, can improve the performance of larger language models by modeling author-specific language context. The authors found that fine-tuning an 8B Llama model with author context using QLoRA improved performance over standard fine-tuning, and that QLoRA-based continued pre-training with a human-aware objective (HuLM) produced a model that performed better across eight downstream tasks when only linear task classifiers were trained.

Language model training and inference ignore a fundamental linguistic fact -- there is a dependence between multiple sequences of text written by the same person. Prior work has shown that addressing this form of ecological fallacy can greatly improve the performance of multiple smaller (~124M) GPT-based models. In this work, we ask if addressing the ecological fallacy by modeling the author's language context with a specific LM task (called HuLM) can provide similar benefits for a larger-scale model, an 8B Llama model. To this end, we explore variants that process an author's language in the context of their other temporally ordered texts. We study the effect of pre-training with this author context using the HuLM objective, as well as using it during fine-tuning with author context (HuFT: Human-aware Fine-Tuning). Empirical comparisons show that addressing the ecological fallacy during fine-tuning alone using QLoRA improves the performance of the larger 8B model over standard fine-tuning. Additionally, QLoRA-based continued HuLM pre-training results in a human-aware model generalizable for improved performance over eight downstream tasks with linear task classifier training alone. These results indicate the utility and importance of modeling language in the context of its original generators, the authors.
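To make "an author's language in the context of their other temporally ordered texts" concrete, the following is a minimal sketch of how such inputs might be assembled. It is not the paper's pipeline: the `Post` record, the `build_author_contexts` helper, the `SEP` separator string, and the character-based `max_chars` budget are all hypothetical choices made for illustration (a real setup would more likely budget by tokens).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for one text written by one author."""
    author_id: str
    timestamp: int  # any sortable value, e.g. Unix time
    text: str


# Assumed separator between an author's texts; not taken from the paper.
SEP = " <|sep|> "


def build_author_contexts(posts, max_chars=2000):
    """Group posts by author, sort them by time, and pack them into blocks.

    Each block concatenates consecutive texts from the same author, so a
    model reading a block sees a text alongside that author's earlier texts.
    """
    by_author = defaultdict(list)
    for post in posts:
        by_author[post.author_id].append(post)

    contexts = {}
    for author_id, author_posts in by_author.items():
        author_posts.sort(key=lambda p: p.timestamp)
        blocks, current = [], ""
        for post in author_posts:
            candidate = post.text if not current else current + SEP + post.text
            if current and len(candidate) > max_chars:
                # Budget exceeded: close the current block, start a new one.
                blocks.append(current)
                current = post.text
            else:
                current = candidate
        if current:
            blocks.append(current)
        contexts[author_id] = blocks
    return contexts


posts = [
    Post("a", 2, "second"),
    Post("b", 1, "only"),
    Post("a", 1, "first"),
]
contexts = build_author_contexts(posts)
```

Under these assumptions, each block keeps one author's texts together in chronological order, which is the dependence that standard training on isolated sequences discards.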

