HC · May 5

Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data

arXiv:2605.04029 · 85.8
AI Analysis

For researchers and practitioners developing LLM alignment methods, this work highlights the limitations of single-moment preference datasets and advocates for longitudinal evaluation, though the study is small-scale and preliminary.

The authors argue that current human-LLM alignment methods treat preferences as static, ignoring how they evolve over time with real-world consequences. They propose a longitudinal framework and instantiate it with BITE, a browser system that captures in-situ preferences, triggers follow-up reflections, and collects privacy-preserving behavioral data; a two-week study with 8 participants revealed differences between immediate and later preferences in accuracy and relevance.

Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preference as static, even though many LLM-mediated decisions unfold over time and may be re-evaluated differently after real-world consequences and observed outcomes. Therefore, we argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM interactions, prompts reflection across later decision points, and supports progressive, user-controlled consent for sharing behavioral data. Through a two-week longitudinal deployment study with 8 participants, our approach surfaced differences between immediate and later user preferences in accuracy, relevance and other dimensions of the LLM output. Our findings highlight the limitations of single-moment preference datasets and underscore the importance of longitudinal methods for alignment evaluation in everyday use.
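To make the three signal types in the abstract concrete, the following is a minimal Python sketch of how one interaction's longitudinal record might be stored and compared. It is not BITE's actual data model: the class name PreferenceRecord, its fields, the method names, and the 1-to-5 rating scale are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PreferenceRecord:
    """Hypothetical record for one LLM interaction (not BITE's real schema)."""

    interaction_id: str
    # (1) In-situ ratings captured right after the interaction,
    # keyed by dimension; a 1-5 scale is assumed here.
    immediate: Dict[str, int]
    # (2) Ratings from later, context-triggered reflections, oldest first.
    follow_ups: List[Dict[str, int]] = field(default_factory=list)
    # (3) Behavioral traces the user has explicitly agreed to share.
    consented_traces: List[str] = field(default_factory=list)

    def add_follow_up(self, ratings: Dict[str, int]) -> None:
        """Append one later reflection."""
        self.follow_ups.append(dict(ratings))

    def shift(self) -> Dict[str, int]:
        """Latest follow-up minus immediate rating, per shared dimension.

        Returns an empty dict when no follow-up has been recorded.
        """
        if not self.follow_ups:
            return {}
        latest = self.follow_ups[-1]
        return {
            dim: latest[dim] - score
            for dim, score in self.immediate.items()
            if dim in latest
        }
```

Under these assumptions, a record created with immediate ratings of accuracy 5 and relevance 4, followed by a later reflection of accuracy 3 and relevance 4, yields a shift of -2 for accuracy and 0 for relevance. A single-moment dataset would keep only the first pair of ratings and miss that change.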

