Stayin' Aligned Over Time: Towards Longitudinal Human-LLM Alignment via Contextual Reflection and Privacy-Preserving Behavioral Data
For researchers and practitioners developing LLM alignment methods, this work highlights the limitations of single-moment preference datasets and advocates for longitudinal evaluation, though the study is small-scale and preliminary.
The authors argue that current human-LLM alignment methods treat preferences as static, ignoring how they change as the real-world consequences of an interaction unfold. They propose a longitudinal framework and instantiate it with BITE, a browser-based system that captures in-situ preferences, triggers follow-up reflections, and collects privacy-preserving behavioral data; a two-week study with 8 participants surfaced differences between immediate and later preferences in accuracy, relevance, and other dimensions of the LLM output.
Current human-AI alignment and evaluation methods for large language models (LLMs) often rely on preference signals collected immediately after an interaction. This practice implicitly treats preferences as static, even though many LLM-mediated decisions unfold over time and may be re-evaluated once their real-world consequences and outcomes are observed. We therefore argue for a methodological shift from single-moment preference elicitation to longitudinal, context-situated alignment measurement. We present a methodological framework for collecting temporally grounded alignment signals by combining (1) in-situ preference capture, (2) context-triggered follow-up preference reflection, and (3) privacy-preserving behavioral traces that help interpret preference change. As an instantiation of this methodology, we introduce BITE, a browser-based system that detects consequential LLM interactions, prompts reflection at later decision points, and supports progressive, user-controlled consent for sharing behavioral data. In a two-week longitudinal deployment study with 8 participants, our approach surfaced differences between immediate and later user preferences in accuracy, relevance, and other dimensions of the LLM output. Our findings highlight the limitations of single-moment preference datasets and underscore the importance of longitudinal methods for alignment evaluation in everyday use.
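To make the contrast between immediate and later preferences concrete, the following is a minimal sketch of how such signals might be stored and compared. It is not BITE's implementation: the `PreferenceRecord` type, the `mean_shift` function, the dimension names, and the 1-to-5 rating scale are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class PreferenceRecord:
    """Hypothetical record for one LLM interaction (not BITE's schema)."""
    interaction_id: str
    # Ratings captured in situ, right after the interaction (assumed 1-5 scale).
    immediate: dict[str, int]
    # Ratings from later follow-up reflections, in chronological order.
    follow_ups: list[dict[str, int]] = field(default_factory=list)


def mean_shift(records: list[PreferenceRecord]) -> dict[str, float]:
    """Average (latest follow-up - immediate) rating per dimension.

    Interactions without any follow-up reflection are skipped, as are
    dimensions that were not re-rated in the latest follow-up.
    """
    shifts: dict[str, list[int]] = {}
    for rec in records:
        if not rec.follow_ups:
            continue
        latest = rec.follow_ups[-1]
        for dim, first in rec.immediate.items():
            if dim in latest:
                shifts.setdefault(dim, []).append(latest[dim] - first)
    return {dim: mean(vals) for dim, vals in shifts.items()}


# Made-up example data, purely to show the call shape.
records = [
    PreferenceRecord("a", {"accuracy": 5, "relevance": 4},
                     [{"accuracy": 3, "relevance": 4}]),
    PreferenceRecord("b", {"accuracy": 4}, [{"accuracy": 2}]),
    PreferenceRecord("c", {"accuracy": 1}),  # no follow-up yet, so skipped
]
print(mean_shift(records))
```

A negative value for a dimension would indicate that users rated it lower on reflection than they did immediately after the interaction, which is the kind of difference a single-moment dataset cannot capture.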