CL, AI | Oct 17, 2025

POPI: Personalizing LLMs via Optimized Natural Language Preference Inference

Yizhuo Chen, Xin Liu, Ruijie Wang, Zheng Li, Pei Chen, Changlong Yu, Priyanka Nigam, Meng Jiang, Bing Yin

arXiv:2510.17881v1 | 4.91 citations | h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient and effective LLM personalization, though it is incremental in that it builds on existing alignment techniques.

The paper tackles inconsistent user experiences with LLMs, which arise from diverse user preferences. It proposes POPI, a framework that distills user signals into natural language summaries for personalization, and reports improved personalization accuracy and reduced context overhead across four benchmarks.

Large language models (LLMs) achieve strong benchmark performance, yet user experiences remain inconsistent due to diverse preferences in style, tone, and reasoning mode. Nevertheless, existing alignment techniques such as reinforcement learning from human feedback (RLHF) or Direct Preference Optimization (DPO) largely optimize toward population-level averages and overlook individual variation. Naive personalization strategies like per-user fine-tuning are computationally prohibitive, and in-context approaches that prepend raw user signals often suffer from inefficiency and noise. To address these challenges, we propose POPI, a general framework that introduces a preference inference model to distill heterogeneous user signals into concise natural language summaries. These summaries act as transparent, compact, and transferable personalization representations that condition a shared generation model to produce personalized responses. POPI jointly optimizes both preference inference and personalized generation under a unified objective using reinforcement learning, ensuring summaries maximally encode useful preference information. Extensive experiments across four personalization benchmarks demonstrate that POPI consistently improves personalization accuracy while reducing context overhead by a large margin. Moreover, optimized summaries seamlessly transfer to frozen off-the-shelf LLMs, enabling plug-and-play personalization without weight updates.
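
The two-stage flow the abstract describes (infer a short preference summary, then condition a shared generator on it) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `UserSignal`, `raw_context_prompt`, `summary_prompt`, `personalize`, `toy_infer`, and `toy_generate` are all hypothetical, the prompt templates are assumptions, and both models are replaced by stand-in functions. The joint reinforcement learning objective is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class UserSignal:
    kind: str  # hypothetical label, e.g. "comparison" or "profile"
    text: str


def raw_context_prompt(signals: List[UserSignal], query: str) -> str:
    """Baseline: prepend every raw user signal to the query."""
    history = "\n".join(f"[{s.kind}] {s.text}" for s in signals)
    return f"User signals:\n{history}\n\nQuery: {query}"


def summary_prompt(summary: str, query: str) -> str:
    """Summary-based conditioning: prepend only the inferred summary."""
    return f"User preferences: {summary}\n\nQuery: {query}"


def personalize(
    signals: List[UserSignal],
    query: str,
    infer_preferences: Callable[[List[UserSignal]], str],
    generate: Callable[[str], str],
) -> str:
    """Stage 1 distills signals into text; stage 2 generates from it."""
    summary = infer_preferences(signals)
    return generate(summary_prompt(summary, query))


# Stand-ins for the two models (assumptions, for illustration only).
def toy_infer(signals: List[UserSignal]) -> str:
    return "Prefers concise, formal answers with step-by-step reasoning."


def toy_generate(prompt: str) -> str:
    return f"<response conditioned on {len(prompt)} characters of context>"


signals = [
    UserSignal("comparison", "Chose the shorter of two answers about tax filing and said the longer one rambled."),
    UserSignal("comparison", "Preferred a numbered derivation over a one-line answer to a probability question."),
    UserSignal("profile", "Works in compliance and writes to colleagues in a formal register."),
]
query = "How do I dispute a billing error?"
```

Because the summary is plain text, the same `summary_prompt` output could in principle be passed to any generator callable, which is the sense in which the abstract describes summaries as transferable to frozen off-the-shelf LLMs.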
