LG, AI, CL · Apr 20, 2025

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

arXiv:2504.14439v1 · 21 citations · h-index: 8
Originality: Highly original
AI Analysis

This work addresses the challenge of adapting LLMs to individual user preferences. It is incremental in the sense that it builds on RLHF, offering a more efficient and scalable method.

The paper tackles the problem of personalizing large language models to diverse user preferences by introducing a low-rank reward modeling framework. On multiple datasets, the framework generalizes better to unseen users and predicts preferences more accurately.

Personalizing large language models (LLMs) to accommodate diverse user preferences is essential for enhancing alignment and user satisfaction. Traditional reinforcement learning from human feedback (RLHF) approaches often rely on monolithic value representations, limiting their ability to adapt to individual preferences. We introduce a novel framework that leverages low-rank preference modeling to efficiently learn and generalize user-specific reward functions. By representing reward functions in a low-dimensional subspace and modeling individual preferences as weighted combinations of shared basis functions, our approach avoids rigid user categorization while enabling scalability and few-shot adaptation. We validate our method on multiple preference datasets, demonstrating superior generalization to unseen users and improved accuracy in preference prediction tasks.
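The core idea in the abstract, a user-specific reward expressed as a weighted combination of shared basis functions, can be illustrated with a small sketch. This is not the paper's implementation. It assumes linear basis rewards over fixed response embeddings, a Bradley-Terry preference model, and plain gradient descent for few-shot adaptation, none of which the abstract specifies. All names (`basis_rewards`, `fit_user_weights`, `true_w`, and so on) are hypothetical, and the data is synthetic.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def basis_rewards(features, basis):
    """Evaluate each shared basis reward on one response.

    features: length-d embedding of a response (hypothetical, assumed fixed).
    basis:    list of B length-d vectors, one linear basis reward each (assumption).
    """
    return [dot(b, features) for b in basis]


def user_reward(w, features, basis):
    """User-specific reward: weighted combination of the shared basis rewards."""
    return dot(w, basis_rewards(features, basis))


def fit_user_weights(pairs, basis, steps=300, lr=0.5):
    """Few-shot adaptation: fit one user's weights w with the basis frozen.

    pairs: list of (chosen, rejected) feature vectors from that user.
    Assumes a Bradley-Terry model: P(chosen beats rejected) = sigmoid(w . diff).
    """
    diffs = []
    for chosen, rejected in pairs:
        rc, rr = basis_rewards(chosen, basis), basis_rewards(rejected, basis)
        diffs.append([c - r for c, r in zip(rc, rr)])
    w = [0.0] * len(basis)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for diff in diffs:
            p = 1.0 / (1.0 + math.exp(-dot(w, diff)))
            for k, dk in enumerate(diff):
                grad[k] -= (1.0 - p) * dk / len(diffs)  # gradient of the mean NLL
        w = [wk - lr * gk for wk, gk in zip(w, grad)]
    return w


def preference_accuracy(w, pairs, basis):
    """Fraction of pairs where the chosen response gets the higher reward."""
    hits = sum(user_reward(w, c, basis) > user_reward(w, r, basis) for c, r in pairs)
    return hits / len(pairs)


# Synthetic demo: a shared basis and one "unseen" user with hidden weights.
rng = random.Random(0)
d, B = 8, 3
basis = [[rng.gauss(0, 1) / math.sqrt(d) for _ in range(d)] for _ in range(B)]
true_w = [1.0, -0.5, 0.25]  # hypothetical ground-truth user weights


def sample_pairs(n):
    """Draw n response pairs and order each as (chosen, rejected) under true_w."""
    pairs = []
    for _ in range(n):
        a = [rng.gauss(0, 1) for _ in range(d)]
        b = [rng.gauss(0, 1) for _ in range(d)]
        a_wins = user_reward(true_w, a, basis) > user_reward(true_w, b, basis)
        pairs.append((a, b) if a_wins else (b, a))
    return pairs


w_hat = fit_user_weights(sample_pairs(20), basis)  # 20 labelled pairs only
print(preference_accuracy(w_hat, sample_pairs(500), basis))
```

Because only the B weights are fit for a new user while the basis stays fixed, adaptation needs few labelled pairs. In this sketch that is a three-parameter logistic regression.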
