AI | CL | Jul 4, 2024

Orchestrating LLMs with Different Personalizations

arXiv:2407.04181v1 | 5 citations | h-index: 80
Originality: Incremental advance
AI Analysis

This paper addresses the problem of personalizing LLMs for users based on multi-dimensional preferences, offering a scalable and efficient incremental improvement over fine-tuning methods.

The paper tackles aligning large language models with individual human preferences without retraining, by merging outputs of specialized expert models using a lightweight Preference Control Model, achieving results that match or surpass existing preference merging techniques.

This paper presents a novel approach to aligning large language models (LLMs) with individual human preferences, sometimes referred to as Reinforcement Learning from Personalized Human Feedback (RLPHF). Given stated preferences along multiple dimensions, such as helpfulness, conciseness, or humor, the goal is to create an LLM without re-training that best adheres to this specification. Starting from specialized expert LLMs, each trained for one such particular preference dimension, we propose a black-box method that merges their outputs on a per-token level. We train a lightweight Preference Control Model (PCM) that dynamically translates the preference description and current context into next-token prediction weights. By combining the expert models' outputs at the token level, our approach dynamically generates text that optimizes the given preference. Empirical tests show that our method matches or surpasses existing preference merging techniques, providing a scalable, efficient alternative to fine-tuning LLMs for individual personalization.
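
The per-token merging described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each expert exposes a next-token probability distribution and that the merge is a convex combination of those distributions, which the abstract does not specify. The names `softmax`, `merge_next_token`, `expert_probs`, and `pcm_scores` are hypothetical, and the fixed scores stand in for whatever a trained PCM would output for a given preference description and context.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def merge_next_token(expert_probs: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted sum of the experts' next-token distributions.

    Assumption: merging is a convex combination of probabilities;
    the real method may combine outputs differently.
    """
    if len(expert_probs) != len(weights):
        raise ValueError("need exactly one weight per expert")
    merged = [0.0] * len(expert_probs[0])
    for w, probs in zip(weights, expert_probs):
        for token_id, p in enumerate(probs):
            merged[token_id] += w * p
    return merged


# Toy vocabulary of three tokens and two hypothetical experts.
expert_probs = [
    [0.7, 0.2, 0.1],  # stand-in for a "concise" expert
    [0.1, 0.3, 0.6],  # stand-in for a "humorous" expert
]

# Stand-in for the PCM output at this decoding step (hypothetical values).
pcm_scores = [2.0, 0.0]
weights = softmax(pcm_scores)

merged = merge_next_token(expert_probs, weights)
# Greedy choice of the next token from the merged distribution.
next_token = max(range(len(merged)), key=merged.__getitem__)
```

In a full decoding loop the weights would be recomputed at every step, since the abstract describes the PCM as conditioning on the current context as well as the preference description.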
