CL · AI · May 27, 2025

Multi-objective Large Language Model Alignment with Hierarchical Experts

arXiv:2505.20925v1 · 11 citations · h-index: 19 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of balancing diverse human preferences in LLM alignment, targeting users who need efficient multi-objective adaptation. It appears incremental because it builds on existing alignment methods.

The paper tackles the challenge of aligning large language models to multiple conflicting objectives by introducing HoE, a lightweight, parameter-efficient method that eliminates the need for model training and adapts across the Pareto frontier. It reports superior performance over 15 baselines on 14 objectives and 200 preferences across 6 benchmarks.

Aligning large language models (LLMs) to simultaneously satisfy multiple objectives remains a significant challenge, especially given the diverse and often conflicting nature of human preferences. Existing alignment methods struggle to balance trade-offs effectively, often requiring costly retraining or yielding suboptimal results across the Pareto frontier of preferences. In this paper, we introduce HoE (Hierarchical Mixture-of-Experts), a lightweight, parameter-efficient, and plug-and-play approach that eliminates the need for model training, while enabling LLMs to adapt across the entire Pareto frontier and accommodate diverse user preferences. In particular, HoE consists of three hierarchical components: LoRA Experts, Router Experts, and Preference Routing, reaching optimal Pareto frontiers and achieving a trade-off between parameter size, training cost, and performance. We evaluate HoE across various tasks on 14 objectives and 200 different preferences among 6 benchmarks, demonstrating superior performance over 15 recent baselines. Code is available in the supplementary materials.
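
The abstract does not say how the LoRA experts are combined for a given user preference, so the following is only a generic illustration, not HoE itself. It sketches one common way to serve a preference without retraining: linearly merging per-objective LoRA updates with weights taken from a normalized preference vector. The objective names ("helpful", "harmless"), the helper functions, and the toy 2x2 matrices are all hypothetical.

```python
from typing import Dict, List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(b: Matrix, a: Matrix) -> Matrix:
    """Low-rank weight update delta_W = B @ A, as in LoRA."""
    return matmul(b, a)


def normalize_preference(pref: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative objective weights so they sum to 1."""
    if any(w < 0 for w in pref.values()):
        raise ValueError("preference weights must be non-negative")
    total = sum(pref.values())
    if total <= 0:
        raise ValueError("at least one preference weight must be positive")
    return {name: w / total for name, w in pref.items()}


def merge_experts(base: Matrix, deltas: Dict[str, Matrix], pref: Dict[str, float]) -> Matrix:
    """Return base + sum_i w_i * delta_i for the normalized preference w.

    Assumption: simple linear merging, chosen for illustration only.
    """
    weights = normalize_preference(pref)
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for name, w in weights.items():
        for i, row in enumerate(deltas[name]):
            for j, value in enumerate(row):
                merged[i][j] += w * value
    return merged


# Toy 2x2 base weight and two rank-1 experts (hypothetical objectives).
base = [[1.0, 0.0], [0.0, 1.0]]
deltas = {
    "helpful": lora_delta([[1.0], [0.0]], [[2.0, 0.0]]),
    "harmless": lora_delta([[0.0], [1.0]], [[0.0, 4.0]]),
}

# A user who weights helpfulness three times as much as harmlessness.
merged = merge_experts(base, deltas, {"helpful": 3.0, "harmless": 1.0})
print(merged)
```

Changing only the preference dictionary moves the merged weights between the two experts, which is the sense in which a single set of frozen experts can cover many trade-off points. Whether such linear interpolation actually reaches the Pareto frontier depends on the model and objectives.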

