AI · CL · Jun 1

TriAlign: Towards Universal Truth Consistency in Personalized LLM Alignment

arXiv:2606.01755 · 83.2
Predicted impact top 29% in AI · last 90 days · Originality Incremental advance
AI Analysis

This work addresses fairness and truth consistency in personalized LLMs for diverse social groups, a critical but underexplored problem in AI alignment.

TriAlign introduces a multi-agent reinforcement learning framework to ensure that personalized LLMs maintain consistent universal truth accuracy across social groups, reducing disparities while preserving personalization. Experiments show it outperforms baselines in balancing truth accuracy, cross-group consistency, and personalization.

Personalized large language models adapt responses to users' preferences and social attributes, but can introduce substantial universal truth inconsistencies across social groups, where some groups systematically receive less accurate responses on objective tasks. Existing alignment methods either ignore personalization or focus mainly on subjective preference alignment, largely overlooking fairness and consistency in universal truths. To address this gap, we study Truth-Invariant Alignment (TIA), an alignment problem for personalized LLMs that aims to keep universal truths consistent across social groups while preserving personalization. We propose TriAlign, the first offline multi-agent reinforcement learning (MARL) framework for TIA, in which each social group is modeled as an interacting agent. TriAlign jointly optimizes universal truth accuracy, cross-group truth consistency, and personalization through a fairness-aware objective and an explicit inconsistency penalty. Experiments across diverse benchmarks demonstrate that TriAlign achieves a better balance among these three objectives than strong baselines, reducing universal truth disparities across social groups while improving both objective task performance and personalization quality.
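The abstract names three quantities (universal truth accuracy, cross-group truth consistency, personalization) and an explicit inconsistency penalty, but does not give their formulas. The sketch below is only an illustration of how such a trade-off could be scored, not the paper's objective. Every name and choice in it is an assumption made for the example: the functions `group_accuracy`, `inconsistency_penalty`, and `combined_score`, the weights `lam` and `beta`, the use of worst-group accuracy as the fairness-aware term, and the use of the maximum accuracy gap as the penalty.

```python
from statistics import mean


def group_accuracy(correct_by_group):
    """Fraction of objective questions answered correctly, per social group."""
    return {
        group: mean(1.0 if ok else 0.0 for ok in flags)
        for group, flags in correct_by_group.items()
    }


def inconsistency_penalty(acc_by_group):
    """Assumed penalty: largest accuracy gap between any two groups.

    A value of 0.0 means every group receives equally accurate answers.
    """
    values = list(acc_by_group.values())
    return max(values) - min(values)


def combined_score(correct_by_group, personalization_by_group, lam=1.0, beta=0.5):
    """Hypothetical scalar trade-off between the three objectives.

    worst-group accuracy   -> assumed stand-in for a fairness-aware accuracy term
    mean personalization   -> assumed personalization quality in [0, 1]
    lam * accuracy gap     -> assumed explicit inconsistency penalty
    """
    acc = group_accuracy(correct_by_group)
    worst_acc = min(acc.values())
    personalization = mean(personalization_by_group.values())
    return worst_acc + beta * personalization - lam * inconsistency_penalty(acc)


if __name__ == "__main__":
    # Toy data: group "A" gets 3 of 4 objective answers right, group "B" only 1 of 4.
    correct = {"A": [True, True, True, False], "B": [True, False, False, False]}
    personalization = {"A": 0.8, "B": 0.6}
    print(group_accuracy(correct))
    print(inconsistency_penalty(group_accuracy(correct)))
    print(combined_score(correct, personalization))
```

Under these assumptions, a model that answers objective questions more accurately for one group than another is penalized twice: once through the lower worst-group accuracy and once through the gap term, while the personalization term is scored independently of correctness.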