LG · May 11

Controllability in preference-conditioned multi-objective reinforcement learning

Pau de las Heras Molins, Beyazit Yalcinkaya, Lasse Peters, David Fridovich-Keil, Georgios Bakirtzis

arXiv:2605.10585 · 6.5

Predicted impact top 44% in LG · last 90 days · Originality: Synthesis-oriented

AI Analysis

For researchers in multi-objective reinforcement learning, the paper highlights a critical gap in evaluation protocols that undermines the reliability of preference-based agent control.

The paper identifies that standard multi-objective reinforcement learning (MORL) metrics fail to measure whether preference-conditioned agents actually respond to changes in preference (controllability), and proposes a complementary metric to assess this property.

Multi-objective reinforcement learning (MORL) allows a user to express preferences over outcomes in terms of the relative importance of the objectives, but standard metrics cannot capture whether changes in preference reliably change the agent's behavior in the intended way, a property termed controllability. As a result, preference-conditioned agents can score well on standard MORL metrics while being insensitive to the preference input. If the ability to control agents cannot be reliably assessed, the symbolic interface that MORL provides between user intent and agent behavior is broken. Mainstream MORL metrics alone fail to measure the controllability of preference-conditioned agents, motivating a complementary metric designed specifically for that purpose. We hope the results spur discussion in the community on existing evaluation protocols, so that advances in preference adaptation in MORL can be consolidated and extended to larger and more complex problems.
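
The gap described above can be illustrated with a toy example. The sketch below is not the paper's metric or experimental setup: the two synthetic "agents", the quarter-circle Pareto front, and the use of Spearman rank correlation as a stand-in for controllability are all assumptions made for illustration. It shows two agents that produce the same set of return vectors, and therefore the same hypervolume, even though only one of them responds to the preference weight.

```python
import numpy as np

def hypervolume_2d(points):
    """Hypervolume of 2-D maximization returns w.r.t. the reference point (0, 0).

    Assumes all returns are non-negative.
    """
    pts = points[np.argsort(points[:, 0])]            # sort by objective 1, ascending
    # Best objective-2 value among points whose objective 1 is at least as large.
    best_r2 = np.maximum.accumulate(pts[::-1, 1])[::-1]
    widths = np.diff(pts[:, 0], prepend=0.0)          # slab widths along objective 1
    return float(np.sum(widths * best_r2))

def spearman(x, y):
    """Spearman rank correlation (assumes no ties)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])

# Hypothetical setup: w is the preference weight on objective 1, and the
# achievable returns lie on a quarter circle (an assumed Pareto front).
n = 200
w = np.linspace(0.0, 1.0, n)
theta = (1.0 - w) * np.pi / 2
front = np.stack([np.cos(theta), np.sin(theta)], axis=1)

# Responsive agent: a higher weight on objective 1 yields a higher objective-1 return.
responsive = front
# Insensitive agent: lands on an arbitrary front point regardless of w.
insensitive = front[np.random.default_rng(0).permutation(n)]

hv_responsive = hypervolume_2d(responsive)
hv_insensitive = hypervolume_2d(insensitive)
rho_responsive = spearman(w, responsive[:, 0])
rho_insensitive = spearman(w, insensitive[:, 0])

print(f"hypervolume: responsive={hv_responsive:.4f}, insensitive={hv_insensitive:.4f}")
print(f"rank corr(w, return_1): responsive={rho_responsive:.3f}, insensitive={rho_insensitive:.3f}")
```

Because hypervolume depends only on the set of returns, it cannot distinguish the two agents here; a measure that pairs each preference with the return it produced can. The rank correlation is only one possible proxy, chosen for brevity.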
