XChoice: Explainable Evaluation of AI-Human Alignment in LLM-based Constrained Choice Decision Making
This work addresses the need for explainable evaluation of AI alignment in decision-making tasks and is aimed at researchers and practitioners in AI ethics and fairness. Its contribution is incremental: it builds on existing alignment concepts and adds a new methodological focus.
The authors tackled the problem of evaluating AI-human alignment in constrained decision making by introducing XChoice, a framework that recovers interpretable parameters from human and LLM decisions to assess alignment beyond surface metrics. They demonstrated it on time allocation data, finding that alignment varies across models and activities and that misalignment is concentrated in specific demographic groups. They checked the framework's robustness with an invariance analysis and evaluated a RAG intervention as a targeted mitigation.
We present XChoice, an explainable framework for evaluating AI-human alignment in constrained decision making. Moving beyond outcome-agreement metrics such as accuracy and F1 score, XChoice fits a mechanism-based decision model to human data and LLM-generated decisions, recovering interpretable parameters that capture the relative importance of decision factors, constraint sensitivity, and implied trade-offs. Alignment is assessed by comparing these parameter vectors across models, options, and subgroups. We demonstrate XChoice on Americans' daily time allocation using the American Time Use Survey (ATUS) as human ground truth, revealing heterogeneous alignment across models and activities and salient misalignment concentrated in Black and married groups. We further validate the robustness of XChoice via an invariance analysis and evaluate targeted mitigation with a retrieval-augmented generation (RAG) intervention. Overall, XChoice provides mechanism-based metrics that diagnose misalignment and support informed improvements beyond surface outcome matching.
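To make the idea of comparing recovered parameters concrete, here is a minimal sketch. It is not the authors' decision model. It assumes a simple log-utility model in which each agent maximizes sum_i w_i * log(x_i) subject to a fixed time budget, so the importance weights w coincide with budget shares. The function names `fit_weights` and `alignment_gap`, the toy allocations, and the choice of total variation distance are all hypothetical and introduced only for illustration.

```python
import numpy as np

def fit_weights(allocations):
    """Estimate importance weights under an ASSUMED log-utility model.

    Assumption (not from the paper): each agent maximizes
    sum_i w_i * log(x_i) subject to sum_i x_i = T with sum_i w_i = 1.
    The optimum is x_i = w_i * T, so each agent's weights equal its
    budget shares; we average shares across agents as a simple estimate.
    """
    x = np.asarray(allocations, dtype=float)
    shares = x / x.sum(axis=1, keepdims=True)  # per-agent budget shares
    w = shares.mean(axis=0)                    # average over agents
    return w / w.sum()                         # renormalize to sum to 1

def alignment_gap(w_human, w_model):
    """Total variation distance between weight vectors (0 means identical)."""
    return 0.5 * np.abs(np.asarray(w_human) - np.asarray(w_model)).sum()

# Hypothetical toy data: hours per day on (work, leisure, sleep).
human = np.array([[8.0, 6.0, 10.0],
                  [9.0, 5.0, 10.0]])
llm = np.array([[10.0, 4.0, 10.0],
                [10.0, 4.0, 10.0]])

w_h = fit_weights(human)
w_m = fit_weights(llm)
gap = alignment_gap(w_h, w_m)

# Per-option differences show where the misalignment sits,
# which an aggregate accuracy or F1 score would not reveal.
per_option = w_m - w_h
print("human weights:", w_h)
print("LLM weights:  ", w_m)
print("gap:", gap, "per-option:", per_option)
```

In this toy case the simulated LLM overweights work and underweights leisure by the same amount while matching humans on sleep, so the per-option differences locate the misalignment rather than only reporting that it exists. Running the same comparison on subsets of the data would give a subgroup-level view in the same spirit.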