HC | May 7

UX in the Age of AI: Rethinking Evaluation Metrics Through a Statistical Lens

arXiv:2605.05600 | 41.3 | h-index: 29

AI Analysis

For UX practitioners and researchers evaluating AI products, this paper addresses the mismatch between classical UX metrics and stochastic AI outputs, though the proposed framework is validated only conceptually, not empirically.

The paper argues that classical UX metrics (SUS, NPS, task completion rate) are insufficient for AI-mediated systems because their outputs are stochastic. It introduces ADUX-Stat, a framework with three novel constructs (IEI, TDC, BUCS) that models usability as a probability distribution rather than a single score, validated conceptually against five AI product categories.

The rapid proliferation of artificial intelligence (AI) in consumer-facing digital products has disrupted the assumptions underlying classical user experience (UX) evaluation frameworks. Legacy metrics such as the System Usability Scale (SUS), Net Promoter Score (NPS), and task completion rate were engineered for deterministic, rule-based interfaces where identical inputs yield identical outputs. In AI-mediated systems -- spanning conversational agents, generative interfaces, and recommendation engines -- outputs are stochastic, context-sensitive, and temporally variable, rendering these metrics structurally insufficient. This paper introduces the Adaptive Dynamic UX Statistical Framework (ADUX-Stat), a novel evaluation model that reconceptualises usability as a probabilistic signal distribution rather than a static scalar score. ADUX-Stat integrates three original constructs: (1) Interaction Entropy Index (IEI), quantifying the unpredictability of AI responses from a user perception standpoint; (2) Temporal Drift Coefficient (TDC), measuring longitudinal degradation or improvement of perceived usability over interaction sessions; and (3) Bayesian Usability Confidence Score (BUCS), producing credible interval estimates of usability quality under uncertainty. The framework is validated conceptually against five established AI product categories. ADUX-Stat addresses a critical gap at the intersection of HCI research, statistical modelling, and AI product evaluation, offering a reproducible, field-deployable methodology for UX practitioners and researchers alike.
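The abstract names the three constructs but does not give their formulas. The sketch below is one possible reading, not the paper's definitions: it assumes IEI can be approximated by Shannon entropy over user-assigned response categories, TDC by a least-squares slope of per-session usability scores, and BUCS by a Beta-Binomial credible interval on task success. All function names, the Beta(1, 1) prior, and the Monte Carlo interval are assumptions made for illustration.

```python
import math
import random
from collections import Counter
from statistics import mean


def interaction_entropy(labels):
    """Shannon entropy (bits) of user-perceived response categories.

    Hypothetical stand-in for IEI: higher values mean users see the
    system's responses as less predictable.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def temporal_drift(session_scores):
    """Least-squares slope of usability score against session index.

    Hypothetical stand-in for TDC: positive means perceived usability
    improves over sessions, negative means it degrades.
    """
    n = len(session_scores)
    if n < 2:
        raise ValueError("need at least two sessions to estimate drift")
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(session_scores)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, session_scores))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def usability_credible_interval(successes, trials, level=0.95,
                                draws=20000, seed=0):
    """Equal-tailed credible interval for a task success rate.

    Hypothetical stand-in for BUCS: Beta(1, 1) prior updated with
    binomial outcomes, interval estimated from posterior samples.
    """
    rng = random.Random(seed)
    a = 1 + successes
    b = 1 + trials - successes
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    lower = samples[int((1 - level) / 2 * draws)]
    upper = samples[int((1 + level) / 2 * draws) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up example inputs, not data from the paper.
    print(interaction_entropy(["helpful", "helpful", "off-topic", "wrong"]))
    print(temporal_drift([60, 65, 70]))
    print(usability_credible_interval(successes=18, trials=20))
```

Reporting an interval instead of a single success rate is the point of the third function: with only 20 trials the interval stays wide, which makes the uncertainty visible in a way a scalar score such as a mean SUS value does not.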
