AI · LG · MA · AO · Mar 25

Trust as Monitoring: Evolutionary Dynamics of User Trust and AI Developer Behaviour

arXiv:2603.24742 · 19.71 citations · h-index: 11
Predicted impact: top 70% in AI · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses AI governance for policymakers and developers, providing a formal model to support transparency and sanctions, but it is incremental as it builds on existing evolutionary frameworks.

The paper tackles the problem of AI safety by modeling trust as reduced monitoring in repeated interactions between users and AI developers, using evolutionary game theory to show that safe, widely adopted systems only emerge when penalties for unsafe behavior exceed safety costs and users can afford occasional monitoring.

AI safety is an increasingly urgent concern as the capabilities and adoption of AI systems grow. Existing evolutionary models of AI governance have primarily examined incentives for safe development and effective regulation, typically representing users' trust as a one-shot adoption choice rather than as a dynamic, evolving process shaped by repeated interactions. We instead model trust as reduced monitoring in a repeated, asymmetric interaction between users and AI developers, where checking AI behaviour is costly. Using evolutionary game theory, we study how user trust strategies and developer choices between safe (compliant) and unsafe (non-compliant) AI co-evolve under different levels of monitoring cost and institutional regimes. We complement the infinite-population replicator analysis with stochastic finite-population dynamics and reinforcement learning (Q-learning) simulations. Across these approaches, we find three robust long-run regimes: no adoption with unsafe development, unsafe but widely adopted systems, and safe systems that are widely adopted. Only the last is desirable, and it arises when penalties for unsafe behaviour exceed the extra cost of safety and users can still afford to monitor at least occasionally. Our results formally support governance proposals that emphasise transparency, low-cost monitoring, and meaningful sanctions, and they show that neither regulation alone nor blind user trust is sufficient to prevent evolutionary drift towards unsafe or low-adoption outcomes.
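The co-evolution of monitoring and safe development can be illustrated with a minimal two-population replicator sketch. This is not the paper's model: the abstract gives no payoff matrix, so the four parameters below (monitoring cost `c`, harm from undetected unsafe AI `h`, extra cost of safety `s`, penalty when caught `p`) and both function names are assumptions made for illustration. The toy also omits the adoption decision, so it cannot reproduce the three regimes reported above. It only shows how the stated condition (penalty above the extra safety cost, monitoring still affordable) enters the dynamics.

```python
# Toy two-population replicator dynamics (hypothetical payoffs, not the paper's).
# x: fraction of users who monitor the AI (the rest trust without checking)
# y: fraction of developers who build safe (compliant) AI

def replicator_step(x, y, params, dt=0.01):
    """Advance (x, y) by one explicit Euler step of the replicator equations."""
    c, h, s, p = params["c"], params["h"], params["s"], params["p"]
    # Assumed payoff advantage of monitoring over blind trust:
    # avoids harm h when the developer is unsafe, but always costs c.
    gain_monitor = (1.0 - y) * h - c
    # Assumed payoff advantage of safe over unsafe development:
    # avoids penalty p when monitored, but always costs s.
    gain_safe = x * p - s
    dx = x * (1.0 - x) * gain_monitor
    dy = y * (1.0 - y) * gain_safe
    return x + dt * dx, y + dt * dy


def simulate(x0, y0, params, steps=20000, dt=0.01):
    """Iterate the Euler step, clamping both fractions to [0, 1]."""
    x, y = x0, y0
    for _ in range(steps):
        x, y = replicator_step(x, y, params, dt)
        x = min(max(x, 0.0), 1.0)
        y = min(max(y, 0.0), 1.0)
    return x, y
```

In this sketch, safe development can only gain ground when `x * p > s`, which is impossible for any monitoring level if `p < s`: calling `simulate(0.5, 0.5, {"c": 0.1, "h": 1.0, "s": 0.5, "p": 0.3})` drives the safe fraction towards zero. When `p > s` and `c < h`, a mixed rest point exists at `x = s / p`, `y = 1 - c / h`, which is the toy analogue of the requirement that sanctions outweigh safety costs while users can still afford occasional monitoring.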

