Towards Machines that Trust: AI Agents Learn to Trust in the Trust Game
This work addresses the challenge of modeling trust in AI agents for applications in social interaction. Its contribution is incremental: it builds on existing trust game frameworks and does not introduce new methods.
The paper tackles the problem of understanding how trust emerges in AI agents by analyzing the trust game theoretically and simulating it with reinforcement learning, providing a mathematical basis for the emergence of trust.
Widely considered a cornerstone of human morality, trust shapes many aspects of human social interactions. In this work, we present a theoretical analysis of the $\textit{trust game}$, the canonical task for studying trust in behavioral and brain sciences, along with simulation results supporting our analysis. Specifically, leveraging reinforcement learning (RL) to train our AI agents, we systematically investigate how trust is learned under various parameterizations of this task. Our theoretical analysis, corroborated by the simulation results presented, provides a mathematical basis for the emergence of trust in the trust game.
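To make the setting concrete, the following is a minimal sketch of one common formulation of the trust game together with a simple learner for the trustor. It is an illustration under stated assumptions, not the paper's implementation: the endowment of 10, the multiplier of 3, the fixed trustee who returns a constant fraction, and the epsilon-greedy bandit learner with optimistic initial values are all choices made here for illustration, and the function names are hypothetical.

```python
import random


def trust_game_payoffs(endowment, sent, multiplier, returned_fraction):
    """Return (trustor_payoff, trustee_payoff) for one round.

    Assumed formulation: the trustor sends `sent` out of `endowment`,
    the transfer is multiplied by `multiplier`, and the trustee returns
    `returned_fraction` of what it receives.
    """
    if not 0 <= sent <= endowment:
        raise ValueError("sent must lie in [0, endowment]")
    if not 0.0 <= returned_fraction <= 1.0:
        raise ValueError("returned_fraction must lie in [0, 1]")
    received = multiplier * sent             # transfer grows in transit
    returned = returned_fraction * received  # trustee sends part of it back
    return endowment - sent + returned, received - returned


def learn_trustor_policy(returned_fraction, endowment=10, multiplier=3,
                         episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit learner for the trustor (illustrative only).

    The trustee is assumed to be fixed and to always return the same
    fraction. Returns the amount the trustor prefers to send after training.
    """
    rng = random.Random(seed)
    actions = list(range(endowment + 1))
    # Optimistic start: above any reachable payoff, so that greedy steps
    # try every action at least once before settling.
    optimistic = max(1, multiplier) * endowment + 1.0
    q = [optimistic] * len(actions)
    counts = [0] * len(actions)
    for _ in range(episodes):
        if rng.random() < epsilon:
            a = rng.choice(actions)                   # explore
        else:
            a = max(actions, key=lambda i: q[i])      # exploit
        reward = trust_game_payoffs(endowment, a, multiplier,
                                    returned_fraction)[0]
        counts[a] += 1
        q[a] += (reward - q[a]) / counts[a]           # sample-average update
    return max(actions, key=lambda i: q[i])
```

In this sketch the trustor's payoff is `endowment + sent * (multiplier * returned_fraction - 1)`, so sending money pays off only when `multiplier * returned_fraction` exceeds 1. Against a sufficiently reciprocal trustee the learner should therefore settle on sending the full endowment, and against a stingy one on sending nothing. A setting in which the trustee also learns, as the abstract suggests the paper studies, would require a second learner and is not covered here.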