AI, MA · Apr 2, 2020

Improving Confidence in the Estimation of Values and Norms

arXiv:2004.01056v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of aligning autonomous agents with human values and norms, which is crucial for their safe integration into daily life. The work is incremental, however, as it builds on existing simulation-based approaches.

The paper studies how well an autonomous agent can estimate human values and norms by profiling simulated human agents in the ultimatum game. It presents two methods that increase confidence in these estimates; the one based on counterfactual analysis is the more efficient when the number of interactions must be minimized.

Autonomous agents (AA) will increasingly be interacting with us in our daily lives. While we want the benefits attached to AAs, it is essential that their behavior is aligned with our values and norms. Hence, an AA will need to estimate the values and norms of the humans it interacts with, which is not a straightforward task when solely observing an agent's behavior. This paper analyses to what extent an AA is able to estimate the values and norms of a simulated human agent (SHA) based on its actions in the ultimatum game. We present two methods to reduce ambiguity in profiling the SHAs: one based on search space exploration and another based on counterfactual analysis. We found that both methods are able to increase the confidence in estimating human values and norms, but differ in their applicability, the latter being more efficient when the number of interactions with the agent is to be minimized. These insights are useful to improve the alignment of AAs with human values and norms.
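
To make the ambiguity problem concrete, the following is a minimal Python sketch, not the paper's model. It assumes a hypothetical simulated responder that accepts any offer at or above a fixed threshold, and all names (`PIE`, `responder_accepts`, `consistent_thresholds`, `probe`) are invented for illustration. It shows how passively observed decisions can leave several candidate profiles standing, and how deliberately chosen "what if" offers, loosely in the spirit of counterfactual analysis, can narrow them to one in few interactions.

```python
# Toy illustration only: the threshold-based responder is an assumption,
# not the agent model used in the paper.
PIE = 10  # total amount to split (hypothetical)


def responder_accepts(threshold, offer):
    """Hypothetical simulated human: accepts any offer >= its threshold."""
    return offer >= threshold


def consistent_thresholds(candidates, observations):
    """Keep the candidate thresholds that explain every (offer, accepted) pair."""
    return [
        t for t in candidates
        if all(responder_accepts(t, offer) == accepted
               for offer, accepted in observations)
    ]


def probe(true_threshold, candidates):
    """Pose offers that split the remaining candidates until one is left.

    `candidates` must be sorted in ascending order.
    """
    observations = []
    while len(candidates) > 1:
        # The lower-middle candidate separates those at or below it
        # (who would accept) from those above it (who would reject).
        offer = candidates[(len(candidates) - 1) // 2]
        observations.append((offer, responder_accepts(true_threshold, offer)))
        candidates = consistent_thresholds(candidates, observations)
    return candidates, observations


if __name__ == "__main__":
    all_profiles = list(range(PIE + 1))
    # Two passively observed decisions still leave several profiles possible.
    print(consistent_thresholds(all_profiles, [(5, True), (2, False)]))
    # Targeted probing isolates a single profile.
    print(probe(4, all_profiles))
```

In this toy setting, each probing offer roughly halves the set of remaining candidates, which is one way to see why targeted queries can matter when the number of interactions is limited.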

