Personal Universes: A Solution to the Multi-Agent Value Alignment Problem
This work addresses a challenge facing AI safety researchers, the merging of incompatible preferences, but its contribution is incremental: it relies on solutions to earlier problems, such as value extraction, that are assumed rather than provided.
The paper tackles the multi-agent value alignment problem in AI safety by proposing a solution that optimally aligns with individual user preferences, assuming the value extraction problem is solved.
AI safety researchers attempting to align the values of highly capable intelligent systems with those of humanity face a number of challenges, including personal value extraction, multi-agent value merger, and finally in-silico encoding. State-of-the-art research in value alignment shows difficulties at every stage of this process, but the merger of incompatible preferences is a particularly difficult challenge to overcome. In this paper, we assume that the value extraction problem will be solved and propose a possible way to implement an AI solution that optimally aligns with the individual preferences of each user. We conclude by analyzing the benefits and limitations of the proposed approach.