AI · IR · Aug 25, 2022

Modelling the Recommender Alignment Problem

arXiv:2208.12299v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the alignment of recommender systems with user and societal goals, a pressing issue for online platforms. Its contribution is incremental: it offers a preliminary modelling approach rather than a full solution.

The paper tackles the recommender alignment problem, where recommender systems optimize for easily measurable metrics like engagement, leading to negative societal side-effects such as polarization and addiction. The authors propose a modeling framework and conduct a toy experiment showing that engagement-maximizing recommenders generally lead to worse outcomes than aligned ones, and competition between recommenders can improve societal welfare.

Recommender systems (RS) mediate human experience online. Most RS act to optimize metrics that are easy to measure but imperfectly aligned with the best interests of users, such as ad clicks and user engagement. This has resulted in a host of hard-to-measure side-effects: political polarization, addiction, fake news. RS design faces a recommender alignment problem: that of aligning recommendations with the goals of users, system designers, and society as a whole. But how do we test and compare potential solutions for aligning RS? Their massive scale makes them costly and risky to test in deployment. We synthesized a simple abstract modelling framework to guide future work. To illustrate it, we construct a toy experiment where we ask: "How can we evaluate the consequences of using user retention as a reward function?" To answer the question, we learn recommender policies that optimize reward functions by controlling graph dynamics on a toy environment. Based on the effects that trained recommenders have on their environment, we conclude that engagement maximizers generally, but not always, lead to worse outcomes than aligned recommenders. After learning, we examine competition between RS as a potential solution to RS alignment. We find that it generally makes our toy society better off than it would be with no recommendation or with engagement maximizers. In this work, we aimed for a broad scope, touching superficially on many different points to shed light on how an end-to-end study of reward functions for recommender systems might be done. Recommender alignment is a pressing and important problem. Attempted solutions are sure to have far-reaching impacts. Here, we take a first step in developing methods for evaluating and comparing solutions with respect to their impacts on society.
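
The kind of comparison the abstract describes can be illustrated with a much smaller sketch. The following is not the authors' environment or method: it replaces their graph dynamics with a hypothetical three-item catalogue, and the item names, the `engagement` and `welfare` numbers, and the helpers `learn_policy` and `evaluate` are all assumptions made for illustration. It trains one recommender on an easy-to-measure proxy reward and one on the welfare signal itself, then scores both by the welfare they produce.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    engagement: float  # easy-to-measure proxy (e.g. clicks); hypothetical value
    welfare: float     # hard-to-measure effect on the user; hypothetical value


# Hypothetical catalogue, chosen so that proxy and welfare disagree.
ITEMS = [
    Item("outrage_clip", engagement=0.9, welfare=-0.5),
    Item("tutorial", engagement=0.4, welfare=0.6),
    Item("news_digest", engagement=0.6, welfare=0.2),
]


def learn_policy(reward_fn, items, steps=30):
    """Try every item once, then act greedily on the running mean reward.

    Returns the item the learned policy ends up recommending.
    """
    totals = [0.0] * len(items)
    counts = [0] * len(items)

    def mean_reward(k):
        return totals[k] / counts[k]

    for t in range(steps):
        if t < len(items):
            i = t  # initial sweep so every item has at least one observation
        else:
            i = max(range(len(items)), key=mean_reward)
        totals[i] += reward_fn(items[i])
        counts[i] += 1

    return items[max(range(len(items)), key=mean_reward)]


def evaluate(item, horizon=10):
    """Total welfare if the recommender shows `item` for `horizon` steps."""
    return horizon * item.welfare


# Same learner, two different reward functions.
engagement_choice = learn_policy(lambda it: it.engagement, ITEMS)
aligned_choice = learn_policy(lambda it: it.welfare, ITEMS)

print("engagement maximizer:", engagement_choice.name, evaluate(engagement_choice))
print("aligned recommender: ", aligned_choice.name, evaluate(aligned_choice))
```

With these made-up numbers the engagement maximizer settles on the item with the lowest welfare, while the aligned recommender does not. If the catalogue were changed so that the most engaging item were also the most beneficial, the two policies would coincide, which is one simple way to read the abstract's "but not always".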

