LG GT MLJun 6, 2022

Emergent specialization from participation dynamics and multi-learner retraining

Sarah Dean, Mihaela Curmei, Lillian J. Ratliff, Jamie Morgenstern, Maryam Fazel

arXiv:2206.02667v38.76 citationsh-index: 31Has Code

Originality Incremental advance

AI Analysis

This addresses the problem of emergent specialization in multi-learner systems for online service providers and users, offering a theoretical analysis with potential practical implications.

The paper analyzes risk-reducing dynamics in online services where users allocate participation to minimize individual risk and services update models to reduce risk on their current users, showing that asymptotically stable equilibria lead to segmented sub-populations and that the utilitarian social optimum is stable, with repeated myopic updates improving outcomes compared to prior work.

Numerous online services are data-driven: the behavior of users affects the system's parameters, and the system's parameters affect the users' experience of the service, which in turn affects the way users may interact with the system. For example, people may choose to use a service only for tasks that already works well, or they may choose to switch to a different service. These adaptations influence the ability of a system to learn about a population of users and tasks in order to improve its performance broadly. In this work, we analyze a class of such dynamics -- where users allocate their participation amongst services to reduce the individual risk they experience, and services update their model parameters to reduce the service's risk on their current user population. We refer to these dynamics as \emph{risk-reducing}, which cover a broad class of common model updates including gradient descent and multiplicative weights. For this general class of dynamics, we show that asymptotically stable equilibria are always segmented, with sub-populations allocated to a single learner. Under mild assumptions, the utilitarian social optimum is a stable equilibrium. In contrast to previous work, which shows that repeated risk minimization can result in (Hashimoto et al., 2018; Miller et al., 2021), we find that repeated myopic updates with multiple learners lead to better outcomes. We illustrate the phenomena via a simulated example initialized from real data.

View on arXiv PDF Code

Similar