LG · AI · TH · May 10, 2022

Risk Preferences of Learning Algorithms

arXiv:2205.04619v3 · h-index: 9
Originality: Incremental advance
AI Analysis

The paper addresses fairness and homogenization concerns in economic decision-making that relies on learning algorithms. Its contribution is incremental, since it focuses on correcting a known bias in an existing algorithm.

The paper shows that the ε-Greedy learning algorithm exhibits emergent risk aversion: it prefers the lower-variance action when expected payoffs are equal and, transiently, even when the riskier action has a strictly higher expected payoff. It proposes two correction methods that restore risk-neutrality.

Agents' learning from feedback shapes economic outcomes, and many economic decision-makers today employ learning algorithms to make consequential choices. This note shows that a widely used learning algorithm, $\varepsilon$-Greedy, exhibits emergent risk aversion: it prefers actions with lower variance. When presented with actions of the same expectation, under a wide range of conditions, $\varepsilon$-Greedy chooses the lower-variance action with probability approaching one. This emergent preference can have far-reaching consequences, from fairness concerns to homogenization, and it holds transiently even when the riskier action has a strictly higher expected payoff. We discuss two methods to correct this bias. The first method requires the algorithm to reweight data as a function of how likely the actions were to be chosen. The second requires the algorithm to have optimistic estimates of actions for which it has not collected much data. We show that risk-neutrality is restored with these corrections.
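
The emergent risk aversion described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's own experiment. It assumes two arms with the same expected payoff of 0 (a safe arm that always pays 0 and a risky arm that pays +1 or -1 with equal probability), a constant exploration rate, sample-mean estimates, and uniform tie-breaking. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def safe_arm_share(steps=2000, epsilon=0.1, seed=0):
    """Run one epsilon-Greedy episode; return the share of rounds on the safe arm.

    Assumed setup (not taken from the paper): arm 0 always pays 0,
    arm 1 pays +1 or -1 with equal probability, so both have mean 0.
    """
    rng = random.Random(seed)
    counts = [0, 0]   # number of pulls per arm
    totals = [0, 0]   # cumulative payoff per arm (integers, so ties are exact)
    safe_pulls = 0
    for _ in range(steps):
        # Sample-mean estimate per arm; an unplayed arm is estimated at 0.
        estimates = [totals[a] / counts[a] if counts[a] else 0.0 for a in (0, 1)]
        if rng.random() < epsilon or estimates[0] == estimates[1]:
            arm = rng.randrange(2)  # explore, or break a tie uniformly
        else:
            arm = 0 if estimates[0] > estimates[1] else 1  # exploit
        payoff = 0 if arm == 0 else rng.choice((-1, 1))
        counts[arm] += 1
        totals[arm] += payoff
        if arm == 0:
            safe_pulls += 1
    return safe_pulls / steps


def average_safe_share(runs=50, steps=2000, epsilon=0.1):
    """Average the safe-arm share over independent runs (seeds 0..runs-1)."""
    shares = [safe_arm_share(steps, epsilon, seed) for seed in range(runs)]
    return sum(shares) / runs


if __name__ == "__main__":
    print(f"average share of rounds on the safe arm: {average_safe_share():.2f}")
```

Under these assumptions, the risky arm's estimate changes only when that arm is played. A negative estimate is therefore revised slowly, because the arm is then sampled only through exploration, while a positive estimate is revised quickly. This asymmetry is one intuition for why the safe arm ends up being chosen more often than a risk-neutral learner would choose it. The sketch does not implement either of the two corrections.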

