LG | Jan 27, 2022

Modeling Human Exploration Through Resource-Rational Reinforcement Learning

arXiv:2201.11817v3 | 21 citations
Originality: Highly original
AI Analysis

This work addresses the problem of giving AI agents effective exploration mechanisms. Its approach is novel in bridging human cognition and machine learning, although it builds incrementally on existing resource-rational frameworks.

The authors tackle the challenge of equipping artificial agents with effective exploration mechanisms by hypothesizing that humans manage the exploration-exploitation trade-off through optimal use of limited computational resources. They meta-learn reinforcement learning algorithms that trade performance for a shorter description length (the number of bits required to implement the algorithm). The resulting models capture human exploration behavior better than previously considered approaches such as Boltzmann exploration, upper confidence bound algorithms, and Thompson sampling. Varying the description length produces the intended effects: reducing it captures the behavior of brain-lesioned patients, while increasing it mirrors cognitive development during adolescence.

Equipping artificial agents with useful exploration mechanisms remains a challenge to this day. Humans, on the other hand, seem to manage the trade-off between exploration and exploitation effortlessly. In the present article, we put forward the hypothesis that they accomplish this by making optimal use of limited computational resources. We study this hypothesis by meta-learning reinforcement learning algorithms that sacrifice performance for a shorter description length (defined as the number of bits required to implement the given algorithm). The emerging class of models captures human exploration behavior better than previously considered approaches, such as Boltzmann exploration, upper confidence bound algorithms, and Thompson sampling. We additionally demonstrate that changing the description length in our class of models produces the intended effects: reducing description length captures the behavior of brain-lesioned patients while increasing it mirrors cognitive development during adolescence.
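
For readers unfamiliar with the three baseline strategies named in the abstract, the following is a minimal sketch of how each one selects an arm in a simple Gaussian multi-armed bandit. It is an illustration only and not the authors' implementation or their meta-learned models. The function names, the temperature of 0.5, the UCB bonus scale, and the zero-mean, unit-variance Gaussian prior are all assumptions chosen for this example.

```python
import math
import random


def boltzmann_choice(means, temperature, rng):
    """Boltzmann (softmax) exploration: sample an arm with probability
    proportional to exp(estimated_mean / temperature)."""
    max_mean = max(means)  # subtract the max for numerical stability
    weights = [math.exp((m - max_mean) / temperature) for m in means]
    return rng.choices(range(len(means)), weights=weights, k=1)[0]


def ucb_choice(means, counts, total_pulls, bonus_scale=2.0):
    """UCB1-style rule: pick the arm with the largest estimated mean
    plus an uncertainty bonus that shrinks as the arm is sampled."""
    for arm, n in enumerate(counts):
        if n == 0:  # try every arm once before using the bonus formula
            return arm
    scores = [
        m + math.sqrt(bonus_scale * math.log(total_pulls) / n)
        for m, n in zip(means, counts)
    ]
    return max(range(len(means)), key=scores.__getitem__)


def thompson_choice(means, counts, rng, prior_var=1.0, noise_var=1.0):
    """Thompson sampling: draw one sample from each arm's Gaussian
    posterior (assumed zero-mean prior) and pick the largest draw."""
    samples = []
    for m, n in zip(means, counts):
        post_var = 1.0 / (1.0 / prior_var + n / noise_var)
        post_mean = post_var * (n * m / noise_var)
        samples.append(rng.gauss(post_mean, math.sqrt(post_var)))
    return max(range(len(means)), key=samples.__getitem__)


def run_bandit(strategy, true_means, n_trials=200, seed=0):
    """Run one strategy on a bandit with unit-variance Gaussian rewards.
    Returns the per-arm pull counts and the total reward collected."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    means = [0.0] * len(true_means)  # running sample mean per arm
    total_reward = 0.0
    for t in range(1, n_trials + 1):
        if strategy == "boltzmann":
            arm = boltzmann_choice(means, temperature=0.5, rng=rng)
        elif strategy == "ucb":
            arm = ucb_choice(means, counts, total_pulls=t)
        elif strategy == "thompson":
            arm = thompson_choice(means, counts, rng)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total_reward += reward
    return counts, total_reward


if __name__ == "__main__":
    for name in ("boltzmann", "ucb", "thompson"):
        pulls, reward = run_bandit(name, true_means=[0.0, 1.0])
        print(name, pulls, round(reward, 2))
```

The three rules differ in where exploration comes from: Boltzmann exploration adds value-dependent randomness to the choice, the upper confidence bound rule adds a deterministic bonus for uncertain arms, and Thompson sampling randomizes in proportion to posterior uncertainty.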

Code Implementations: 1 repo
