AI, LG, NE · Oct 21, 2020

Meta-trained agents implement Bayes-optimal agents

arXiv:2010.11223v1 · 52 citations
Originality: Highly original
AI Analysis

This work provides empirical validation for a theoretical claim, potentially enabling approximation of Bayes-optimal agents in complex task distributions where tractable models are lacking.

The paper investigates whether memory-based meta-learning produces agents that behave Bayes-optimally, showing empirically that meta-learned and Bayes-optimal agents share similar computational structures and that Bayes-optimal agents are fixed points of meta-learning dynamics.

Memory-based meta-learning is a powerful technique to build agents that adapt fast to any task within a target distribution. A previous theoretical study has argued that this remarkable performance is because the meta-training protocol incentivises agents to behave Bayes-optimally. We empirically investigate this claim on a number of prediction and bandit tasks. Inspired by ideas from theoretical computer science, we show that meta-learned and Bayes-optimal agents not only behave alike, but they even share a similar computational structure, in the sense that one agent system can approximately simulate the other. Furthermore, we show that Bayes-optimal agents are fixed points of the meta-learning dynamics. Our results suggest that memory-based meta-learning might serve as a general technique for numerically approximating Bayes-optimal agents - that is, even for task distributions for which we currently don't possess tractable models.
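
To make "Bayes-optimal" concrete for a prediction task, here is a minimal sketch. It assumes a coin-flip task distribution whose bias is drawn from a Beta(alpha, beta) prior; this particular task, the function names, and the default parameters are illustrative assumptions, not details taken from the paper. Under that assumption, the Bayes-optimal prediction for the next observation is the posterior predictive probability, which a meta-trained memory-based agent would be expected to approximate.

```python
from fractions import Fraction


def bayes_optimal_prediction(observations, alpha=1, beta=1):
    """Posterior predictive P(next = 1 | observations).

    Assumes (hypothetically) that each task is a coin whose bias is drawn
    from a Beta(alpha, beta) prior and that observations are 0/1 outcomes.
    """
    ones = sum(observations)
    n = len(observations)
    # Conjugate update: the posterior is Beta(alpha + ones, beta + n - ones),
    # so the predictive probability of a 1 is its mean.
    return Fraction(alpha + ones, alpha + beta + n)


def predict_sequence(observations, alpha=1, beta=1):
    """Prediction before each observation, plus one after the last."""
    return [
        bayes_optimal_prediction(observations[:t], alpha, beta)
        for t in range(len(observations) + 1)
    ]


if __name__ == "__main__":
    for p in predict_sequence([1, 1, 0]):
        print(p)
```

With a uniform prior (alpha = beta = 1) this reduces to Laplace's rule of succession. Comparing a meta-trained agent's step-by-step predictions against a reference like this is one simple way to check behavioural agreement on a task where the Bayes-optimal solution is tractable.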
