AI, LG · Nov 5, 2021

An Algorithmic Theory of Metacognition in Minds and Machines

arXiv:2111.03745v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses metacognition in both cognitive science and machine learning. It offers a novel theoretical framework, although its application to existing RL methods is an incremental advance.

The paper asks how an agent can detect its own suboptimal actions without external information. It proposes a theory of metacognition based on the trade-off between value-based and policy-based reinforcement learning. The resulting agent, the Metacognitive Actor Critic (MAC), is shown to detect some of its own suboptimal actions.

Humans sometimes choose actions that they themselves can identify as sub-optimal, or wrong, even in the absence of additional information. How is this possible? We present an algorithmic theory of metacognition based on a well-understood trade-off in reinforcement learning (RL) between value-based RL and policy-based RL. To the cognitive (neuro)science community, our theory answers the outstanding question of why information can be used for error detection but not for action selection. To the machine learning community, our proposed theory creates a novel interaction between the Actor and Critic in Actor-Critic agents and notes a novel connection between RL and Bayesian Optimization. We call our proposed agent the Metacognitive Actor Critic (MAC). We conclude by showing how to create metacognition in machines: we implement a deep MAC and show that it can detect (some of) its own suboptimal actions without external information or delay.
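
To make the Actor/Critic interaction described above concrete, here is a minimal tabular sketch on a multi-armed bandit. It is not the paper's deep MAC. It only illustrates the general idea that a value-based critic can flag an action chosen by a policy-based actor before any reward is observed. The names `flags_suboptimal`, `run_bandit`, and `margin`, along with the bandit setting and all hyperparameters, are assumptions made for this illustration.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def flags_suboptimal(q_values, action, margin=0.0):
    """Hypothetical check: True if the critic rates some other action
    better than `action` by more than `margin`."""
    return max(q_values) - q_values[action] > margin


def run_bandit(true_means, steps=2000, actor_lr=0.05, critic_lr=0.1,
               margin=0.1, seed=0):
    """Train a softmax actor and a Q-value critic on a Gaussian bandit.

    Returns the actor preferences, the critic estimates, and how many
    chosen actions the critic flagged as suboptimal.
    """
    rng = random.Random(seed)
    n = len(true_means)
    prefs = [0.0] * n  # actor: policy-based (softmax preferences)
    q = [0.0] * n      # critic: value-based (action-value estimates)
    flagged = 0
    for _ in range(steps):
        probs = softmax(prefs)
        action = rng.choices(range(n), weights=probs)[0]

        # The check runs before the reward arrives, so it uses no
        # external information about this particular choice.
        if flags_suboptimal(q, action, margin):
            flagged += 1

        reward = rng.gauss(true_means[action], 1.0)

        # Critic update: move the estimate toward the observed reward.
        q[action] += critic_lr * (reward - q[action])

        # Actor update: policy gradient with the critic's expected
        # value under the current policy as a baseline.
        baseline = sum(p * v for p, v in zip(probs, q))
        advantage = reward - baseline
        for a in range(n):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[a] += actor_lr * advantage * grad
    return prefs, q, flagged
```

Calling `run_bandit([0.2, 1.0])` returns the learned preferences, the critic's estimates, and the number of flagged choices. Because the actor samples from a stochastic policy, it will sometimes pick an arm that the critic already rates lower, and those are the choices the check flags.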
