LG | AI | Nov 3, 2021

Model-Based Episodic Memory Induces Dynamic Hybrid Controls

arXiv:2111.02104v2 | 25 citations
Originality: Highly original
AI Analysis

This work addresses sample efficiency for reinforcement learning agents, offering a novel hybrid control architecture that integrates multiple learning paradigms.

The paper tackles sample inefficiency in reinforcement learning by introducing a model-based episodic memory that estimates trajectory values to guide policy learning. The authors report significantly faster and better learning than other strong reinforcement learning agents across a variety of environments, including stochastic and non-Markovian settings.

Episodic control enables sample efficiency in reinforcement learning by recalling past experiences from an episodic memory. We propose a new model-based episodic memory of trajectories addressing current limitations of episodic control. Our memory estimates trajectory values, guiding the agent towards good policies. Built upon the memory, we construct a complementary learning model via a dynamic hybrid control unifying model-based, episodic and habitual learning into a single architecture. Experiments demonstrate that our model allows significantly faster and better learning than other strong reinforcement learning agents across a variety of environments including stochastic and non-Markovian settings.
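To make the idea of recalling values from an episodic memory more concrete, here is a minimal sketch in Python. It is not the paper's architecture: the `EpisodicMemory` class, the k-nearest-neighbour read, the inverse-distance weighting, and the `hybrid_value` helper with a scalar weight `beta` are all illustrative assumptions. In particular, the abstract describes the hybrid control as dynamic, whereas this toy version takes the mixing weight as a plain argument.

```python
import numpy as np


class EpisodicMemory:
    """Toy episodic memory (hypothetical): stores (key, value) pairs and
    estimates the value of a query by its k nearest stored keys."""

    def __init__(self, k=2):
        self.k = k
        self.keys = []    # trajectory representations (assumed to be vectors)
        self.values = []  # value estimates attached to those trajectories

    def write(self, key, value):
        self.keys.append(np.asarray(key, dtype=float))
        self.values.append(float(value))

    def read(self, query):
        # With nothing stored, fall back to a neutral estimate of 0.0.
        if not self.keys:
            return 0.0
        keys = np.stack(self.keys)
        dists = np.linalg.norm(keys - np.asarray(query, dtype=float), axis=1)
        nearest = np.argsort(dists)[: self.k]
        # Inverse-distance weights; the small constant avoids division by zero.
        weights = 1.0 / (dists[nearest] + 1e-3)
        stored = np.asarray(self.values)[nearest]
        return float(np.dot(weights, stored) / weights.sum())


def hybrid_value(episodic_value, habitual_value, beta):
    """Blend an episodic estimate with a habitual (parametric) one.
    `beta` in [0, 1] is an assumed mixing weight, not the paper's mechanism."""
    return beta * episodic_value + (1.0 - beta) * habitual_value


memory = EpisodicMemory(k=1)
memory.write([0.0, 0.0], 1.0)
memory.write([10.0, 10.0], 5.0)

episodic = memory.read([0.1, 0.0])  # nearest stored key is [0, 0]
blended = hybrid_value(episodic, habitual_value=3.0, beta=0.5)
```

A learned, state-dependent weight in place of the fixed `beta` would be closer in spirit to a dynamic hybrid, but how the paper computes that weight is not stated in the abstract.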

