LG
Apr 13, 2022

Modularity benefits reinforcement learning agents with competing homeostatic drives

arXiv:2204.06608v1
4 citations
h-index: 100
Originality: Incremental advance
AI Analysis

This work addresses the challenge of multi-objective reinforcement learning for agents with competing homeostatic drives, presenting an incremental improvement over standard methods.

The paper addresses the problem of balancing conflicting needs in reinforcement learning by comparing a monolithic deep Q-network with a modular network that has a dedicated Q-learner for each variable. The modular agent requires minimal exogenously determined exploration, is more sample-efficient, and is more robust to out-of-domain perturbation.

The problem of balancing conflicting needs is fundamental to intelligence. Standard reinforcement learning algorithms maximize a scalar reward, which requires combining different objective-specific rewards into a single number. Alternatively, different objectives could also be combined at the level of action value, such that specialist modules responsible for different objectives submit different action suggestions to a decision process, each based on rewards that are independent of one another. In this work, we explore the potential benefits of this alternative strategy. We investigate a biologically relevant multi-objective problem, the continual homeostasis of a set of variables, and compare a monolithic deep Q-network to a modular network with a dedicated Q-learner for each variable. We find that the modular agent: a) requires minimal exogenously determined exploration; b) has improved sample efficiency; and c) is more robust to out-of-domain perturbation.
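The contrast between combining objectives at the reward level and at the action-value level can be sketched in a few lines. The sketch below is illustrative only and is not the paper's implementation: it substitutes tabular Q-learning for deep Q-networks, and it assumes that the decision process simply sums the modules' action values, an arbitration rule the abstract does not specify. The names `ModularQAgent`, `combined_values`, and `scalarized_reward` are hypothetical.

```python
import random


class ModularQAgent:
    """Tabular stand-in for a modular agent: one Q-table per objective."""

    def __init__(self, n_modules, n_states, n_actions, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha  # learning rate
        self.gamma = gamma  # discount factor
        # q[m][s][a]: module m's value estimate for action a in state s
        self.q = [[[0.0] * n_actions for _ in range(n_states)]
                  for _ in range(n_modules)]

    def combined_values(self, state):
        # Assumed arbitration rule: sum the modules' action values.
        return [sum(module[state][a] for module in self.q)
                for a in range(self.n_actions)]

    def act(self, state, epsilon=0.0, rng=random):
        # Optional epsilon-greedy exploration; greedy when epsilon is 0.
        if rng.random() < epsilon:
            return rng.randrange(self.n_actions)
        values = self.combined_values(state)
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, rewards, next_state):
        # Each module does a Q-learning update on its own reward only.
        for module, r in zip(self.q, rewards):
            target = r + self.gamma * max(module[next_state])
            module[state][action] += self.alpha * (target - module[state][action])


def scalarized_reward(rewards):
    # What a monolithic learner would see: all objectives collapsed to one number.
    return sum(rewards)
```

In this sketch a monolithic learner would be trained on `scalarized_reward(rewards)` with a single table, whereas the modular agent keeps the reward components separate during learning and only merges them when choosing an action.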

