AI, ML | Jun 5, 2018

Boredom-driven curious learning by Homeo-Heterostatic Value Gradients

arXiv:1806.01502v11 citations
Originality: Incremental advance
AI Analysis

This paper addresses exploration challenges in reinforcement learning agents, though it appears incremental because it builds on existing intrinsic motivation frameworks.

The paper tackles the problem of effective exploration in reinforcement learning by reconciling boredom and curiosity through the Homeo-Heterostatic Value Gradients (HHVG) algorithm. Its boredom-enabled agents consistently outperformed other curious or explorative agent variants in model-building benchmarks.

This paper presents the Homeo-Heterostatic Value Gradients (HHVG) algorithm as a formal account of the constructive interplay between boredom and curiosity, which gives rise to effective exploration and superior forward model learning. We envisaged actions as instrumental in the agent's own epistemic disclosure. This motivated two central algorithmic ingredients, devaluation and devaluation progress, both of which underpin the agent's cognition concerning intrinsically generated rewards. The two serve as an instantiation of homeostatic and heterostatic intrinsic motivation. A key insight from our algorithm is that these two seemingly opposite motivations can be reconciled, and that without this reconciliation exploration and information-gathering cannot be carried out effectively. We supported this claim with empirical evidence, showing that boredom-enabled agents consistently outperformed other curious or explorative agent variants in model-building benchmarks based on self-assisted experience accumulation.

Code Implementations: 1 repo
