LG, AI, IT, ML | Jun 17, 2019

Hierarchical Soft Actor-Critic: Adversarial Exploration via Mutual Information Optimization

arXiv:1906.07122v1 | 4 citations
Originality: Incremental advance
AI Analysis

This paper addresses exploration challenges in hierarchical reinforcement learning, though it appears to be an incremental extension of existing methods.

The paper addresses exploration in hierarchical Deep Q-Networks (HDQN) by extending soft actor-critic with a mutual information objective. The result is an adversarial framework in which the meta-controller and controller cooperate on maximizing expected reward while playing a minimax game over mutual information.

We describe a novel extension of soft actor-critic for hierarchical Deep Q-Network (HDQN) architectures using a mutual information metric. The proposed extension provides a suitable framework for encouraging exploration in such hierarchical networks. A natural use of this framework is an adversarial setting, where the meta-controller and controller play minimax over the mutual information objective but cooperate on maximizing expected rewards.
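
To make the mutual information quantity concrete, here is a minimal sketch that computes it for a small discrete joint distribution. The abstract does not say which variables the mutual information is taken between, so the pairing used here (goals chosen by the meta-controller, actions chosen by the controller) is an assumption for illustration only. The function name `mutual_information` and the example tables are hypothetical and not taken from the paper.

```python
import math


def mutual_information(joint):
    """Return I(G; A) in nats for a joint probability table.

    joint[g][a] is the probability that goal g and action a occur together.
    The goal/action pairing is an illustrative assumption, not the paper's
    stated definition.
    """
    # Marginals: sum over actions for each goal, and over goals for each action.
    p_goal = [sum(row) for row in joint]
    p_action = [sum(col) for col in zip(*joint)]

    mi = 0.0
    for g, row in enumerate(joint):
        for a, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_goal[g] * p_action[a]))
    return mi


# Actions fully determined by the goal: mutual information equals log(2).
dependent = [[0.5, 0.0], [0.0, 0.5]]
# Actions independent of the goal: mutual information equals 0.
independent = [[0.25, 0.25], [0.25, 0.25]]

mi_dependent = mutual_information(dependent)
mi_independent = mutual_information(independent)
```

In a minimax setting of the kind the abstract describes, one player would push a quantity like this up while the other pushes it down. Which role each controller takes is not stated in the abstract, so the sketch leaves it unspecified.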

