LG · AI · May 21, 2022

Nuclear Norm Maximization Based Curiosity-Driven Learning

arXiv:2205.10484v2 · 5 citations · h-index: 15
Originality: Incremental advance
AI Analysis

This work addresses the challenge of sparse extrinsic rewards in reinforcement learning for AI agents, offering a domain-specific improvement over existing curiosity-driven methods.

The paper tackles noisy intrinsic rewards in reinforcement learning by proposing a curiosity method based on nuclear norm maximization, which measures exploration novelty more accurately and tolerates noise better. On 26 Atari games it achieves a human-normalized score of 1.09, double that of competitive approaches.

To handle the sparsity of extrinsic rewards in reinforcement learning, researchers have proposed intrinsic rewards, which enable the agent to learn skills that might come in handy for pursuing rewards in the future, for example by encouraging the agent to visit novel states. However, the intrinsic reward can be noisy due to undesirable stochasticity in the environment, and directly applying the noisy value predictions to supervise the policy is detrimental to learning performance and efficiency. Moreover, many previous studies employ the $\ell^2$ norm or variance to measure exploration novelty, which amplifies the noise because of the squaring operation. In this paper, we address the aforementioned challenges by proposing a novel curiosity method based on nuclear norm maximization (NNM), which can quantify the novelty of exploring the environment more accurately while providing high tolerance to noise and outliers. We conduct extensive experiments across a variety of benchmark environments, and the results suggest that NNM can provide state-of-the-art performance compared with previous curiosity methods. On a subset of 26 Atari games, when trained with only intrinsic reward, NNM achieves a human-normalized score of 1.09, double that of competitive intrinsic-reward-based approaches. Our code will be released publicly to enhance reproducibility.
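
The contrast the abstract draws between the nuclear norm and squared-norm measures can be illustrated with a small NumPy sketch. This is not the paper's implementation: the matrices below are hypothetical batches of state embeddings (one row per state), and the helper names `nuclear_norm` and `squared_frobenius_norm` are assumptions introduced only for this example. It shows two properties of the norms themselves: the nuclear norm (the sum of singular values) is larger for a batch of diverse rows than for a batch of repeated rows with the same energy, and it grows linearly with an outlier's magnitude while the squared norm grows quadratically.

```python
import numpy as np


def nuclear_norm(matrix: np.ndarray) -> float:
    """Sum of the singular values of a 2-D array."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values.sum())


def squared_frobenius_norm(matrix: np.ndarray) -> float:
    """Sum of squared entries (the squared l2-style measure)."""
    return float(np.sum(matrix ** 2))


# Hypothetical batches of 4 state embeddings of dimension 4 (assumed layout).
repeated = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (4, 1))  # same state four times
diverse = np.eye(4)                                          # four orthogonal states

# Both batches have the same squared Frobenius norm,
# but the diverse batch has the larger nuclear norm.
nn_repeated = nuclear_norm(repeated)
nn_diverse = nuclear_norm(diverse)
sq_repeated = squared_frobenius_norm(repeated)
sq_diverse = squared_frobenius_norm(diverse)

# Scale one row by 10 to mimic a single noisy outlier.
with_outlier = diverse.copy()
with_outlier[0] *= 10.0
nn_outlier = nuclear_norm(with_outlier)
sq_outlier = squared_frobenius_norm(with_outlier)

print(f"repeated: nuclear={nn_repeated:.2f}, squared Frobenius={sq_repeated:.2f}")
print(f"diverse:  nuclear={nn_diverse:.2f}, squared Frobenius={sq_diverse:.2f}")
print(f"outlier:  nuclear={nn_outlier:.2f}, squared Frobenius={sq_outlier:.2f}")
```

In this toy setting the squared measure cannot tell the repeated batch from the diverse one, and a single scaled row dominates it, whereas the nuclear norm separates the two batches and is inflated far less by the outlier. How the paper constructs the matrix whose nuclear norm is maximized is not specified in the abstract, so the row-per-state layout here is purely illustrative.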
