Learning To Explore With Predictive World Model Via Self-Supervised Learning
This work addresses the challenge of designing reward functions for each environment, a key problem in developing autonomous AI agents, though it appears incremental because it builds on existing intrinsic motivation approaches.
The paper tackles the problem of enabling autonomous agents to learn complex behaviors without human-designed rewards by combining intrinsic motivation with a predictive world model. The authors evaluate on 18 Atari games and report superior performance compared to state-of-the-art methods in many test cases, covering both dense and sparse rewards.
Autonomous artificial agents must be able to learn behaviors in complex environments without humans designing tasks and rewards. Designing these reward functions for each environment is not feasible, which motivates the development of intrinsic reward functions. In this paper, we propose using several long-neglected cognitive elements to build an internal world model for an intrinsically motivated agent. Our agent interacts satisfactorily with the environment, learning complex behaviors without needing previously designed reward functions. We used 18 Atari games to evaluate which cognitive skills emerge in games that require reactive and deliberative behaviors. Our results show superior performance compared to the state of the art in many test cases with dense and sparse rewards.