Learning Shared Dynamics with Meta-World Models
This work addresses the challenge of building abstract mental models for AI agents to understand and act in varied environments, though it appears incremental in applying multi-task learning to existing world model concepts.
The paper tackles the problem of learning shared physical dynamics across visually diverse environments using meta-world models. It reports that these models capture the common dynamics of Atari games and enable agents to perform visual self-recognition in mirrored environments.
Human consciousness includes the ability to perceive events and objects: a mental model of the world, developed from even the most impoverished visual stimuli, that enables humans to make rapid decisions and take actions. Although the spatial and temporal aspects of different scenes are generally diverse, the underlying physics works the same way across environments, so learning an abstract description of shared physical dynamics helps humans understand the world. In this paper, we explore building such a mental model of the world with neural networks through multi-task learning; we call the result the meta-world model. We show through extensive experiments that our proposed meta-world models successfully capture the common dynamics over compact representations of visually different environments from Atari games. We also demonstrate that agents equipped with our meta-world model possess the ability of visual self-recognition, i.e., they recognize themselves in a mirrored environment derived from the classic mirror self-recognition (MSR) test.