Palm up: Playing in the Latent Manifold for Unsupervised Pretraining
This work addresses unsupervised pretraining for AI systems by enabling exploratory learning from static data, which could benefit researchers in representation learning. It appears incremental, however, because it builds on existing generative models and reinforcement learning methods.
The paper tackles the problem of learning representations by combining static datasets with exploratory behavior, proposing an algorithm that uses a pretrained generative model's latent space as an environment for unsupervised reinforcement learning. The result is that the learned representations can be successfully transferred to downstream tasks in vision and reinforcement learning domains, though no concrete numbers are provided.
Large and diverse datasets have been the cornerstones of many impressive advancements in artificial intelligence. Intelligent creatures, however, learn by interacting with the environment, which changes the input sensory signals and the state of the environment. In this work, we aim to bring the best of both worlds and propose an algorithm that exhibits exploratory behavior while utilizing large, diverse datasets. Our key idea is to leverage deep generative models that are pretrained on static datasets and introduce a dynamics model in the latent space. The transition dynamics simply mix an action with a randomly sampled latent and then apply an exponential moving average for temporal persistency; the resulting latent is decoded to an image using the pretrained generator. We then employ an unsupervised reinforcement learning algorithm to explore in this environment and perform unsupervised representation learning on the collected data. We further leverage the temporal information of this data to pair data points as a natural supervision for representation learning. Our experiments suggest that the learned representations can be successfully transferred to downstream tasks in both vision and reinforcement learning domains.
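The latent-space transition described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the mixing rule, the moving-average rate, or the generator, so the linear mix, the `mix` and `ema` parameters, the toy decoder, and the adjacent-frame pairing in `temporal_pairs` are all hypothetical choices made for this example.

```python
import numpy as np


def make_toy_decoder(latent_dim, side=8, seed=1):
    """Hypothetical stand-in for a pretrained generator: a fixed random projection."""
    w = np.random.default_rng(seed).standard_normal((side * side, latent_dim))

    def decode(z):
        # Map a latent vector to a small "image" with values in (-1, 1).
        return np.tanh(w @ z).reshape(side, side)

    return decode


class LatentEnv:
    """Sketch of an environment whose state is a latent of a frozen generator."""

    def __init__(self, decode, latent_dim, mix=0.5, ema=0.9, seed=0):
        self.decode = decode          # frozen decoder (assumed pretrained elsewhere)
        self.latent_dim = latent_dim
        self.mix = mix                # assumed weight of the action in the mixture
        self.ema = ema                # assumed moving-average rate (persistency)
        self.rng = np.random.default_rng(seed)
        self.z = None

    def reset(self):
        # Start from a latent sampled from a standard normal prior (assumption).
        self.z = self.rng.standard_normal(self.latent_dim)
        return self.decode(self.z)

    def step(self, action):
        # 1) Mix the action with a randomly sampled latent.
        noise = self.rng.standard_normal(self.latent_dim)
        target = self.mix * np.asarray(action) + (1.0 - self.mix) * noise
        # 2) Exponential moving average keeps consecutive latents close.
        self.z = self.ema * self.z + (1.0 - self.ema) * target
        # 3) Decode the new latent into an observation.
        return self.decode(self.z)


def temporal_pairs(frames, max_gap=1):
    """Pair frames at most `max_gap` steps apart (one assumed reading of temporal pairing)."""
    return [
        (frames[i], frames[j])
        for i in range(len(frames))
        for j in range(i + 1, min(i + max_gap, len(frames) - 1) + 1)
    ]


if __name__ == "__main__":
    env = LatentEnv(make_toy_decoder(16), latent_dim=16)
    frames = [env.reset()]
    for _ in range(4):
        # A random policy stands in for the unsupervised RL agent.
        frames.append(env.step(np.random.default_rng(2).standard_normal(16)))
    print(len(temporal_pairs(frames)), "positive pairs collected")
```

In this sketch, a larger `ema` makes consecutive observations change more slowly, which is what would make temporally adjacent frames plausible positive pairs for a representation learning objective.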