Learning State Representations in Complex Systems with Multimodal Data
This work provides a benchmark for researchers in reinforcement learning and optimal control. Its contribution is incremental, since it focuses on dataset creation rather than novel methods.
The authors address the lack of a large-scale standard dataset for representation learning in complex systems with multimodal data. They present a dataset and evaluation framework for an airplane landing task and compare several approaches by supervised learning quality and disentanglement scores.
Representation learning becomes especially important for complex systems with multimodal data sources such as cameras or sensors. Recent advances in reinforcement learning and optimal control make it possible to design control algorithms on these latent representations, but the field still lacks a large-scale standard dataset for unified comparison. In this work, we present a large-scale dataset and evaluation framework for representation learning on the complex task of landing an airplane. We implement and compare several approaches to representation learning on this dataset, measuring performance on simple supervised learning tasks and disentanglement scores. The resulting representations can be used for downstream tasks such as anomaly detection, optimal control, and model-based reinforcement learning.
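One common way to measure "the quality of simple supervised learning tasks" on a learned representation is a linear probe: freeze the latent codes, fit a simple regressor from them to a known state variable, and report held-out accuracy. The sketch below illustrates that idea only. It is not the authors' evaluation framework: the data is synthetic, and the names `linear_probe_r2`, `factor`, `z_good`, and `z_random` are hypothetical.

```python
import numpy as np


def linear_probe_r2(z_train, y_train, z_test, y_test):
    """Fit a least-squares linear probe on latent codes and return test R^2."""
    # Append a constant column so the probe can learn an intercept.
    a_train = np.hstack([z_train, np.ones((z_train.shape[0], 1))])
    a_test = np.hstack([z_test, np.ones((z_test.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(a_train, y_train, rcond=None)
    pred = a_test @ weights
    ss_res = np.sum((y_test - pred) ** 2)
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(0)
n, latent_dim, split = 1000, 8, 800

# Assumption: a single synthetic ground-truth factor stands in for a real
# state variable (for example, a normalized altitude).
factor = rng.uniform(0.0, 1.0, size=n)

# An "informative" representation: the factor mixed linearly into the
# latent space, plus a small amount of noise.
mixing = rng.normal(size=(1, latent_dim))
z_good = factor[:, None] * mixing + 0.05 * rng.normal(size=(n, latent_dim))

# An uninformative baseline: latent codes independent of the factor.
z_random = rng.normal(size=(n, latent_dim))

r2_good = linear_probe_r2(
    z_good[:split], factor[:split], z_good[split:], factor[split:]
)
r2_random = linear_probe_r2(
    z_random[:split], factor[:split], z_random[split:], factor[split:]
)

print(f"probe R^2, informative representation: {r2_good:.3f}")
print(f"probe R^2, random representation:      {r2_random:.3f}")
```

A probe is kept deliberately simple so that a high score reflects information made accessible by the representation rather than capacity in the probe itself. Comparing against a random baseline, as here, gives the score a reference point.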