A Generalist Dynamics Model for Control
This work addresses the challenge of building generalist models for control in robotics and other AI systems, and presents a promising but incremental step toward foundation models of control.
The paper studies transformer sequence models used as dynamics models for control. It finds that they generalize well to unseen environments in both few-shot and zero-shot settings, that generalizing system dynamics works much better than generalizing a policy directly, and that they compare well against a number of baseline models in single-environment learning.
We investigate the use of transformer sequence models as dynamics models (TDMs) for control. We find that TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist TDM is fine-tuned with small amounts of data from the target environment, and in a zero-shot setting, where a generalist TDM is applied to an unseen environment without any further training. Here, we demonstrate that generalizing system dynamics can work much better than generalizing optimal behavior directly as a policy. Additional results show that TDMs also perform well in a single-environment learning setting when compared to a number of baseline models. These properties make TDMs a promising ingredient for a foundation model of control.
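To make the idea of "a dynamics model for control" concrete, the following is a minimal sketch of how a learned dynamics model could be placed inside a planner. It is illustrative only. The planner shown is simple random-shooting model-predictive control, which is an assumption here rather than something the text above specifies. The names `toy_dynamics`, `reward`, `random_shooting_plan`, and `run_episode` are hypothetical, and a hand-written one-dimensional system stands in for a trained TDM.

```python
import numpy as np


def toy_dynamics(states, actions):
    """Hypothetical stand-in for a learned dynamics model (not a real TDM).

    states:  array of shape (..., 1) holding a 1-D position.
    actions: array of shape (...,) with values in [-1, 1].
    Returns the predicted next states, same shape as `states`.
    """
    return states + 0.1 * np.asarray(actions)[..., None]


def reward(states):
    """Assumed task reward: stay close to position 0."""
    return -np.abs(states[..., 0])


def random_shooting_plan(model, state, horizon=10, n_candidates=256, rng=None):
    """Sample random action sequences, roll each out through `model`,
    and return the first action of the highest-return sequence."""
    rng = np.random.default_rng() if rng is None else rng
    actions = rng.uniform(-1.0, 1.0, size=(n_candidates, horizon))
    states = np.repeat(state[None, :], n_candidates, axis=0)
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        # The model, not the real environment, predicts each next state.
        states = model(states, actions[:, t])
        returns += reward(states)
    best = int(np.argmax(returns))
    return float(actions[best, 0])


def run_episode(model, env_step, state, steps=50, seed=0):
    """Closed-loop control: replan at every step, execute one action."""
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        action = random_shooting_plan(model, state, rng=rng)
        state = env_step(state, action)
    return state


if __name__ == "__main__":
    # Model and environment coincide here purely for illustration.
    final_state = run_episode(toy_dynamics, toy_dynamics, np.array([1.0]))
    print("final position:", final_state[0])
```

In this sketch the policy is never learned directly: behavior comes from planning through the model. In a generalist setting, `model` would be replaced by the pretrained (zero-shot) or fine-tuned (few-shot) dynamics model, while the planner stays unchanged.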