PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control
This work addresses knowledge sharing in robotic control by introducing a compression-based method for skill discovery that improves the performance of multitask and few-shot imitation learning.
The paper frames learning temporal action abstractions for sequential decision making as a sequence compression task. It quantizes continuous actions and applies byte pair encoding (BPE), the tokenization step used in LLM training pipelines, to discover high-level skills from robotic manipulation demonstrations. These skills significantly boost performance in multitask imitation learning and in few-shot imitation learning on unseen tasks.
Temporal action abstractions, along with belief state representations, are a powerful knowledge sharing mechanism for sequential decision making. In this work, we propose a novel view that treats inducing temporal action abstractions as a sequence compression problem. To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) -- to the seemingly distant task of learning skills of variable time span in continuous control domains. We introduce an approach called Primitive Sequence Encoding (PRISE) that combines continuous action quantization with BPE to learn powerful action abstractions. We empirically show that high-level skills discovered by PRISE from a multitask set of robotic manipulation demonstrations significantly boost the performance of both multitask imitation learning and few-shot imitation learning on unseen tasks. Our code is released at https://github.com/FrankZheng2022/PRISE.
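The two ingredients named in the abstract, action quantization followed by BPE, can be illustrated with a minimal sketch. This is not the released PRISE implementation: it assumes a fixed, hand-written codebook with nearest-neighbor lookup in place of a learned quantizer, and the names `quantize`, `merge_pair`, and `learn_bpe` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter


def quantize(actions, codebook):
    """Map each continuous action to the index of its nearest codebook vector.

    Assumption: a fixed codebook stands in for a learned quantizer.
    """
    def nearest(action):
        return min(
            range(len(codebook)),
            key=lambda i: sum((a - c) ** 2 for a, c in zip(action, codebook[i])),
        )
    return [nearest(a) for a in actions]


def merge_pair(seq, pair, new_token):
    """Replace every non-overlapping occurrence of `pair` in `seq` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges, first_new_id):
    """Greedy BPE: repeatedly merge the most frequent adjacent token pair.

    Returns the merge table {pair: new_token} and the re-encoded sequences.
    Each new token plays the role of a variable-length "skill".
    """
    seqs = [list(s) for s in sequences]
    merges = {}
    for step in range(num_merges):
        counts = Counter()
        for s in seqs:
            counts.update(zip(s, s[1:]))  # count adjacent pairs
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:  # nothing repeats, so merging would not compress
            break
        new_token = first_new_id + step
        merges[pair] = new_token
        seqs = [merge_pair(s, pair, new_token) for s in seqs]
    return merges, seqs


# Toy 2-D actions from two hypothetical demonstrations.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
demo_a = [(0.9, 0.1), (0.1, 0.9), (0.9, 0.1), (0.1, 0.9)]
demo_b = [(0.1, 0.0), (1.1, -0.1), (0.0, 1.2)]

codes = [quantize(demo_a, codebook), quantize(demo_b, codebook)]
merges, encoded = learn_bpe(codes, num_merges=2, first_new_id=len(codebook))
print(codes, merges, encoded)
```

In this toy run the recurring code pair `(1, 2)` is merged into a single new token, so each demonstration is re-expressed as a shorter sequence over primitive codes plus one two-step skill. Running more merges on larger data would produce skills spanning more time steps.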