CompoSuite: A Compositional Reinforcement Learning Benchmark
CompoSuite is a new benchmark for evaluating compositional generalization in reinforcement learning, a key challenge in scaling agents to diverse problems. The contribution is incremental in that it builds on existing multi-task RL frameworks.
The authors introduce CompoSuite, a simulated robotic manipulation benchmark for compositional multi-task reinforcement learning. It generates hundreds of tasks by varying the robot, object, objective, and obstacle elements. The authors benchmark existing algorithms on it and expose their shortcomings in compositional generalization.
We present CompoSuite, an open-source simulated robotic manipulation benchmark for compositional multi-task reinforcement learning (RL). Each CompoSuite task requires a particular robot arm to manipulate one individual object to achieve a task objective while avoiding an obstacle. This compositional definition of the tasks endows CompoSuite with two remarkable properties. First, varying the robot/object/objective/obstacle elements leads to hundreds of RL tasks, each of which requires a meaningfully different behavior. Second, RL approaches can be evaluated specifically for their ability to learn the compositional structure of the tasks. This latter capability to functionally decompose problems would enable intelligent agents to identify and exploit commonalities between learning tasks to handle large varieties of highly diverse problems. We benchmark existing single-task, multi-task, and compositional learning algorithms on various training settings, and assess their capability to compositionally generalize to unseen tasks. Our evaluation exposes the shortcomings of existing RL approaches with respect to compositionality and opens new avenues for investigation.
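The compositional task definition in the abstract can be illustrated with a minimal sketch that builds a task space as the Cartesian product of four element axes and holds out some combinations to test generalization to unseen tasks. This is not the CompoSuite API. The element names, the choice of four options per axis, and the helper functions `enumerate_tasks`, `split_tasks`, and `covers_all_elements` are all assumptions made for illustration.

```python
from itertools import product
import random

# Hypothetical element names: placeholders for illustration only,
# not the identifiers used by the actual benchmark.
ROBOTS = ["robot_a", "robot_b", "robot_c", "robot_d"]
OBJECTS = ["object_a", "object_b", "object_c", "object_d"]
OBJECTIVES = ["objective_a", "objective_b", "objective_c", "objective_d"]
OBSTACLES = ["obstacle_a", "obstacle_b", "obstacle_c", "obstacle_d"]
AXES = [ROBOTS, OBJECTS, OBJECTIVES, OBSTACLES]


def enumerate_tasks(axes=AXES):
    """Return every (robot, object, objective, obstacle) combination."""
    return list(product(*axes))


def split_tasks(tasks, num_train, seed=0):
    """Shuffle the tasks and split them into training and held-out sets."""
    rng = random.Random(seed)
    shuffled = list(tasks)
    rng.shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:]


def covers_all_elements(train_tasks, axes=AXES):
    """Check that each element of each axis appears in some training task.

    A held-out task is only a fair test of compositional generalization
    if all of its individual elements were seen during training.
    """
    for index, axis in enumerate(axes):
        seen = {task[index] for task in train_tasks}
        if seen != set(axis):
            return False
    return True


if __name__ == "__main__":
    tasks = enumerate_tasks()
    train, held_out = split_tasks(tasks, num_train=56)
    print(len(tasks), len(train), len(held_out))
    print(covers_all_elements(train))
```

With four options on each of the four axes, the product contains 4^4 = 256 tasks, which matches the "hundreds of RL tasks" scale described above. An agent trained on a subset that covers every element can then be evaluated zero-shot on the held-out combinations.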