Hierarchical Primitive Composition: Simultaneous Activation of Skills with Inconsistent Action Dimensions in Multiple Hierarchies
This work addresses policy modularization for complex robotic tasks, offering incremental improvements in reusability and interpretability.
The paper tackles the inefficiency of using a single policy for complex long-horizon tasks in deep reinforcement learning. It proposes a method for simultaneously activating skills with inconsistent action dimensions across multiple hierarchies, demonstrated on a pick-and-place task with a 6-DoF manipulator.
Deep reinforcement learning has proven effective in various applications and offers a promising direction for solving highly complex tasks. However, naively applying classical RL to learn a complex long-horizon task with a single control policy is inefficient. Policy modularization tackles this problem by learning a set of modules that are mapped to primitives and orchestrating them properly. In this study, we extend this line of work by activating skills simultaneously and structuring them into multiple hierarchies in a recursive fashion. Moreover, we devise an algorithm that orchestrates skills with different action spaces via multiplicative Gaussian distributions, which substantially increases reusability. Modularity also yields interpretability: provided each skill is known, one can observe which modules are used in a new task. We demonstrate how the proposed scheme can be employed in practice by solving a pick-and-place task with a 6-DoF manipulator, and we examine the effect of each property through ablation studies.
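To make the idea of multiplicative Gaussian composition concrete, the following is a minimal sketch and not the paper's implementation. It assumes each skill outputs a diagonal Gaussian over a shared action vector, that a skill with fewer action dimensions is padded and described by a binary mask, and that skills are combined as a weighted product of Gaussians (precision-weighted averaging). The function name `compose_gaussians`, the masking scheme, and the 3-dimensional toy action space are all hypothetical choices made for illustration.

```python
import numpy as np


def compose_gaussians(means, stds, weights, masks):
    """Weighted product of diagonal Gaussians over a shared action space.

    Assumed (hypothetical) layout:
      means, stds, masks: shape (num_skills, action_dim)
      weights:            shape (num_skills,)
    masks[i, d] is 1 if skill i controls action dimension d, else 0.
    Padded entries of stds must still be positive (e.g. 1.0).
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    masks = np.asarray(masks, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Weighted precision per skill and dimension; masked-out entries
    # contribute zero, so a skill never affects dimensions it lacks.
    precisions = masks * weights[:, None] / stds**2
    total_precision = precisions.sum(axis=0)
    if np.any(total_precision <= 0):
        raise ValueError("each action dimension needs at least one active skill")

    # Product of Gaussians: precision-weighted mean, summed precision.
    mean = (precisions * means).sum(axis=0) / total_precision
    std = np.sqrt(1.0 / total_precision)
    return mean, std


# Toy example: skill A acts on dims 0-1, skill B on dims 1-2.
means = [[1.0, 0.0, 0.0],
         [0.0, 2.0, -1.0]]
stds = [[1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0]]
masks = [[1, 1, 0],
         [0, 1, 1]]
weights = [1.0, 1.0]

mean, std = compose_gaussians(means, stds, weights, masks)
```

In this toy setup, dimensions controlled by a single skill keep that skill's mean and standard deviation, while the shared dimension 1 is averaged between the two skills and ends up with a smaller standard deviation. How the activation weights are produced (for example, by a higher-level policy) is outside the scope of this sketch.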