Learning Safety Constraints From Demonstration Using One-Class Decision Trees
This work addresses safety concerns in deploying autonomous agents in physical environments, offering an incremental improvement through interpretable constraint learning.
The paper tackles the challenge of aligning autonomous agents with human safety values by learning constraints from expert demonstrations using one-class decision trees, resulting in interpretable constraint representations validated in synthetic and realistic driving environments.
Aligning autonomous agents with human values is a pivotal challenge when deploying these agents in physical environments, where safety is an important concern. However, specifying the agent's objective as a reward and/or cost function is inherently complex and prone to human error. In response, we present a novel approach that uses one-class decision trees to learn from expert demonstrations. These decision trees provide a foundation for representing a set of constraints pertinent to the given environment as a logical formula in disjunctive normal form. The learned constraints are subsequently employed within an oracle constrained reinforcement learning framework, enabling the acquisition of a safe policy. In contrast to other methods, our approach offers an interpretable representation of the constraints, a vital feature in safety-critical environments. To validate the effectiveness of our method, we conduct experiments in synthetic benchmark domains and a realistic driving environment.
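To illustrate how a decision tree can be read as a formula in disjunctive normal form, the following is a minimal sketch and not the paper's implementation. It assumes a small hand-built tree (nothing is learned here), hypothetical feature names `speed` and `gap`, and the convention that leaves marked `inside=True` cover states seen in expert demonstrations. Each root-to-leaf path to such a leaf becomes one conjunction, and the disjunction of those paths is the formula; under the sketch's assumption, a state that fails the formula is treated as violating a constraint.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union

# A literal is (feature_name, operator, threshold), with operator "<=" or ">".
Literal = Tuple[str, str, float]


@dataclass
class Leaf:
    inside: bool  # assumption: True if the leaf covers demonstrated states


@dataclass
class Split:
    feature: str
    threshold: float
    low: "Tree"   # branch taken when state[feature] <= threshold
    high: "Tree"  # branch taken when state[feature] > threshold


Tree = Union[Leaf, Split]


def tree_to_dnf(tree: Tree, path: Tuple[Literal, ...] = ()) -> List[List[Literal]]:
    """Collect one conjunction of literals per path ending in an 'inside' leaf."""
    if isinstance(tree, Leaf):
        return [list(path)] if tree.inside else []
    low = tree_to_dnf(tree.low, path + ((tree.feature, "<=", tree.threshold),))
    high = tree_to_dnf(tree.high, path + ((tree.feature, ">", tree.threshold),))
    return low + high


def holds(literal: Literal, state: Dict[str, float]) -> bool:
    """Evaluate a single threshold test on a state."""
    feature, op, threshold = literal
    value = state[feature]
    return value <= threshold if op == "<=" else value > threshold


def satisfies(dnf: List[List[Literal]], state: Dict[str, float]) -> bool:
    """A DNF formula holds if every literal of at least one clause holds."""
    return any(all(holds(lit, state) for lit in clause) for clause in dnf)


def format_dnf(dnf: List[List[Literal]]) -> str:
    """Render the formula in a human-readable form."""
    clauses = (" AND ".join(f"{f} {op} {t}" for f, op, t in clause) for clause in dnf)
    return " OR ".join(f"({c})" for c in clauses)


# Hypothetical tree over two illustrative driving features.
example_tree = Split(
    "speed", 30.0,
    low=Split("gap", 5.0, low=Leaf(False), high=Leaf(True)),
    high=Split("gap", 20.0, low=Leaf(False), high=Leaf(True)),
)

dnf = tree_to_dnf(example_tree)
print(format_dnf(dnf))
# (speed <= 30.0 AND gap > 5.0) OR (speed > 30.0 AND gap > 20.0)
print(satisfies(dnf, {"speed": 35.0, "gap": 10.0}))  # False: outside both clauses
```

Because the formula is a short list of threshold tests, a person can inspect each clause directly, which is the kind of interpretability the abstract emphasizes. In a constrained reinforcement learning loop, a check like `satisfies` could serve as the cost signal, though how the paper integrates the learned constraints is not specified here.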