Driving Behavior Explanation with Multi-level Fusion
This work addresses explainability for autonomous driving systems, which underpins trust and safety in the development of self-driving cars.
This paper introduces BEEF, a deep architecture designed to explain the behavior of a trajectory prediction model in autonomous vehicles. It learns to fuse multi-level features, supervised by human driving decision justifications, to generate high-level driving explanations as the vehicle operates.
In this era of active development of autonomous vehicles, it is becoming crucial to give driving systems the capacity to explain their decisions. In this work, we focus on generating high-level driving explanations as the vehicle drives. We present BEEF, for BEhavior Explanation with Fusion, a deep architecture that explains the behavior of a trajectory prediction model. Supervised by annotations of human justifications for driving decisions, BEEF learns to fuse features from multiple levels. Leveraging recent advances in the multi-modal fusion literature, BEEF is carefully designed to model the correlations between high-level decision features and mid-level perceptual features. The flexibility and efficiency of our approach are validated with extensive experiments on the HDD and BDD-X datasets.
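
To make the idea of modeling correlations between two feature levels concrete, here is a minimal sketch of a generic low-rank bilinear fusion in NumPy. It is not BEEF's actual fusion module, which the abstract does not specify; the function name, the feature dimensions, the random projection matrices, and the five explanation classes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def low_rank_bilinear_fusion(decision_feat, perceptual_feat, U, V, P):
    """Fuse two feature vectors with a low-rank bilinear interaction.

    decision_feat:   (d_dec,) hypothetical high-level decision features
    perceptual_feat: (d_per,) hypothetical mid-level perceptual features
    U: (d_dec, rank), V: (d_per, rank), P: (rank, d_out) projection matrices
    """
    a = decision_feat @ U      # project decision features to the shared rank space
    b = perceptual_feat @ V    # project perceptual features to the same space
    joint = a * b              # elementwise product captures pairwise interactions
    return joint @ P           # map the fused vector to output logits


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


# Illustrative sizes and random weights (assumptions, not values from the paper).
rng = np.random.default_rng(0)
d_dec, d_per, rank, n_classes = 8, 16, 4, 5
U = rng.normal(size=(d_dec, rank))
V = rng.normal(size=(d_per, rank))
P = rng.normal(size=(rank, n_classes))

decision_feat = rng.normal(size=d_dec)
perceptual_feat = rng.normal(size=d_per)

logits = low_rank_bilinear_fusion(decision_feat, perceptual_feat, U, V, P)
probs = softmax(logits)  # distribution over hypothetical explanation classes
```

In a trained system the projection matrices would be learned from the annotated justifications rather than sampled at random. The elementwise product is what distinguishes this from simple concatenation: each output depends on products of decision and perceptual components, so the result is linear in each input separately.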