Rethinking Top Probability from Multi-view for Distracted Driver Behaviour Localization
This work addresses distracted driving detection for automotive safety systems, but its contribution is incremental because it builds on existing probability-based localization methods.
The paper tackles distracted driver behavior localization in naturalistic driving videos by combining a self-supervised action recognition model with multi-view ensemble predictions and conditional post-processing, achieving sixth place on the Track 3 public leaderboard of the 2024 AI City Challenge.
The naturalistic driving action localization task aims to recognize and comprehend human behaviors and actions from video data captured during real-world driving scenarios. Previous studies have achieved strong action localization performance by applying a recognition model followed by probability-based post-processing. Nevertheless, the probabilities provided by the recognition model frequently contain confusing information, which makes post-processing challenging. In this work, we adopt an action recognition model based on self-supervised learning to detect distracted activities and produce potential action probabilities. Subsequently, a constraint ensemble strategy takes advantage of multi-camera views to provide robust predictions. Finally, we introduce a conditional post-processing operation to locate distracted behaviours and their temporal boundaries precisely. On test set A2, our method obtains the sixth position on the public leaderboard of Track 3 of the 2024 AI City Challenge.
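The general shape of the pipeline described above (per-view class probabilities, a multi-view ensemble, then probability-based localization) can be illustrated with a minimal sketch. This is not the authors' constraint ensemble or conditional post-processing, whose details are not given here. It assumes a plain weighted average across views and a simple threshold-and-merge rule, and the names `ensemble_views` and `localize`, the `threshold` and `min_len` parameters, and the use of class 0 as background are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Probs = List[List[float]]  # probs[t][c]: probability of class c at time step t


def ensemble_views(views: List[Probs], weights: List[float]) -> Probs:
    """Fuse per-view probabilities with a weighted average (an assumed rule)."""
    total = sum(weights)
    n_steps = len(views[0])
    n_classes = len(views[0][0])
    fused = []
    for t in range(n_steps):
        row = [
            sum(w * v[t][c] for v, w in zip(views, weights)) / total
            for c in range(n_classes)
        ]
        fused.append(row)
    return fused


def localize(
    probs: Probs,
    threshold: float = 0.5,  # hypothetical confidence cut-off
    min_len: int = 2,        # hypothetical minimum segment length, in steps
    background: int = 0,     # assumed id of the "no distraction" class
) -> List[Tuple[int, int, int]]:
    """Return (class_id, start, end) segments, with `end` exclusive."""
    # Label each step with its top class, or background if not confident.
    labels = []
    for row in probs:
        best = max(range(len(row)), key=row.__getitem__)
        labels.append(best if row[best] >= threshold else background)

    # Merge runs of identical labels and drop background or short runs.
    segments = []
    start = 0
    for t in range(1, len(labels) + 1):
        if t == len(labels) or labels[t] != labels[start]:
            if labels[start] != background and t - start >= min_len:
                segments.append((labels[start], start, t))
            start = t
    return segments
```

Calling `localize(ensemble_views([dash, rear], [1.0, 1.0]))` on two per-view probability sequences yields a list of action segments. Averaging before thresholding is what lets a confident view compensate for an ambiguous one, which is the intuition behind using multiple cameras.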