Traffic Control Gesture Recognition for Autonomous Vehicles
This work addresses a domain-specific problem for autonomous vehicles by providing a dataset and models for traffic control gesture recognition, but it is incremental as it builds on existing methods for gesture classification.
The authors tackle the lack of training data for traffic control gesture recognition in autonomous vehicles by introducing a new dataset based on 3D body skeleton input, consisting of 250 sequences. They evaluate it with eight deep neural network models and also report a real-world quantitative evaluation.
A car driver knows how to react to the gestures of traffic officers. Clearly, this is not the case for an autonomous vehicle unless it has traffic control gesture recognition functionality. In this work, we address the lack of learning data for traffic control gesture recognition in existing autonomous driving datasets. We introduce a dataset based on 3D body skeleton input for classifying traffic control gestures at every time step. Our dataset consists of 250 sequences from several actors, ranging from 16 to 90 seconds per sequence. To evaluate our dataset, we propose eight sequential processing models based on deep neural networks, including recurrent networks, attention mechanisms, temporal convolutional networks and graph convolutional networks. We present an extensive evaluation and analysis of all approaches on our dataset, as well as a real-world quantitative evaluation. The code and dataset are publicly available.
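To make the task format concrete, the following is a minimal sketch of per-time-step classification over a 3D skeleton sequence using a simple recurrent (Elman-style) network. It is not the authors' model: the weights are random and untrained, and the joint count, number of gesture classes, hidden size, and all function names are assumptions chosen only for illustration. The point is the input/output shape: one frame of joints in, one class label out, for every frame.

```python
import math
import random

NUM_JOINTS = 15   # assumed skeleton size, not taken from the paper
NUM_CLASSES = 4   # assumed number of gesture classes (hypothetical)
HIDDEN = 8        # assumed hidden-state size


def make_matrix(rows, cols, rng):
    """Build a small random weight matrix as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def classify_sequence(frames, w_in, w_rec, w_out):
    """Return one predicted class index per frame (untrained recurrent net)."""
    h = [0.0] * len(w_rec)
    labels = []
    for frame in frames:
        # Flatten J joints x (x, y, z) into a single feature vector.
        x = [c for joint in frame for c in joint]
        a = matvec(w_in, x)
        b = matvec(w_rec, h)
        # The hidden state carries context from earlier frames.
        h = [math.tanh(ai + bi) for ai, bi in zip(a, b)]
        scores = matvec(w_out, h)
        # Pick the highest-scoring class for this time step.
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels


rng = random.Random(0)
w_in = make_matrix(HIDDEN, NUM_JOINTS * 3, rng)
w_rec = make_matrix(HIDDEN, HIDDEN, rng)
w_out = make_matrix(NUM_CLASSES, HIDDEN, rng)

# Synthetic 20-frame sequence standing in for a recorded skeleton stream.
frames = [
    [[rng.random() for _ in range(3)] for _ in range(NUM_JOINTS)]
    for _ in range(20)
]
labels = classify_sequence(frames, w_in, w_rec, w_out)
print(len(labels), "labels for", len(frames), "frames")
```

Because a label is emitted at every frame rather than once per sequence, a model of this shape could in principle react as soon as a gesture begins; a trained variant would replace the random weights with learned ones.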