CV, LG, IV · Dec 4, 2020

Detecting 32 Pedestrian Attributes for Autonomous Vehicles

arXiv:2012.02647v2 · 32 citations
AI Analysis

This work aims to improve the safety and decision-making of autonomous vehicles by detecting pedestrians and recognizing their appearance and behavior attributes, including whether they are about to cross the road.

This paper tackles the problem of simultaneously detecting pedestrians and recognizing 32 pedestrian attributes, including road crossing forecasting, from a single image for autonomous vehicles. The proposed Multi-Task Learning model, using a composite field framework and a novel fork-normalization technique to address gradient scale issues, achieves competitive detection and attribute recognition results on the JAAD dataset.

Pedestrians are arguably one of the most safety-critical road users to consider for autonomous vehicles in urban areas. In this paper, we address the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes from a single image. These encompass visual appearance and behavior, and also include the forecasting of road crossing, which is a main safety concern. For this, we introduce a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way. Each field spatially locates pedestrian instances and aggregates attribute predictions over them. This formulation naturally leverages spatial context, making it well suited to low resolution scenarios such as autonomous driving. By increasing the number of attributes jointly learned, we highlight an issue related to the scales of gradients, which arises in MTL with numerous tasks. We solve it by normalizing the gradients coming from different objective functions when they join at the fork in the network architecture during the backward pass, referred to as fork-normalization. Experimental validation is performed on JAAD, a dataset providing numerous attributes for pedestrian analysis from autonomous vehicles, and shows competitive detection and attribute recognition results, as well as a more stable MTL training.
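The fork-normalization idea in the abstract can be illustrated with a small sketch. It assumes the simplest possible variant: each branch gradient arriving at the fork is rescaled to unit L2 norm before the gradients are summed. The abstract does not state which norm or scaling the authors use, so the function `fork_normalize`, the `eps` parameter, and the two example gradients are hypothetical and serve only to show why gradient scale matters when many tasks share a backbone.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a plain list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def fork_normalize(branch_grads, eps=1e-12):
    """Merge per-task gradients at a fork (hypothetical unit-norm variant).

    Each branch gradient is rescaled to (approximately) unit L2 norm,
    then all branches are summed, so no single task dominates the
    update passed down to the shared layers.
    """
    dim = len(branch_grads[0])
    merged = [0.0] * dim
    for grad in branch_grads:
        scale = 1.0 / (l2_norm(grad) + eps)  # eps guards against zero gradients
        for i, g in enumerate(grad):
            merged[i] += scale * g
    return merged


# Two made-up task heads whose gradients differ by orders of magnitude.
detection_grad = [300.0, 400.0]  # L2 norm 500
attribute_grad = [0.0, 0.002]    # L2 norm 0.002

# Plain summation: the attribute task is effectively ignored.
naive = [a + b for a, b in zip(detection_grad, attribute_grad)]

# Normalized merge: both tasks contribute at comparable scale,
# giving approximately [0.6, 1.8].
merged = fork_normalize([detection_grad, attribute_grad])

print("naive: ", naive)
print("merged:", merged)
```

With plain summation the shared layers receive a gradient that is almost entirely the detection gradient. After normalization the attribute branch has the same magnitude as the detection branch, which is the kind of balancing the abstract credits for more stable MTL training.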
