CV · IT · Jun 8, 2022

Robust Environment Perception for Automated Driving: A Unified Learning Pipeline for Visual-Infrared Object Detection

arXiv:2206.03943v1 · 17 citations · h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses the problem of reliable environment perception in varying light conditions for automated driving systems, representing an incremental improvement with a novel fusion method.

The paper tackles robust object detection for automated driving by fusing visual and thermal sensor data, achieving an 82.9% mAP and outperforming the state-of-the-art by 10%.

The RGB complementary metal-oxide-semiconductor (CMOS) sensor works within the visible light spectrum and is therefore very sensitive to environmental light conditions. In contrast, a long-wave infrared (LWIR) sensor, operating in the 8-14 μm spectral band, functions independently of visible light. In this paper, we exploit both visual and thermal perception units for robust object detection. After careful synchronization and (cross-)labeling of the FLIR [1] dataset, this multi-modal perception data passes through a convolutional neural network (CNN) to detect three critical objects on the road, namely pedestrians, bicycles, and cars. After evaluating the RGB and infrared sensors separately (thermal and infrared are often used interchangeably), various network structures are compared to fuse the data effectively at the feature level. Our RGB-thermal (RGBT) fusion network, which takes advantage of a novel entropy-block attention module (EBAM), outperforms the state-of-the-art network [2] by 10% with 82.9% mAP.
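The abstract does not specify how EBAM is built, so the following is only a minimal sketch of the general idea of entropy-guided feature-level fusion, not the authors' module. It assumes that each modality yields per-channel feature activations (flattened to lists here) and that a channel with higher Shannon entropy carries more information and should receive more weight. The names `shannon_entropy` and `fuse_channels`, the histogram bin count, and the softmax weighting are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def shannon_entropy(values: Sequence[float], bins: int = 8) -> float:
    """Shannon entropy (in bits) of a histogram of the activations."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant channel carries no information
    counts = [0] * bins
    for v in values:
        # Map the value to a bin; the maximum falls into the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def fuse_channels(
    rgb: List[List[float]], thermal: List[List[float]]
) -> List[List[float]]:
    """Fuse matching RGB and thermal channels (assumed weighting scheme).

    Each pair of channels is blended with softmax weights computed from
    the two channel entropies, so the more informative modality dominates.
    """
    fused = []
    for r, t in zip(rgb, thermal):
        e_r, e_t = shannon_entropy(r), shannon_entropy(t)
        w_r = math.exp(e_r) / (math.exp(e_r) + math.exp(e_t))
        w_t = 1.0 - w_r
        fused.append([w_r * a + w_t * b for a, b in zip(r, t)])
    return fused


# Hypothetical night-time case: the RGB channel is flat, the thermal one is not.
rgb_feats = [[0.0] * 8]
thermal_feats = [[float(i) for i in range(8)]]
fused_feats = fuse_channels(rgb_feats, thermal_feats)
```

In this toy case the flat RGB channel has zero entropy, so the fused channel follows the thermal activations almost entirely. A trained network would apply such weighting to convolutional feature maps and learn it end to end.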

Code Implementations: 1 repo