F2BEV: Bird's Eye View Generation from Surround-View Fisheye Camera Images for Automated Driving
This work addresses a specific problem in automated driving perception by providing a baseline for handling fisheye distortion. The contribution is incremental, since it builds on existing BEV generation methods.
The paper tackles the challenge of generating Bird's Eye View (BEV) representations from strongly distorted surround-view fisheye camera images. It introduces F2BEV, a method that produces BEV height and segmentation maps with higher IoU than a state-of-the-art baseline on a synthetic dataset.
Bird's Eye View (BEV) representations are tremendously useful for perception-related automated driving tasks. However, generating BEVs from surround-view fisheye camera images is challenging due to the strong distortions introduced by such wide-angle lenses. We take the first step in addressing this challenge and introduce a baseline, F2BEV, to generate discretized BEV height maps and BEV semantic segmentation maps from fisheye images. F2BEV consists of a distortion-aware spatial cross attention module for querying and consolidating spatial information from fisheye image features in a transformer-style architecture followed by a task-specific head. We evaluate single-task and multi-task variants of F2BEV on our synthetic FB-SSEM dataset, all of which generate better BEV height and segmentation maps (in terms of the IoU) than a state-of-the-art BEV generation method operating on undistorted fisheye images. We also demonstrate discretized height map generation from real-world fisheye images using F2BEV. Our dataset is publicly available at https://github.com/volvo-cars/FB-SSEM-dataset
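Since the comparison above is reported in terms of IoU on discretized BEV maps, the following is a minimal sketch of how a per-class IoU could be computed for two such maps. It is not the authors' evaluation code: the function names `per_class_iou` and `mean_iou`, the representation of a BEV map as a nested list of integer class indices, and the choice to skip classes absent from both maps are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def per_class_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> Dict[int, float]:
    """Per-class intersection over union for two discretized BEV maps.

    Each map is a grid of integer class indices (hypothetical layout:
    rows of cells, each cell holding a semantic or height-bin label).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # The cell counts once toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # The cell belongs to the union of both classes involved.
                union[p] += 1
                union[t] += 1
    # Assumption: classes absent from both maps are skipped, not scored.
    return {
        c: intersection[c] / union[c]
        for c in range(num_classes)
        if union[c] > 0
    }


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Unweighted mean of the per-class IoU over the classes present."""
    scores = per_class_iou(pred, target, num_classes)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy 2x3 BEV grids with three classes (illustrative values only).
pred_map = [[0, 0, 1], [1, 2, 2]]
target_map = [[0, 1, 1], [1, 2, 0]]
print(per_class_iou(pred_map, target_map, num_classes=3))
print(mean_iou(pred_map, target_map, num_classes=3))
```

The same routine applies to both outputs described in the abstract if the discretized height map is treated as a grid of height-bin indices, which is an assumption about the label encoding rather than something stated here.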