Vision-based Uneven BEV Representation Learning with Polar Rasterization and Surface Estimation
This work addresses the challenge of generating accurate BEV maps from camera images for autonomous driving, offering an incremental improvement over existing methods with gains in BEV semantic and instance segmentation.
The paper tackles vision-based Bird's Eye View (BEV) representation learning by proposing PolarBEV, an uneven BEV representation that uses polar rasterization to adapt to camera foreshortening and surface estimation to establish the 2D-to-3D correspondence. It reports state-of-the-art performance in BEV semantic and instance segmentation while keeping real-time inference on a single GPU.
In this work, we propose PolarBEV for vision-based uneven BEV representation learning. To adapt to the foreshortening effect of camera imaging, we rasterize the BEV space both angularly and radially, and introduce polar embedding decomposition to model the associations among polar grids. Polar grids are rearranged into an array-like regular representation for efficient processing. In addition, to determine the 2D-to-3D correspondence, we iteratively update the BEV surface starting from a hypothetical plane, and adopt height-based feature transformation. PolarBEV maintains real-time inference speed on a single 2080Ti GPU, and outperforms other methods on both BEV semantic segmentation and BEV instance segmentation. Thorough ablations are presented to validate the design. The code will be released at https://github.com/SuperZ-Liu/PolarBEV.
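The two geometric ideas in the abstract, rasterizing the BEV space angularly and radially into an array-like grid and using a per-cell height to find the matching image location, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the uniform bin spacing, the single forward-facing pinhole camera, and the axis conventions are all assumptions made for illustration, and the iterative surface update and polar embedding decomposition are not modeled: the height is simply passed in.

```python
import math


def polar_grid_centers(num_angular, num_radial, max_radius):
    """Return a num_radial x num_angular nested list of (x, y) cell centers.

    Row index = radial bin, column index = angular bin, so the polar grid
    is stored as a regular 2D array. Uniform spacing is an assumption.
    """
    d_theta = 2.0 * math.pi / num_angular
    d_r = max_radius / num_radial
    grid = []
    for i in range(num_radial):
        r = (i + 0.5) * d_r  # radius at the center of radial bin i
        row = []
        for j in range(num_angular):
            theta = (j + 0.5) * d_theta  # angle at the center of bin j
            row.append((r * math.cos(theta), r * math.sin(theta)))
        grid.append(row)
    return grid


def project_to_image(point_cam, focal, cx, cy):
    """Pinhole projection of a point in camera coordinates
    (x right, y down, z forward). Returns None if behind the camera."""
    x, y, z = point_cam
    if z <= 0:
        return None
    return (focal * x / z + cx, focal * y / z + cy)


def bev_cell_to_pixel(cell_xy, height, cam_height, focal, cx, cy):
    """Lift a BEV cell to 3D using a height value, then project it.

    Assumed ego frame: x forward, y left, z up. The hypothetical camera
    sits at (0, 0, cam_height) and looks along +x with no rotation.
    """
    x_ego, y_ego = cell_xy
    cam_point = (-y_ego, cam_height - height, x_ego)
    return project_to_image(cam_point, focal, cx, cy)


if __name__ == "__main__":
    grid = polar_grid_centers(num_angular=8, num_radial=4, max_radius=40.0)
    cell = grid[1][0]  # second radial ring, first angular bin
    # height=0.0 corresponds to a flat hypothetical plane at ground level
    print(bev_cell_to_pixel(cell, 0.0, 1.5, 1000.0, 800.0, 450.0))
```

Because every angular bin spans the same angle, cells near the ego vehicle cover less ground area than distant ones, which is the sense in which a polar grid follows camera foreshortening. Changing the `height` argument moves the projected pixel vertically, which is why an estimate of the surface height matters for the 2D-to-3D correspondence.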