PVP: Polar Representation Boost for 3D Semantic Occupancy Prediction
This work addresses a domain-specific problem for 3D perception tasks, offering an incremental improvement over existing polar coordinate methods.
The paper tackles feature distortion in polar coordinate-based representations for 3D semantic occupancy prediction by introducing the Polar Voxel Occupancy Predictor (PVP), which achieves significant improvements in mIoU and IoU metrics on the OpenOccupancy dataset.
Recently, polar coordinate-based representations have shown promise for 3D perception tasks. Polar grids are a viable alternative to Cartesian grids: they preserve finer detail at close range while covering larger areas. However, they suffer from feature distortion because the space is divided non-uniformly, with cells growing larger as distance from the sensor increases. To address this distortion, we introduce the Polar Voxel Occupancy Predictor (PVP), a novel 3D multi-modal predictor that operates in polar coordinates. PVP has two key design elements: a Global Represent Propagation (GRP) module that integrates global spatial data into 3D volumes, and a Plane Decomposed Convolution (PD-Conv) that reduces the handling of 3D distortion to 2D convolutions. These innovations enable PVP to outperform existing methods, achieving significant improvements in mIoU and IoU metrics on the OpenOccupancy dataset.
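The non-uniform division mentioned above can be made concrete with a minimal sketch of a cylindrical (polar) voxel grid. This is not the PVP implementation. The function names `polar_voxel_index` and `polar_cell_volume`, and every grid parameter (range, height limits, bin counts), are hypothetical values chosen only for illustration. The sketch maps a Cartesian point to polar bin indices and computes the volume of a cell at a given radial index, which shows how cell size grows with distance.

```python
import math

# Hypothetical grid settings, chosen for illustration only (not from the paper).
R_MAX = 50.0               # maximum radial range in metres
Z_MIN, Z_MAX = -5.0, 3.0   # height limits in metres
N_R, N_THETA, N_Z = 100, 180, 20  # number of radial, angular, and height bins


def polar_voxel_index(x, y, z):
    """Map a Cartesian point to (radial, angular, height) bin indices.

    Returns None when the point falls outside the grid.
    """
    r = math.hypot(x, y)
    theta = math.atan2(y, x) % (2 * math.pi)  # wrap the angle into [0, 2*pi)
    if r >= R_MAX or not (Z_MIN <= z < Z_MAX):
        return None
    i_r = int(r / R_MAX * N_R)
    # Clamp guards against theta rounding up to exactly 2*pi.
    i_theta = min(int(theta / (2 * math.pi) * N_THETA), N_THETA - 1)
    i_z = int((z - Z_MIN) / (Z_MAX - Z_MIN) * N_Z)
    return i_r, i_theta, i_z


def polar_cell_volume(i_r):
    """Volume of one polar cell at radial index i_r (annular wedge times height)."""
    dr = R_MAX / N_R
    dtheta = 2 * math.pi / N_THETA
    dz = (Z_MAX - Z_MIN) / N_Z
    r_in = i_r * dr
    r_out = r_in + dr
    return 0.5 * (r_out ** 2 - r_in ** 2) * dtheta * dz


if __name__ == "__main__":
    print(polar_voxel_index(1.25, 0.0, 0.0))
    # With equal radial steps, cell volume is proportional to (2 * i_r + 1).
    print(polar_cell_volume(N_R - 1) / polar_cell_volume(0))
```

Under these assumed settings, a fixed-size convolution kernel spans a much larger physical region at long range than near the sensor, which is one way to picture the distortion that the GRP and PD-Conv modules are designed to counter.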