Enhancing Rotation-Invariant 3D Learning with Global Pose Awareness and Attention Mechanisms
This work addresses a specific limitation in 3D point cloud processing for tasks that require fine-grained spatial discrimination. It is a novel method for a known bottleneck rather than a broad breakthrough.
The paper tackles the loss of global pose information in rotation-invariant 3D learning, which causes failures such as Wing-tip feature collapse. It introduces Shadow-informed Pose Features and Rotation-invariant Attention Convolution to preserve global pose awareness while maintaining invariance, and reports substantial performance gains on classification and part segmentation benchmarks under arbitrary rotations.
Recent advances in rotation-invariant (RI) learning for 3D point clouds typically replace raw coordinates with handcrafted RI features to ensure robustness under arbitrary rotations. However, these approaches often suffer from the loss of global pose information, making them incapable of distinguishing geometrically similar but spatially distinct structures. We identify that this limitation stems from the restricted receptive field in existing RI methods, leading to Wing-tip feature collapse, a failure to differentiate symmetric components (e.g., left and right airplane wings) due to indistinguishable local geometries. To overcome this challenge, we introduce the Shadow-informed Pose Feature (SiPF), which augments local RI descriptors with a globally consistent reference point (referred to as the 'shadow') derived from a learned shared rotation. This mechanism enables the model to preserve global pose awareness while maintaining rotation invariance. We further propose Rotation-invariant Attention Convolution (RIAttnConv), an attention-based operator that integrates SiPFs into the feature aggregation process, thereby enhancing the model's capacity to distinguish structurally similar components. Additionally, we design a task-adaptive shadow locating module based on the Bingham distribution over unit quaternions, which dynamically learns the optimal global rotation for constructing consistent shadows. Extensive experiments on 3D classification and part segmentation benchmarks demonstrate that our approach substantially outperforms existing RI methods, particularly in tasks requiring fine-grained spatial discrimination under arbitrary rotations.
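The collapse described in the abstract, and the way a global reference point removes it, can be illustrated with a toy example. The following is a minimal NumPy sketch, not the authors' implementation: the ten-point "airplane", the sorted-distance descriptor, the function names, and the hand-placed shadow are all assumptions made for illustration. In particular, the shadow is simply assumed to rotate together with the cloud, whereas the paper derives it from a learned shared rotation (the Bingham-based module), which is not reproduced here.

```python
import numpy as np


def local_ri_descriptor(points, idx, k=3):
    """Sorted distances from point `idx` to its k nearest neighbours.

    Distances are unchanged by rotation, so this is a very simple
    rotation-invariant local descriptor. It is an illustrative stand-in,
    not the handcrafted RI feature used in the paper.
    """
    d = np.linalg.norm(points - points[idx], axis=1)
    return np.sort(d)[1:k + 1]  # drop the zero distance to the point itself


def shadow_augmented_descriptor(points, idx, shadow, k=3):
    """Local descriptor plus the distance to a global reference point."""
    local = local_ri_descriptor(points, idx, k)
    return np.append(local, np.linalg.norm(points[idx] - shadow))


def random_rotation(seed=0):
    """Proper rotation matrix (det = +1) from the QR factorisation of a random matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # fix column signs so the factorisation is unique
    if np.linalg.det(q) < 0:     # flip one axis if we landed on a reflection
        q[:, 0] = -q[:, 0]
    return q


# Hypothetical toy "airplane", mirror-symmetric about the plane x = 0.
cloud = np.array([
    [0.0, 0.0, 0.0],    # fuselage centre
    [0.0, 1.0, 0.0],    # nose
    [0.0, -1.0, 0.0],   # tail
    [0.0, -1.0, 0.5],   # tail fin
    [-1.0, 0.0, 0.0],   # left wing root
    [-2.0, 0.0, 0.0],   # left wing tip
    [-2.0, -0.3, 0.0],  # left wing trailing edge
    [1.0, 0.0, 0.0],    # right wing root
    [2.0, 0.0, 0.0],    # right wing tip
    [2.0, -0.3, 0.0],   # right wing trailing edge
])
LEFT_TIP, RIGHT_TIP = 5, 8

# Assumption: a hand-placed reference point off the symmetry plane.
shadow = np.array([3.0, 0.5, 1.0])

# Rotate the cloud, and (by assumption) the shadow along with it.
rot = random_rotation()
rotated_cloud = cloud @ rot.T
rotated_shadow = rot @ shadow

left_local = local_ri_descriptor(cloud, LEFT_TIP)
right_local = local_ri_descriptor(cloud, RIGHT_TIP)
left_aug = shadow_augmented_descriptor(cloud, LEFT_TIP, shadow)
right_aug = shadow_augmented_descriptor(cloud, RIGHT_TIP, shadow)
left_aug_rotated = shadow_augmented_descriptor(rotated_cloud, LEFT_TIP, rotated_shadow)
```

In this toy setting the two wing tips have identical local descriptors, because the cloud is mirror-symmetric. The extra distance to the shadow separates them, and it is unchanged when the cloud and the shadow are rotated together.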