OPTNet: Ordering Point Transformer Network for Post-disaster 3D Semantic Segmentation
For post-disaster damage assessment, OPTNet performs 3D semantic segmentation more efficiently and accurately than static serialization approaches by learning the point ordering rather than fixing it in advance.
OPTNet introduces a learnable Point Sorter module, trained with a self-supervised ordering loss, that dynamically predicts permutations for window-based attention in 3D point cloud segmentation. It achieves significant improvements over state-of-the-art baselines on the 3DAeroRelief dataset.
Post-disaster damage assessment requires rapid and accurate semantic segmentation of 3D point clouds to identify critical infrastructure such as damaged buildings and roads. Early Point Transformers (e.g., PTv1, PTv2) relied on computationally expensive k-nearest-neighbor (k-NN) search and Farthest Point Sampling (FPS). To improve efficiency, recent architectures such as Point Transformer V3 (PTv3) adopted static serialization, using space-filling curves such as Hilbert or Z-order curves to arrange unstructured points into a sequence for window-based attention. However, these orderings are fixed in advance and are not optimal for capturing the complex geometry of disaster scenes. In this paper, we propose OPTNet (Ordering Point Transformer Network), which introduces a learnable Point Sorter module. OPTNet uses a self-supervised ordering loss to dynamically predict a permutation that maximizes the locality of the attention mechanism. We evaluate our method on the 3DAeroRelief dataset, where it significantly outperforms state-of-the-art baselines.
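To make the notion of static serialization concrete, the following minimal Python sketch orders a point cloud along a Z-order (Morton) curve and cuts the resulting sequence into fixed-size attention windows. It is an illustration under stated assumptions, not the OPTNet Point Sorter and not the PTv3 implementation: the function names (`morton_codes`, `serialize_into_windows`, `mean_window_spread`), the parameters `grid_size` and `window_size`, and the spread measure used as a proxy for window locality are all hypothetical choices made for this example.

```python
import numpy as np


def morton_codes(voxels, bits=10):
    """Interleave the bits of integer (x, y, z) voxel coordinates into Z-order codes."""
    codes = np.zeros(len(voxels), dtype=np.int64)
    for b in range(bits):
        for axis in range(3):
            # Bit b of coordinate `axis` goes to bit position 3*b + axis of the code.
            codes |= ((voxels[:, axis] >> b) & 1) << (3 * b + axis)
    return codes


def serialize_into_windows(points, grid_size, window_size, bits=10):
    """Sort points along a Z-order curve and split the order into windows.

    Returns a list of index arrays; each array holds the point indices of one window.
    The ordering depends only on quantized coordinates, so it is fixed (not learned).
    """
    origin = points.min(axis=0)
    voxels = np.floor((points - origin) / grid_size).astype(np.int64)
    voxels = np.clip(voxels, 0, 2**bits - 1)  # keep coordinates within `bits` bits
    order = np.argsort(morton_codes(voxels, bits), kind="stable")
    return [order[i:i + window_size] for i in range(0, len(order), window_size)]


def mean_window_spread(points, windows):
    """Hypothetical locality proxy: mean distance of points to their window centroid."""
    spreads = [
        np.linalg.norm(points[w] - points[w].mean(axis=0), axis=1).mean()
        for w in windows
    ]
    return float(np.mean(spreads))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.random((4096, 3))  # synthetic points in the unit cube

    z_windows = serialize_into_windows(points, grid_size=1 / 32, window_size=64)
    random_windows = np.array_split(rng.permutation(len(points)), 64)

    print("Z-order window spread:", mean_window_spread(points, z_windows))
    print("Random window spread: ", mean_window_spread(points, random_windows))
```

On synthetic data, windows taken from the Z-order sequence are spatially much tighter than windows taken from a random permutation, which is why serialization can stand in for explicit neighbor search. The curve itself, however, is chosen without reference to the scene, which is the limitation a learned ordering is meant to address.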