19.9 CV Apr 23
SparseGF: A Height-Aware Sparse Segmentation Framework with Context Compression for Robust Ground Filtering Across Urban to Natural Scenes
Nannan Qin, Pengjie Tao, Haiyan Guan et al.
High-quality digital terrain models derived from airborne laser scanning (ALS) data are essential for a wide range of geospatial analyses, and their generation typically relies on robust ground filtering (GF) to separate point clouds across diverse landscapes into ground and non-ground parts. Although current deep-learning-based GF methods have demonstrated impressive performance, especially in specific challenging terrains, their cross-scene generalization remains limited by two persistent issues: the context-detail dilemma in large-scale processing due to limited computational resources, and the random misclassification of tall objects arising from classification-only optimization. To overcome these limitations, we propose SparseGF, a height-aware sparse segmentation framework enhanced with context compression. It is built upon three key innovations: (1) a convex-mirror-inspired context compression module that condenses expansive contexts into compact representations while preserving central details; (2) a hybrid sparse voxel-point network architecture that effectively interprets compressed representations while mitigating compression-induced geometric distortion; and (3) a height-aware loss function that explicitly enforces topographic elevation priors during training to suppress random misclassification of tall objects. Extensive evaluations on two large-scale ALS benchmark datasets demonstrate that SparseGF delivers robust GF across urban to natural terrains, achieving leading performance in complex urban scenes, competitive results on mixed terrains, and moderate yet non-catastrophic accuracy in densely forested steep areas. This work offers new insights into deep-learning-based GF research and encourages further exploration toward true cross-scene generalization for large-scale environmental monitoring.
36.9 CV Mar 21 Code
SupScene: Scene-Structured Overlap Supervision for Image Retrieval in Unconstrained SfM
Xulei Shi, Maoyu Wang, Yuning Peng et al.
Image retrieval is a critical step for reducing the quadratic cost of image matching in unconstrained Structure-from-Motion (SfM). Unlike generic image retrieval, however, the goal in SfM is to identify geometrically matchable image pairs rather than merely semantically similar images. Prevailing methods are largely trained under anchor-centric tuple guidance, which organizes training around isolated tuples and under-utilizes the dense, graded overlap structure naturally established within an SfM scene. In this work, we present SupScene, a scene-structured training framework that samples connected local subgraphs from SfM overlap graphs and jointly supervises all valid within-subgraph pairwise relations. To explicitly align the trained descriptor with geometric co-visibility, we further introduce an overlap-ordered objective that combines multi-similarity optimization with a continuous relative-overlap ranking term. In addition, the proposed framework is instantiated with a lightweight Structural Context Probe Pooling (SCPP) head that aggregates complementary structural responses into a compact global descriptor. Extensive experimental results on multiple benchmarks demonstrate that our method significantly improves overall retrieval performance and enhances the completeness of downstream SfM reconstructions. Code and models are available at https://github.com/Suxilan/SupScene.
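To make the "quadratic cost" point concrete, the short sketch below compares the number of image pairs an exhaustive matcher would attempt with an upper bound on the pairs left after retrieval. It is illustrative arithmetic only, not SupScene code: the function names and the `top_k` shortlist size are hypothetical and do not come from the abstract.

```python
def exhaustive_pair_count(num_images):
    """Number of unordered image pairs when every image is matched to every other."""
    return num_images * (num_images - 1) // 2


def retrieval_pair_upper_bound(num_images, top_k):
    """Upper bound on pairs to match when each image keeps only top_k retrieved candidates.

    top_k is a hypothetical shortlist size (an assumption, not a value from the paper).
    The bound ignores duplicates such as (a, b) and (b, a), so the real count can be lower.
    """
    candidates_per_image = max(0, min(top_k, num_images - 1))
    return min(num_images * candidates_per_image, exhaustive_pair_count(num_images))


if __name__ == "__main__":
    n = 1000
    print(exhaustive_pair_count(n))           # 499500
    print(retrieval_pair_upper_bound(n, 20))  # 20000
```

With 1,000 images and a shortlist of 20, the matcher handles at most 20,000 pairs instead of 499,500, which is why the quality of the retrieved pairs matters so much for reconstruction completeness.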
CV Nov 29, 2020
Learning geometry-image representation for 3D point cloud generation
Lei Wang, Yuchun Huang, Pengjie Tao et al.
We study the problem of generating point clouds of 3D objects. Instead of discretizing the object into 3D voxels, which incurs a huge computational cost and limits resolution, we propose a novel geometry-image-based generator (GIG) that converts the 3D point cloud generation problem into a 2D geometry image generation problem. Since the geometry image is a completely regular 2D array that contains the surface points of the 3D object, it leverages both the regularity of the 2D array and the geodesic neighborhood of the 3D surface. Thus, one significant benefit of our GIG is that it allows us to directly generate 3D point clouds using efficient 2D image generation networks. Experiments on both rigid and non-rigid 3D object datasets demonstrate that our method not only creates plausible and novel 3D objects, but also learns a probabilistic latent space that supports shape editing operations such as interpolation and arithmetic.
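The relationship between a geometry image and a point cloud can be sketched in a few lines. This is a minimal illustration, assuming a geometry image is stored as an (H, W, 3) array whose pixels hold XYZ surface coordinates. The function name and the toy paraboloid surface are hypothetical and are not taken from the paper's implementation.

```python
import numpy as np


def geometry_image_to_points(geometry_image):
    """Flatten an (H, W, 3) geometry image into an (H * W, 3) point cloud."""
    image = np.asarray(geometry_image, dtype=np.float64)
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    # Row-major flattening: pixel (row, col) becomes point index row * W + col.
    return image.reshape(-1, 3)


# Toy geometry image (an assumption for illustration): a paraboloid patch
# sampled on a regular 4 x 5 grid, with XYZ stored in the three channels.
height, width = 4, 5
u, v = np.meshgrid(np.linspace(-1.0, 1.0, width), np.linspace(-1.0, 1.0, height))
toy_image = np.stack([u, v, u**2 + v**2], axis=-1)

points = geometry_image_to_points(toy_image)
print(points.shape)  # (20, 3)
```

Because neighboring pixels hold nearby surface points, a 2D image generator that outputs such an array yields a point cloud after a single reshape.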