Yongyang Xu

CV
h-index: 3
Papers: 3
Citations: 64
Novelty: 42%
AI Score: 27

3 Papers

CV · Oct 12, 2022
LACV-Net: Semantic Segmentation of Large-Scale Point Cloud Scene via Local Adaptive and Comprehensive VLAD

Ziyin Zeng, Yongyang Xu, Zhong Xie et al.

Large-scale point cloud semantic segmentation is an important task in 3D computer vision, with wide application in autonomous driving, robotics, and virtual reality. Current large-scale point cloud semantic segmentation methods usually use down-sampling operations to improve computational efficiency and to acquire multi-resolution point clouds. However, down-sampling can discard local information. Meanwhile, it is difficult for networks to capture global information in large-scale distributed contexts. To capture local and global information effectively, we propose an end-to-end deep neural network called LACV-Net for large-scale point cloud semantic segmentation. The proposed network contains three main components: 1) a local adaptive feature augmentation module (LAFA) that adaptively learns the similarity of centroids and neighboring points to augment the local context; 2) a comprehensive VLAD module (C-VLAD) that fuses multi-layer, multi-scale, and multi-resolution local features into a comprehensive global description vector; and 3) an aggregation loss function that optimizes the segmentation boundaries by constraining the adaptive weight from the LAFA module. Comparisons with state-of-the-art networks on several large-scale benchmark datasets, including S3DIS, Toronto3D, and SensatUrban, demonstrate the effectiveness of the proposed network.
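
For readers unfamiliar with VLAD (Vector of Locally Aggregated Descriptors), the sketch below shows the classic hard-assignment form of the descriptor in NumPy: each local feature is assigned to its nearest cluster center, the residuals are summed per center, and the result is flattened and L2-normalized. This is only background for the general technique, not the C-VLAD module described in the abstract; the function name `vlad_descriptor` and the toy inputs are assumptions made for illustration.

```python
import numpy as np


def vlad_descriptor(features, centers):
    """Classic hard-assignment VLAD (illustrative; not the paper's C-VLAD).

    features: (N, D) array of local features.
    centers:  (K, D) array of cluster centers (assumed to be given).
    Returns a flattened, L2-normalized vector of length K * D.
    """
    features = np.asarray(features, dtype=float)
    centers = np.asarray(centers, dtype=float)

    # Pairwise distances between every feature and every center: (N, K).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    nearest = np.argmin(dists, axis=1)

    k, d = centers.shape
    vlad = np.zeros((k, d))
    for idx in range(k):
        members = features[nearest == idx]
        if len(members):
            # Sum of residuals between the assigned features and the center.
            vlad[idx] = (members - centers[idx]).sum(axis=0)

    vlad = vlad.reshape(-1)
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad


# Toy usage: three 2-D features, two centers.
toy_features = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
toy_centers = [[0.0, 0.0], [10.0, 9.0]]
descriptor = vlad_descriptor(toy_features, toy_centers)
```

The output has a fixed length of K × D regardless of how many points are in the input, which is what makes this family of descriptors convenient as a global summary of a variable-size point set.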

CV · Apr 3, 2023
Small but Mighty: Enhancing 3D Point Clouds Semantic Segmentation with U-Next Framework

Ziyin Zeng, Qingyong Hu, Zhong Xie et al.

We study the problem of semantic segmentation of large-scale 3D point clouds. In recent years, significant research efforts have been directed toward local feature aggregation, improved loss functions, and sampling strategies. However, the fundamental framework of point cloud semantic segmentation has been largely overlooked, with most existing approaches relying on the U-Net architecture by default. In this paper, we propose U-Next, a small but mighty framework designed for point cloud semantic segmentation. The key to this framework is to learn multi-scale hierarchical representations from semantically similar feature maps. Specifically, we build our U-Next by stacking multiple U-Net $L^1$ codecs in a nested and densely arranged manner to minimize the semantic gap, while simultaneously fusing the feature maps across scales to effectively recover the fine-grained details. We also devise a multi-level deep supervision mechanism to further smooth gradient propagation and facilitate network optimization. Extensive experiments conducted on three large-scale benchmarks, including S3DIS, Toronto3D, and SensatUrban, demonstrate the superiority and effectiveness of the proposed U-Next architecture. Our U-Next architecture shows consistent and visible performance improvements across different tasks and baseline models, indicating its great potential to serve as a general framework for future research.

IV · Feb 7, 2025
Bridging Scales in Map Generation: A scale-aware cascaded generative mapping framework for seamless and consistent multi-scale cartographic representation

Chenxing Sun, Yongyang Xu, Xuwei Xu et al.

Multi-scale tile maps are essential for geographic information services, serving as fundamental outcomes of surveying and cartographic workflows. While existing image generation networks can produce map-like outputs from remote sensing imagery, their emphasis on replicating texture rather than preserving geospatial features limits cartographic validity. Current approaches face two fundamental challenges: inadequate integration of cartographic generalization principles with dynamic multi-scale generation and spatial discontinuities arising from tile-wise generation. To address these limitations, we propose a scale-aware cartographic generation framework (SCGM) that leverages conditional guided diffusion and a multi-scale cascade architecture. The framework introduces three key innovations: a scale modality encoding mechanism to formalize map generalization relationships, a scale-driven conditional encoder for robust feature fusion, and a cascade reference mechanism ensuring cross-scale visual consistency. By hierarchically constraining large-scale map synthesis with small-scale structural priors, SCGM effectively mitigates edge artifacts while maintaining geographic fidelity. Comprehensive evaluations on cartographic benchmarks confirm the framework's ability to generate seamless multi-scale tile maps with enhanced spatial coherence and generalization-aware representation, demonstrating significant potential for emergency mapping and automated cartography applications.
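
The cascade idea in the abstract above (coarser, small-scale output constraining finer, large-scale synthesis) can be illustrated with a tile pyramid. The sketch below assumes the common XYZ quadtree tiling, in which each tile at zoom `z` has four children at zoom `z + 1`; the abstract does not state which tiling scheme SCGM uses, and `cascade_generate` and its `generate` callback are hypothetical names standing in for whatever model produces a tile. It shows only the traversal order, in which every parent is generated before its children so that the parent's output is available as a reference.

```python
from collections import deque


def child_tiles(z, x, y):
    """Four children of tile (z, x, y) in a standard XYZ quadtree pyramid."""
    return [(z + 1, 2 * x + dx, 2 * y + dy) for dy in (0, 1) for dx in (0, 1)]


def cascade_generate(root, max_zoom, generate):
    """Generate tiles coarse-to-fine, passing each parent's output to its children.

    generate(tile, parent_output) is a hypothetical callable standing in for
    a generative model; parent_output is None for the root tile.
    """
    outputs = {root: generate(root, None)}
    queue = deque([root])
    while queue:
        tile = queue.popleft()
        if tile[0] >= max_zoom:
            continue  # Finest level reached; do not subdivide further.
        for child in child_tiles(*tile):
            # The parent's output acts as the coarse-scale reference.
            outputs[child] = generate(child, outputs[tile])
            queue.append(child)
    return outputs


# Toy usage: the "model" just records how many cascade steps preceded a tile.
def toy_generate(tile, parent_output):
    return 0 if parent_output is None else parent_output + 1


tiles = cascade_generate((0, 0, 0), max_zoom=2, generate=toy_generate)
```

With a root at zoom 0 and `max_zoom=2`, the traversal covers 1 + 4 + 16 = 21 tiles, and each zoom-2 tile receives a reference that itself descended from the root.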