Prototype-Based Low Altitude UAV Semantic Segmentation
This work addresses efficient semantic segmentation for UAV applications, which is important for edge devices with limited resources, though it appears incremental in its methodological contributions.
The paper tackles semantic segmentation of low-altitude UAV imagery by proposing PBSeg, an efficient prototype-based framework designed to reduce computational overhead while still capturing fine-grained detail. It reports 71.86% mIoU on UAVid and 80.92% mIoU on UDD6 while maintaining computational efficiency.
Semantic segmentation of low-altitude UAV imagery presents unique challenges due to extreme scale variations, complex object boundaries, and limited computational resources on edge devices. Existing transformer-based segmentation methods achieve remarkable performance but incur high computational overhead, while lightweight approaches struggle to capture fine-grained details in high-resolution aerial scenes. To address these limitations, we propose PBSeg, an efficient prototype-based segmentation framework tailored for UAV applications. PBSeg introduces a novel prototype-based cross-attention (PBCA) that exploits feature redundancy to reduce computational complexity while maintaining segmentation quality. The framework incorporates an efficient multi-scale feature extraction module that combines deformable convolutions (DConv) with context-aware modulation (CAM) to capture both local details and global semantics. Experiments on two challenging UAV datasets demonstrate the effectiveness of the proposed approach. PBSeg achieves 71.86% mIoU on UAVid and 80.92% mIoU on UDD6, establishing competitive performance while maintaining computational efficiency. Code is available at https://github.com/zhangda1018/PBSeg.
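The abstract does not spell out how PBCA is built, so the following is only a minimal NumPy sketch of the general idea of attending to prototypes instead of every pixel feature. It is not the authors' implementation. The assumptions are labeled: prototypes are formed here by averaging contiguous groups of feature tokens, learned query/key/value projections are omitted, and the names `pool_prototypes` and `prototype_cross_attention` are hypothetical. With M queries, N pixel features, K prototypes, and C channels, the attention cost drops from roughly O(M·N·C) to O(M·K·C) when K is much smaller than N.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_prototypes(features, num_prototypes):
    """Summarize N feature tokens as K prototypes.

    Assumption: prototypes are plain means over contiguous token groups.
    The actual PBSeg pooling rule may differ.
    """
    if num_prototypes > features.shape[0]:
        raise ValueError("num_prototypes must not exceed the number of tokens")
    groups = np.array_split(features, num_prototypes, axis=0)
    return np.stack([g.mean(axis=0) for g in groups])  # (K, C)


def prototype_cross_attention(queries, features, num_prototypes):
    """Cross-attention from queries to K prototypes rather than N features.

    Assumption: no learned projections; prototypes act as both keys and values.
    """
    prototypes = pool_prototypes(features, num_prototypes)    # (K, C)
    scale = 1.0 / np.sqrt(queries.shape[-1])
    attn = softmax(queries @ prototypes.T * scale, axis=-1)   # (M, K)
    return attn @ prototypes, attn                            # (M, C), (M, K)


rng = np.random.default_rng(0)
features = rng.standard_normal((4096, 64))   # N=4096 flattened pixel features
queries = rng.standard_normal((100, 64))     # M=100 decoder queries
out, attn = prototype_cross_attention(queries, features, num_prototypes=32)
```

In this toy setting each query compares against 32 prototypes instead of 4096 features, which is where the savings from exploiting feature redundancy would come from. Segmentation quality then depends on how well the prototypes summarize the feature map, which this sketch does not evaluate.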