SAPA: Similarity-Aware Point Affiliation for Feature Upsampling
This work addresses feature upsampling for dense prediction tasks in computer vision, proposing a lightweight upsampling operator that yields consistent performance improvements.
The paper tackles feature upsampling in dense prediction tasks by introducing a similarity-aware point affiliation method for generating upsampling kernels. The kernels enhance both semantic smoothness and boundary sharpness, and the resulting operator gives consistent performance improvements on semantic segmentation, object detection, depth estimation, and image matting.
We introduce point affiliation into feature upsampling, a notion that describes the affiliation of each upsampled point to a semantic cluster formed by local decoder feature points with semantic similarity. By rethinking point affiliation, we present a generic formulation for generating upsampling kernels. The kernels encourage not only semantic smoothness but also boundary sharpness in the upsampled feature maps. Such properties are particularly useful for dense prediction tasks such as semantic segmentation. The key idea of our formulation is to generate similarity-aware kernels by comparing each encoder feature point with the spatially associated local region of decoder features. In this way, the encoder feature point can function as a cue that indicates which semantic cluster the upsampled feature point belongs to. To embody the formulation, we further instantiate a lightweight upsampling operator, termed Similarity-Aware Point Affiliation (SAPA), and investigate its variants. SAPA yields consistent performance improvements on a number of dense prediction tasks, including semantic segmentation, object detection, depth estimation, and image matting. Code is available at: https://github.com/poppinace/sapa
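The key idea above can be illustrated with a minimal NumPy sketch. It is not the released SAPA implementation; it assumes inner-product similarity, softmax normalization over a 3x3 decoder window, edge padding at the borders, and a 2x upsampling ratio. The function name `similarity_aware_upsample` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def similarity_aware_upsample(encoder, decoder, kernel_size=3, scale=2):
    """Illustrative sketch: upsample `decoder` (H, W, C) to the resolution of
    `encoder` (H*scale, W*scale, C) with similarity-aware kernels.

    Assumptions (not taken from the abstract): inner-product similarity,
    softmax normalization, edge padding at the borders.
    """
    H, W, C = decoder.shape
    assert encoder.shape == (H * scale, W * scale, C)
    r = kernel_size // 2
    # Pad the decoder so every local window has kernel_size**2 points.
    padded = np.pad(decoder, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.empty_like(encoder, dtype=np.float64)
    for i in range(H * scale):
        for j in range(W * scale):
            # Decoder location spatially associated with upsampled point (i, j).
            ci, cj = i // scale, j // scale
            window = padded[ci:ci + kernel_size, cj:cj + kernel_size].reshape(-1, C)
            # Similarity between the encoder point and each local decoder point.
            scores = window @ encoder[i, j]
            scores = scores - scores.max()  # numerical stability for softmax
            weights = np.exp(scores)
            weights /= weights.sum()        # kernel weights sum to 1
            # Upsampled feature: similarity-weighted sum of decoder points.
            out[i, j] = weights @ window
    return out
```

Because the weights are normalized, a locally constant decoder region is reproduced unchanged (semantic smoothness). Near a boundary, an encoder point that resembles only one of the two neighbouring decoder clusters puts almost all of its weight on that cluster, so the upsampled point is assigned to it rather than blended across the edge (boundary sharpness).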