CV · Jun 11

Learning Task-Aware Sampling with Shared Saliency through Density-Equalizing Mappings

arXiv:2606.12869v17.9

Predicted impact top 65% in CV · last 90 days · Originality: Incremental advance

AI Analysis

For medical imaging and surface analysis tasks where informative structures are localized, this framework improves efficiency and interpretability of convolutional neural networks.

The paper proposes a Density-Equalizing Convolutional Neural Network (DECNN) that uses learned density functions to non-uniformly sample convolutional receptive fields, focusing computation on informative regions. Experiments on image classification and craniofacial surface analysis show DECNN achieves competitive or superior performance with fewer parameters and produces interpretable saliency maps.

In image and surface-based learning tasks, convolutional features are typically extracted using receptive fields that are sampled uniformly across the entire domain. However, informative structures are rarely distributed uniformly in practice and are often concentrated in localized regions. Such phenomena are particularly common in medical imaging, where pathological changes are spatially confined. Consequently, uniform convolution allocates equal computational effort to both informative and uninformative regions, resulting in inefficient feature extraction and suboptimal utilization of model capacity. To address this issue, we propose a framework for task-adaptive sampling that dynamically redistributes computational attention according to the spatial importance of the data. Specifically, we introduce the Density-Equalizing Convolutional Neural Network (DECNN), which employs density-equalizing mappings to guide convolution through a learned density function. The density function encodes the relative importance of different regions and induces a transformation that enlarges informative areas while compressing less relevant ones. As a result, convolutional receptive fields are redistributed non-uniformly over the domain, enabling denser sampling in task-relevant regions. By coupling this importance-driven transformation with convolution, DECNN performs adaptive feature extraction that focuses computational resources on informative structures. This leads to more efficient use of model capacity, yielding a lightweight yet expressive architecture while simultaneously producing an interpretable saliency map. Experiments on image classification and craniofacial surface analysis demonstrate that DECNN achieves competitive or superior performance with fewer parameters, accurately identifies task-relevant regions, and remains robust under complex geometric variations.
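The abstract's central idea, a density function that enlarges informative regions so that sampling becomes denser there, can be illustrated in one dimension. The sketch below is not the authors' DECNN implementation: it assumes a fixed, hand-written density rather than a learned one, and it uses the normalized cumulative integral of the density as the density-equalizing map (which is what such a map reduces to in 1D). The function names `density_equalizing_map_1d` and `adaptive_sample_points`, the Gaussian bump at 0.7, and the smoothing kernel are all hypothetical choices made for illustration.

```python
import numpy as np


def density_equalizing_map_1d(density, grid):
    """Map grid points so that each interval is stretched in proportion to its density.

    In 1D this is the normalized cumulative integral of the density,
    rescaled to the original domain [grid[0], grid[-1]].
    """
    # Trapezoidal cumulative integral of the density over the grid.
    increments = 0.5 * (density[1:] + density[:-1]) * np.diff(grid)
    cdf = np.concatenate([[0.0], np.cumsum(increments)])
    cdf /= cdf[-1]
    return grid[0] + (grid[-1] - grid[0]) * cdf


def adaptive_sample_points(density, grid, num_samples):
    """Place samples uniformly in the mapped domain and pull them back.

    The pull-back inverts the monotone map by linear interpolation, so
    high-density (informative) regions receive more sample points.
    """
    mapped = density_equalizing_map_1d(density, grid)
    uniform = np.linspace(grid[0], grid[-1], num_samples)
    return np.interp(uniform, mapped, grid)


# Hypothetical density: a baseline of 1 plus a bump marking an "informative" region near 0.7.
grid = np.linspace(0.0, 1.0, 501)
density = 1.0 + 9.0 * np.exp(-((grid - 0.7) ** 2) / (2 * 0.05 ** 2))

# Non-uniform sample locations: dense near 0.7, sparse elsewhere.
samples = adaptive_sample_points(density, grid, 33)

# Resample a signal at those locations, then apply an ordinary fixed-size kernel.
# The kernel is uniform in the mapped domain but covers less of the original
# domain wherever the density is high.
signal = np.sin(2 * np.pi * grid)
resampled = np.interp(samples, grid, signal)
kernel = np.array([0.25, 0.5, 0.25])
features = np.convolve(resampled, kernel, mode="valid")
```

In this toy setting the convolution itself is unchanged; only the sample locations move. The paper's framework would additionally learn the density from the task and work on images and surfaces, where computing the mapping is more involved than a cumulative sum.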
