CV · Oct 18, 2024

On the Influence of Shape, Texture and Color for Learning Semantic Segmentation

arXiv:2410.14878v2 · 3 citations · h-index: 21 · ECAI
Originality: Incremental advance
AI Analysis

This work addresses cue biases in deep learning for semantic segmentation. The advance is incremental: it extends prior research on image classification to segmentation tasks.

The study investigated how shape, texture, and color cues individually and in combination influence learning in semantic segmentation, finding that no single cue dominates but shape + color improves prediction of small objects and border pixels, with consistent cue performance across convolutional and transformer architectures.

Recent research has investigated the shape and texture biases of pre-trained deep neural networks (DNNs) in image classification. Those works test how much a trained DNN relies on specific image cues like texture. The present study shifts the focus to understanding cue influence during training: it analyzes what DNNs can learn from shape, texture, and color cues in the absence of the others, and investigates their individual and combined influence on learning success. We analyze these cue influences at multiple levels by decomposing datasets into cue-specific versions. Addressing semantic segmentation, we learn the given task from these reduced-cue datasets, creating cue experts. Early fusion of cues is performed by constructing appropriate datasets. This is complemented by a late fusion of experts, which allows us to study cue influence in a location-dependent manner at the pixel level. Experiments on Cityscapes, PASCAL Context, and a synthetic CARLA dataset show that while no single cue dominates, the shape + color expert predominantly improves the prediction of small objects and border pixels. The order of cue performance is consistent across the tested convolutional and transformer architectures, indicating similar cue extraction capabilities, although pre-trained transformers are said to be more biased towards shape than convolutional neural networks.
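To make the idea of a pixel-level late fusion of cue experts more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's method: the abstract does not state the fusion rule, so this example assumes a simple "most confident expert wins" rule, and the function name `late_fuse` and the expert names are hypothetical. Each expert is assumed to output a per-pixel class probability map of shape (H, W, C).

```python
import numpy as np

def late_fuse(expert_probs):
    """Fuse per-pixel predictions of several cue experts.

    expert_probs: dict mapping an expert name (hypothetical, e.g. "shape")
    to an array of shape (H, W, C) holding class probabilities.

    ASSUMPTION: the fusion rule used here (pick, per pixel, the expert with
    the highest top-class probability) is an illustrative choice, not the
    rule from the paper.
    """
    names = list(expert_probs)
    stacked = np.stack([expert_probs[n] for n in names])   # (E, H, W, C)

    confidence = stacked.max(axis=-1)             # (E, H, W) top-class probability
    winner = confidence.argmax(axis=0)            # (H, W) index of winning expert
    labels_per_expert = stacked.argmax(axis=-1)   # (E, H, W) each expert's label map

    # For every pixel, take the label predicted by the winning expert.
    fused = np.take_along_axis(labels_per_expert, winner[None], axis=0)[0]
    return fused, winner, names
```

The returned `winner` map records which expert decided each pixel, which is one way to inspect cue influence by location, for example by comparing how often each expert wins on border pixels versus object interiors.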
