Texture Underfitting for Domain Adaptation
This work addresses the challenge of generalizing semantic segmentation models across domains, such as from simulated to real data, which is crucial for applications like autonomous driving, but it is incremental as it builds on existing domain adaptation techniques.
The paper tackles the problem of domain adaptation for semantic segmentation by addressing neural networks' tendency to overfit to texture rather than learning structural information, using random image stylization and a training procedure to promote texture underfitting, resulting in improved performance over conventional methods in synthetic-to-real adaptation experiments.
Comprehensive semantic segmentation is one of the key components for robust scene understanding and a requirement to enable autonomous driving. Driven by large scale datasets, convolutional neural networks show impressive results on this task. However, a segmentation algorithm generalizing to various scenes and conditions would require an enormously diverse dataset, making the labour intensive data acquisition and labeling process prohibitively expensive. Under the assumption of structural similarities between segmentation maps, domain adaptation promises to resolve this challenge by transferring knowledge from existing, potentially simulated datasets to new environments where no supervision exists. While the performance of this approach is contingent on the concept that neural networks learn a high level understanding of scene structure, recent work suggests that neural networks are biased towards overfitting to texture instead of learning structural and shape information. Considering the ideas underlying semantic segmentation, we employ random image stylization to augment the training dataset and propose a training procedure that facilitates texture underfitting to improve the performance of domain adaptation. In experiments with supervised as well as unsupervised methods for the task of synthetic-to-real domain adaptation, we show that our approach outperforms conventional training methods.