Bayesian Active Learning for Semantic Segmentation
This work addresses the problem of reducing annotation costs for semantic segmentation in computer vision, offering a domain-specific incremental improvement.
The paper tackles the high cost of pixel-level labeling for semantic segmentation by proposing a Bayesian active learning framework using Balanced Entropy (BalEnt) for uncertainty measurement, achieving supervised-level mIoU with only a fraction of labeled pixels and outperforming previous state-of-the-art methods by a large margin.
Fully supervised training of semantic segmentation models is costly and challenging because every pixel in an image must be labeled. Sparse pixel-level annotation methods have therefore been introduced to train models with only a subset of the pixels in each image. We introduce a Bayesian active learning framework based on sparse pixel-level annotation that uses a pixel-level Bayesian uncertainty measure based on Balanced Entropy (BalEnt) [84]. BalEnt captures the information between the models' predicted marginalized probability distribution and the pixel labels. BalEnt scales linearly, has a closed analytical form, and can be calculated independently for each pixel without relational computations involving other pixels. We train our proposed active learning framework on the Cityscapes, CamVid, ADE20K and VOC2012 benchmark datasets and show that it reaches supervised levels of mIoU using only a fraction of the labeled pixels, while outperforming the previous state-of-the-art active learning models by a large margin.
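To make the per-pixel selection step concrete, the following is a minimal sketch of uncertainty-based pixel selection. It does not implement BalEnt, whose formula is not given here; it substitutes plain Shannon entropy of the marginalized predictive distribution as a stand-in score. The function names (`marginal_probs`, `entropy`, `select_pixels`), the use of several stochastic forward passes to obtain the marginal, and the toy probabilities are all assumptions made for illustration.

```python
import math


def marginal_probs(samples):
    """Average K class-probability vectors for one pixel into one marginal vector."""
    k = len(samples)
    n_classes = len(samples[0])
    return [sum(s[c] for s in samples) / k for c in range(n_classes)]


def entropy(probs):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_pixels(pixel_samples, budget):
    """Score each pixel independently and return the `budget` most uncertain indices.

    pixel_samples: list with one entry per pixel, each a list of K
    class-probability vectors (e.g. from K stochastic forward passes).
    NOTE: entropy is a stand-in score here, not Balanced Entropy.
    """
    scores = [entropy(marginal_probs(s)) for s in pixel_samples]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget], scores


# Hypothetical toy input: 3 pixels, 2 stochastic predictions each, 3 classes.
pixel_samples = [
    [[0.98, 0.01, 0.01], [0.96, 0.02, 0.02]],  # confident pixel
    [[0.40, 0.30, 0.30], [0.30, 0.40, 0.30]],  # highly ambiguous pixel
    [[0.70, 0.20, 0.10], [0.60, 0.30, 0.10]],  # moderately ambiguous pixel
]
selected, scores = select_pixels(pixel_samples, budget=2)
print(selected)
```

Because each score depends only on that pixel's own predictions, the cost grows linearly with the number of pixels, which mirrors the per-pixel independence described above. In an active learning loop, the selected indices would be sent for annotation and the model retrained on the enlarged labeled set.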