SMAE: Few-shot Learning for HDR Deghosting with Saturation-Aware Masked Autoencoders
This work addresses the challenge of few-shot HDR deghosting for computer vision applications, offering a novel semi-supervised method that reduces data dependency, though it is incremental in its approach.
The paper tackles the problem of generating high-quality HDR images from dynamic scenes with limited labeled data. It proposes SSHDR, a two-stage semi-supervised approach that first reconstructs saturated regions using a Saturated Mask AutoEncoder (SMAE) and then removes ghosts via iterative learning, achieving state-of-the-art performance across datasets with few labeled samples.
Generating a high-quality High Dynamic Range (HDR) image from dynamic scenes has recently been extensively studied by exploiting Deep Neural Networks (DNNs). Most DNN-based methods require a large amount of training data with ground truth, which is tedious and time-consuming to collect. Few-shot HDR imaging aims to generate satisfactory images with limited data. However, it is difficult for modern DNNs to avoid overfitting when trained on only a few images. In this work, we propose a novel semi-supervised approach, called SSHDR, to realize few-shot HDR imaging via two stages of training. Unlike previous methods, which directly recover content and remove ghosts simultaneously and thus struggle to reach an optimum, we first generate the content of saturated regions with a self-supervised mechanism and then address ghosts via an iterative semi-supervised learning framework. Concretely, considering that saturated regions can be regarded as masked regions of the Low Dynamic Range (LDR) input, we design a Saturated Mask AutoEncoder (SMAE) to learn a robust feature representation and reconstruct a non-saturated HDR image. We also propose an adaptive pseudo-label selection strategy to pick high-quality HDR pseudo-labels in the second stage to avoid the effect of mislabeled samples. Experiments demonstrate that SSHDR outperforms state-of-the-art methods quantitatively and qualitatively within and across different datasets, achieving appealing HDR visualization with few labeled samples.
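To make the idea of treating saturated regions as a mask more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a grayscale LDR image with intensities normalized to [0, 1]. The thresholds `low` and `high` and the function names `saturation_mask`, `apply_mask`, and `masked_ratio` are hypothetical choices made for illustration only.

```python
from typing import List

# Assumption: a grayscale LDR image as nested lists, intensities in [0, 1].
Image = List[List[float]]
Mask = List[List[int]]


def saturation_mask(ldr: Image, low: float = 0.05, high: float = 0.95) -> Mask:
    """Mark under- or over-exposed pixels with 1, well-exposed pixels with 0.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    return [[1 if (p <= low or p >= high) else 0 for p in row] for row in ldr]


def apply_mask(ldr: Image, mask: Mask) -> Image:
    """Zero out the saturated pixels, analogous to how a masked
    autoencoder hides part of its input before reconstruction."""
    return [
        [0.0 if m else p for p, m in zip(row, mask_row)]
        for row, mask_row in zip(ldr, mask)
    ]


def masked_ratio(mask: Mask) -> float:
    """Fraction of pixels that the mask hides."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0


if __name__ == "__main__":
    ldr = [[0.0, 0.5, 1.0],
           [0.96, 0.2, 0.04]]
    mask = saturation_mask(ldr)
    print(mask)
    print(apply_mask(ldr, mask))
    print(masked_ratio(mask))
```

In this sketch the mask is derived from the image content itself rather than sampled at random, which is the sense in which saturation can act as a natural masking signal. A reconstruction network trained to fill in the hidden pixels is outside the scope of the example.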