EfficientSRFace: An Efficient Network with Super-Resolution Enhancement for Accurate Face Detection
This work addresses the detection of small, low-resolution faces in dense scenes, which matters for applications such as surveillance and crowd analysis. The contribution is incremental, since it builds on existing deep face detectors.
The paper tackles low-resolution face detection in crowded scenes by introducing a feature-level super-resolution module, achieving competitive performance on the FDDB and WIDER Face datasets with few additional parameters and little computational overhead.
Low-resolution faces, such as the many small faces of a group of people in a crowded scene, are common in dense face prediction tasks. They carry limited visual cues, which makes them hard to distinguish from other small objects and poses a great challenge to accurate face detection. Although deep convolutional neural networks have significantly advanced face detection research in recent years, current deep face detectors rarely take low-resolution faces into account and remain vulnerable in real-world scenarios where large numbers of such faces occur. Consequently, their performance usually degrades on low-resolution face detection. To alleviate this problem, we develop an efficient detector termed EfficientSRFace, which introduces a feature-level super-resolution reconstruction network to enhance the feature representation capability of the model. This module plays an auxiliary role during training and can be removed at inference, so it does not increase inference time. Extensive experiments on public benchmark datasets, such as FDDB and WIDER Face, show that the embedded super-resolution module significantly improves detection accuracy at the cost of a small number of additional parameters and little computational overhead, while helping our model achieve competitive performance compared with state-of-the-art methods.
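The idea of a training-only auxiliary branch can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the names `ToyDetector`, `sr_head`, `strip_auxiliary`, and `sr_weight`, the toy list-based "features", and the weighted-sum loss are all assumptions made for illustration. The sketch only shows that a super-resolution head can contribute to the training loss and then be dropped without changing the inference path.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]


def mse(pred: Vector, target: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


@dataclass
class ToyDetector:
    """Hypothetical detector with an optional, training-only SR branch."""
    backbone: Callable[[Vector], Vector]   # image -> features
    det_head: Callable[[Vector], float]    # features -> face score
    sr_head: Optional[Callable[[Vector], Vector]] = None  # features -> upsampled features

    def training_loss(self, image: Vector, target_score: float,
                      hr_features: Vector, sr_weight: float = 0.1) -> float:
        # Both heads share the backbone features during training.
        feats = self.backbone(image)
        det_loss = (self.det_head(feats) - target_score) ** 2
        if self.sr_head is None:
            return det_loss
        # Assumed form: detection loss plus a weighted reconstruction loss.
        sr_loss = mse(self.sr_head(feats), hr_features)
        return det_loss + sr_weight * sr_loss

    def strip_auxiliary(self) -> "ToyDetector":
        # Drop the SR branch; backbone and detection head are reused as-is.
        return ToyDetector(self.backbone, self.det_head, None)

    def predict(self, image: Vector) -> float:
        # Inference never touches the SR branch.
        return self.det_head(self.backbone(image))


def halve(image: Vector) -> Vector:
    """Toy backbone: scales every value by 0.5."""
    return [0.5 * v for v in image]


def mean_score(feats: Vector) -> float:
    """Toy detection head: average of the features."""
    return sum(feats) / len(feats)


def upsample2x(feats: Vector) -> Vector:
    """Toy SR head: nearest-neighbour upsampling by a factor of 2."""
    return [v for v in feats for _ in range(2)]
```

In this sketch, `ToyDetector(halve, mean_score, upsample2x).strip_auxiliary()` returns a model whose `predict` output is identical to the full model's, which mirrors the claim that the auxiliary module adds no inference cost.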