CV · AI · Jul 31, 2025

The Impact of Image Resolution on Face Detection: A Comparative Analysis of MTCNN, YOLOv XI and YOLOv XII models

arXiv:2507.23341v1 · h-index: 12
Originality: Synthesis-oriented
AI Analysis

This work addresses resolution challenges in face detection for applications like surveillance and biometric authentication, but it is incremental as it compares existing methods on new data without introducing novel techniques.

The study examines how input resolution affects face detection performance by comparing YOLOv11, YOLOv12, and MTCNN. It finds that YOLOv11 achieves the highest detection accuracy, especially at higher resolutions, on metrics such as mAP50-95, while YOLOv12 shows slightly better recall.

Face detection is a crucial component in many AI-driven applications such as surveillance, biometric authentication, and human-computer interaction. However, real-world conditions like low-resolution imagery present significant challenges that degrade detection performance. In this study, we systematically investigate the impact of input resolution on the accuracy and robustness of three prominent deep learning-based face detectors: YOLOv11, YOLOv12, and MTCNN. Using the WIDER FACE dataset, we conduct extensive evaluations across multiple image resolutions (160x160, 320x320, and 640x640) and assess each model's performance using metrics such as precision, recall, mAP50, mAP50-95, and inference time. Results indicate that YOLOv11 outperforms YOLOv12 and MTCNN in terms of detection accuracy, especially at higher resolutions, while YOLOv12 exhibits slightly better recall. MTCNN, although competitive in landmark localization, lags in real-time inference speed. Our findings provide actionable insights for selecting resolution-aware face detection models suitable for varying operational constraints.
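As a rough illustration of the evaluation the abstract describes, the sketch below rescales ground-truth boxes to the three square input sizes and computes precision and recall at a single IoU threshold of 0.5. It is not the authors' pipeline. The helper names (`scale_box`, `iou`, `precision_recall`) are hypothetical, the plain (non-letterboxed) resize is an assumption, and the greedy matching is a simplification. mAP50-95 would additionally average the average precision over IoU thresholds from 0.50 to 0.95, which this sketch does not do.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels

RESOLUTIONS = (160, 320, 640)  # square input sizes named in the abstract


def scale_box(box: Box, src_w: int, src_h: int, size: int) -> Box:
    """Map a box from a src_w x src_h image onto a size x size image.

    Assumes a plain resize that ignores aspect ratio (no letterbox padding).
    """
    sx, sy = size / src_w, size / src_h
    x1, y1, x2, y2 = box
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall at one IoU threshold.

    Greedy one-to-one matching; preds are assumed to be sorted by
    descending confidence before being passed in.
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy ground-truth face box in a 1280 x 640 image, rescaled per resolution.
    face = (100.0, 50.0, 200.0, 150.0)
    for size in RESOLUTIONS:
        print(size, scale_box(face, 1280, 640, size))
```

A small face shrinks in proportion to the input size, so a box that is 100 pixels wide in a 1280-pixel-wide image covers only 12.5 pixels at 160x160. That gives some intuition for why the abstract reports degraded detection at low resolution.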

