A Rapid Test for Accuracy and Bias of Face Recognition Technology
This work addresses the need for efficient, accessible testing of face recognition technology to promote responsible use, though its improvement over existing benchmarking methods is incremental.
The paper tackles the costly and slow measurement of face recognition accuracy by proposing a method that benchmarks systems quickly and without manual annotation, using approximate labels and the embeddings of the models under evaluation. It also introduces a public benchmark that reveals demographic biases, such as lower accuracy for Asian women.
Measuring the accuracy of face recognition (FR) systems is essential for improving performance and ensuring responsible use. Accuracy is typically estimated using large annotated datasets, which are costly and difficult to obtain. We propose a novel method for 1:1 face verification that benchmarks FR systems quickly and without manual annotation, starting from approximate labels (e.g., from web search results). Unlike previous methods for training-set label cleaning, ours leverages the embedding representation of the models being evaluated, achieving high accuracy on smaller test datasets. Our approach reliably estimates FR accuracy and ranking, significantly reducing the time and cost of manual labeling. We also introduce the first public benchmark of five FR cloud services, revealing demographic biases, particularly lower accuracy for Asian women. Our rapid test method can democratize FR testing, promoting scrutiny and responsible use of the technology. Our method is provided as a publicly accessible tool at https://github.com/caltechvisionlab/frt-rapid-test
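To make the 1:1 verification setting concrete, the following minimal sketch shows how error rates could be computed from model embeddings once pair labels are available. It is an illustration only, not the label-cleaning procedure proposed in the paper: the function names, the use of cosine similarity, the threshold of 0.5, and the toy two-dimensional embeddings are all assumptions chosen for the example.

```python
import numpy as np


def cosine_similarity(a, b):
    """Row-wise cosine similarity between two arrays of embeddings."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return np.sum(a * b, axis=-1)


def verification_error_rates(emb_a, emb_b, same_identity, threshold):
    """Return (false match rate, false non-match rate) for face pairs.

    emb_a, emb_b: arrays of shape (n_pairs, dim), one embedding per face.
    same_identity: boolean labels, True if a pair shows the same person.
    threshold: hypothetical decision threshold on the similarity score.
    Assumes at least one same-identity and one different-identity pair.
    """
    scores = cosine_similarity(emb_a, emb_b)
    predicted_match = scores >= threshold
    same = np.asarray(same_identity, dtype=bool)
    # Same-identity pairs that the system rejects.
    fnmr = float(np.mean(~predicted_match[same]))
    # Different-identity pairs that the system accepts.
    fmr = float(np.mean(predicted_match[~same]))
    return fmr, fnmr


# Toy data (made up for illustration): five pairs of 2-D embeddings.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
emb_b = np.array([[1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [1.0, 0.0], [0.0, 1.0]])
same_identity = [True, False, True, False, True]

fmr, fnmr = verification_error_rates(emb_a, emb_b, same_identity, threshold=0.5)
print(f"FMR={fmr:.3f}  FNMR={fnmr:.3f}")
```

In this toy data, one of the three same-identity pairs has orthogonal embeddings and is rejected, while neither different-identity pair is accepted. With approximate labels such as web search results, some entries of `same_identity` would be wrong, which is the source of error the rapid test is designed to handle.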