CV | May 30, 2025

Leveraging Intermediate Features of Vision Transformer for Face Anti-Spoofing

arXiv:2505.24402v2 | 2 citations | h-index: 21 | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality: Incremental advance
AI Analysis

The paper addresses security vulnerabilities in face authentication systems, though the contribution is incremental because it builds on existing ViT approaches.

The paper tackles the detection of spoofing attacks on face recognition systems. It proposes a Vision Transformer (ViT) based method that uses intermediate features together with two data augmentation methods, and reports improved accuracy on the OULU-NPU and SiW datasets.

Face recognition systems are designed to be robust against changes in head pose, illumination, and blurring during image capture. If a malicious person presents a face photo of the registered user, they may bypass the authentication process illegally. Such spoofing attacks need to be detected before face recognition. In this paper, we propose a spoofing attack detection method based on Vision Transformer (ViT) to detect minute differences between live and spoofed face images. The proposed method utilizes the intermediate features of ViT, which have a good balance between local and global features that are important for spoofing attack detection, for calculating loss in training and score in inference. The proposed method also introduces two data augmentation methods: face anti-spoofing data augmentation and patch-wise data augmentation, to improve the accuracy of spoofing attack detection. We demonstrate the effectiveness of the proposed method through experiments using the OULU-NPU and SiW datasets. The project page is available at: https://gsisaoki.github.io/FAS-ViT-CVPRW/ .
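
As a rough illustration of the idea of taking the loss and the score from an intermediate block rather than the final one, here is a minimal, dependency-free Python sketch. It is not the authors' implementation. The abstract does not say which block is used, how tokens are pooled, or which loss is applied, so the block index, mean pooling, sigmoid head, binary cross-entropy, and the label convention (1 = live) are all assumptions. `toy_block` is a hypothetical stand-in for a real transformer block.

```python
import math
import random


def toy_block(tokens, weight):
    """Hypothetical stand-in for one transformer block.

    A real ViT block applies self-attention and an MLP; this only applies a
    small residual update so the sketch stays dependency-free.
    """
    return [[x + weight * math.tanh(x) for x in tok] for tok in tokens]


def forward_with_intermediates(tokens, block_weights):
    """Run every block and keep the token features produced after each one."""
    features = []
    for w in block_weights:
        tokens = toy_block(tokens, w)
        features.append(tokens)
    return features


def spoof_score(features, layer_index, head_weights, head_bias):
    """Score from one intermediate block (assumed: mean pooling + sigmoid head)."""
    tokens = features[layer_index]
    dim = len(tokens[0])
    pooled = [sum(tok[d] for tok in tokens) / len(tokens) for d in range(dim)]
    logit = sum(w * p for w, p in zip(head_weights, pooled)) + head_bias
    return 1.0 / (1.0 + math.exp(-logit))


def bce_loss(score, label):
    """Binary cross-entropy on the intermediate score (assumed: label 1 = live)."""
    eps = 1e-7
    score = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(score) + (1 - label) * math.log(1.0 - score))


# Toy input: 9 patch tokens with 4 features each, passed through 12 blocks.
random.seed(0)
tokens = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(9)]
features = forward_with_intermediates(tokens, block_weights=[0.1] * 12)

# The same intermediate score serves training (loss) and inference (threshold).
score = spoof_score(features, layer_index=5,
                    head_weights=[0.5, -0.25, 0.1, 0.3], head_bias=0.0)
loss = bce_loss(score, label=1)
```

In a real model the per-block features would typically be collected from the ViT itself, and the choice of `layer_index` would be tuned on validation data; the value 5 here is arbitrary.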

