On the Effectiveness of Vision Transformers for Zero-shot Face Anti-Spoofing
This work targets security-critical applications by improving generalization to unseen attacks, though the contribution is incremental: it applies an existing model to a new task.
The paper addresses the vulnerability of face recognition systems to presentation attacks by proposing a vision transformer-based approach to zero-shot face anti-spoofing. It reports state-of-the-art performance on the HQ-WMCA and SiW-M datasets, along with significant cross-database improvements.
The vulnerability of face recognition systems to presentation attacks has limited their application in security-critical scenarios. Automatic methods for detecting such malicious attempts are essential for the safe use of facial recognition technology. Although various methods have been proposed for detecting such attacks, most of them overfit the training set and fail to generalize to unseen attacks and environments. In this work, we use transfer learning from the vision transformer model for the zero-shot anti-spoofing task. The effectiveness of the proposed approach is demonstrated through experiments on publicly available datasets. The proposed approach outperforms state-of-the-art methods by a large margin on the zero-shot protocols of the HQ-WMCA and SiW-M datasets. The model also achieves a significant boost in cross-database performance.
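To make the zero-shot setting concrete, the following is a minimal sketch of how an unseen-attack split could be built, assuming a leave-one-attack-out design in which one attack type is withheld from training and appears only at test time. The `Sample` record, the attack names (`print`, `mask`), the file names, and the subject-based split are hypothetical and chosen only for illustration; they are not the actual HQ-WMCA or SiW-M protocol definitions.

```python
from collections import namedtuple

# Hypothetical record: `label` is "bonafide" or the name of an attack type.
Sample = namedtuple("Sample", ["path", "label", "subject"])

BONAFIDE = "bonafide"


def leave_one_attack_out(samples, held_out_attack, test_subjects):
    """Build a zero-shot split in which `held_out_attack` is unseen in training.

    Training keeps bona fide samples and all other attack types from the
    training subjects. Testing keeps bona fide samples and only the held-out
    attack type from the test subjects. Everything else is discarded.
    """
    test_subjects = set(test_subjects)
    train, test = [], []
    for s in samples:
        if s.subject in test_subjects:
            # Test set: bona fide plus the single unseen attack type.
            if s.label in (BONAFIDE, held_out_attack):
                test.append(s)
        elif s.label != held_out_attack:
            # Training set: never contains the held-out attack type.
            train.append(s)
    return train, test


# Toy data (assumed, not taken from any dataset).
samples = [
    Sample("a.png", BONAFIDE, 1),
    Sample("b.png", "print", 1),
    Sample("c.png", "mask", 1),
    Sample("d.png", BONAFIDE, 2),
    Sample("e.png", "print", 2),
    Sample("f.png", "mask", 2),
]

train, test = leave_one_attack_out(samples, held_out_attack="mask", test_subjects=[2])
print([s.path for s in train])
print([s.path for s in test])
```

Repeating this split once per attack type and averaging the resulting error rates is one common way to summarize performance on unseen attacks; the exact metrics and splits used in the paper may differ.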