CV · Dec 22, 2025

Steering Vision-Language Pre-trained Models for Incremental Face Presentation Attack Detection

Haoze Li, Jie Zhang, Guoying Zhao, Stephen Lin, Shiguang Shan

arXiv:2512.19022v1 · h-index: 20

Originality: Incremental advance

AI Analysis

This work addresses the need for privacy-compliant, robust lifelong deployment of face spoofing detection. The advance is incremental because it builds on existing vision-language pre-trained models.

The paper tackles incremental learning for face presentation attack detection without retaining past data. It proposes SVLP-IL, which the authors report reduces catastrophic forgetting and improves performance on unseen domains in experiments across multiple benchmarks.

Face Presentation Attack Detection (PAD) demands incremental learning (IL) to combat evolving spoofing tactics and domains. Privacy regulations, however, forbid retaining past data, necessitating rehearsal-free IL (RF-IL). Vision-Language Pre-trained (VLP) models, with their prompt-tunable cross-modal representations, enable efficient adaptation to new spoofing styles and domains. Capitalizing on this strength, we propose SVLP-IL, a VLP-based RF-IL framework that balances stability and plasticity via Multi-Aspect Prompting (MAP) and Selective Elastic Weight Consolidation (SEWC). MAP isolates domain dependencies, enhances distribution-shift sensitivity, and mitigates forgetting by jointly exploiting universal and domain-specific cues. SEWC selectively preserves critical weights from previous tasks, retaining essential knowledge while allowing flexibility for new adaptations. Comprehensive experiments across multiple PAD benchmarks show that SVLP-IL significantly reduces catastrophic forgetting and enhances performance on unseen domains. SVLP-IL offers a privacy-compliant, practical solution for robust lifelong PAD deployment in RF-IL settings.
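
The abstract does not spell out how SEWC selects weights, so the following is only a rough sketch of the general idea, not the authors' method. It takes the standard Elastic Weight Consolidation penalty, 0.5 * lambda * sum_i F_i * (theta_i - theta_old_i)^2, and applies it only to the fraction of parameters with the largest Fisher importance. The function name selective_ewc_penalty, the keep_ratio parameter, and the top-k selection rule are all assumptions made for illustration.

```python
from typing import Sequence


def selective_ewc_penalty(
    theta: Sequence[float],
    theta_old: Sequence[float],
    fisher: Sequence[float],
    keep_ratio: float = 0.5,
    lam: float = 1.0,
) -> float:
    """Quadratic EWC penalty restricted to the most important weights.

    Hypothetical illustration: the top-k selection by Fisher value is an
    assumed rule, not taken from the paper.
    """
    if not (len(theta) == len(theta_old) == len(fisher)):
        raise ValueError("theta, theta_old and fisher must have equal length")
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")

    # Number of parameters to protect (at least one).
    k = max(1, int(round(keep_ratio * len(fisher))))

    # Indices of the k parameters with the largest Fisher importance.
    protected = sorted(range(len(fisher)), key=lambda i: fisher[i], reverse=True)[:k]

    # Standard EWC term, summed only over the protected parameters;
    # the remaining weights are left free to adapt to the new task.
    return 0.5 * lam * sum(
        fisher[i] * (theta[i] - theta_old[i]) ** 2 for i in protected
    )


if __name__ == "__main__":
    new = [1.0, 2.0, 3.0, 4.0]
    old = [0.0, 0.0, 0.0, 0.0]
    importance = [0.1, 0.9, 0.5, 0.2]
    print(selective_ewc_penalty(new, old, importance, keep_ratio=0.5))
```

In a training loop this term would be added to the new-task loss, so that drift in the protected weights is penalized while the unprotected ones stay plastic. With keep_ratio set to 1.0 the sketch reduces to ordinary EWC.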
