Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards?
This work addresses the limited generalization of PAD systems for ID cards. The contribution is incremental, since it applies existing FM techniques to a specific domain.
The study tackles the challenge of generalizing presentation attack detection (PAD) to ID cards from diverse issuing countries by leveraging Foundation Models (FM), and finds that bona fide images are crucial for generalization in zero-shot and fine-tuning scenarios.
Nowadays, one of the main challenges in presentation attack detection (PAD) on ID cards is generalising across the diversity of countries that issue ID cards. Most PAD systems are trained on one, two, or three ID documents because of privacy protection concerns. As a result, they do not achieve commercially competitive results when tested on ID cards from an unseen country. In this scenario, Foundation Models (FM) trained on huge datasets can help to improve generalisation capabilities. This work aims to benchmark the capabilities of FMs and to examine how they can be adapted to improve the generalisation of PAD on ID documents. Different test protocols were used, covering zero-shot and fine-tuning settings, on two different ID card datasets: one private dataset based on Chilean IDs and one open-set dataset based on ID cards from three countries (Finland, Spain, and Slovakia). Our findings indicate that bona fide images are the key to generalisation.
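To make the zero-shot setting concrete, the following is a minimal sketch of one common way to score an image without any training on the target country: compare its embedding with two class prototypes and convert the similarities into an attack probability. It is an illustration only, not the method used in this work. The function names, the prototype vectors, and the temperature value are all assumptions, and the embeddings are assumed to come from some frozen foundation model that is not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("zero-norm vector has no direction")
    return dot / (norm_a * norm_b)


def zero_shot_attack_score(
    image_emb: Sequence[float],
    bona_fide_proto: Sequence[float],
    attack_proto: Sequence[float],
    temperature: float = 0.07,  # assumed value, not taken from the paper
) -> float:
    """Return a probability-like attack score in [0, 1].

    The prototypes are hypothetical: they could be text-prompt embeddings
    or mean embeddings of a few reference images per class.
    """
    s_bf = cosine_similarity(image_emb, bona_fide_proto) / temperature
    s_att = cosine_similarity(image_emb, attack_proto) / temperature
    # Two-class softmax, shifted by the maximum for numerical stability.
    m = max(s_bf, s_att)
    e_bf = math.exp(s_bf - m)
    e_att = math.exp(s_att - m)
    return e_att / (e_bf + e_att)


if __name__ == "__main__":
    # Toy 3-dimensional embeddings, purely illustrative.
    bona_fide = [1.0, 0.0, 0.0]
    attack = [0.0, 1.0, 0.0]
    sample = [0.2, 0.9, 0.1]
    print(zero_shot_attack_score(sample, bona_fide, attack))
```

In a fine-tuning protocol, by contrast, the model weights (or a small classification head) would be updated on labelled ID card images before scoring, so the prototypes above would be replaced by a learned decision function.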