CV LG · Jun 20, 2020

FaceHack: Triggering backdoored facial recognition systems using facial characteristics

Esha Sarkar, Hadjer Benkraouda, Michail Maniatakos

arXiv:2006.11623v114.742 citations

Originality: Incremental advance

AI Analysis

This work addresses security vulnerabilities in critical applications like biometric validation, but it is incremental as it builds on known backdoor attack methods by adapting triggers to facial features.

The paper tackles the problem of backdoor attacks in facial recognition systems by showing that changes to facial characteristics, such as those from social-media filters or natural muscle movements, can trigger malicious behavior without being detected by state-of-the-art defenses, while maintaining model performance.

Recent advances in Machine Learning (ML) have opened up new avenues for its extensive use in real-world applications. Facial recognition, specifically, is used in applications ranging from simple friend suggestions on social-media platforms to critical security applications such as biometric validation in automated immigration at airports. In these scenarios, security vulnerabilities in such ML algorithms pose serious threats with severe outcomes. Recent work demonstrated that Deep Neural Networks (DNNs), typically used in facial recognition systems, are susceptible to backdoor attacks; in other words, the DNNs turn malicious in the presence of a unique trigger. To remain unnoticeable, an ideal trigger is small, localized, and typically not a part of the main image. Therefore, detection mechanisms have focused on detecting these distinct trigger-based outliers statistically or through their reconstruction. In this work, we demonstrate that specific changes to facial characteristics may also be used to trigger malicious behavior in an ML model. The changes in the facial attributes may be embedded artificially using social-media filters or introduced naturally using movements in facial muscles. By construction, our triggers are large, adaptive to the input, and spread over the entire image. We evaluate the success of the attack and validate that it does not interfere with the performance criteria of the model. We also substantiate the undetectability of our triggers by exhaustively testing them with state-of-the-art defenses.
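The contrast the abstract draws between a small, localized trigger and one that is large, input-adaptive, and spread over the whole image can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration only: `patch_trigger`, `global_trigger`, and `changed_fraction` are hypothetical helpers, and the smooth tint in `global_trigger` is a simple stand-in, not the social-media filters or facial-muscle changes used in the paper.

```python
import numpy as np


def patch_trigger(img, size=4, value=1.0):
    """Classic-style trigger: a small fixed square in the bottom-right corner.

    Hypothetical helper; `img` is an (H, W, C) float array in [0, 1].
    """
    out = img.copy()
    out[-size:, -size:, :] = value
    return out


def global_trigger(img, strength=0.2):
    """Illustrative stand-in for a filter-like trigger (NOT the paper's method).

    Adds a smooth left-to-right tint scaled by each pixel's headroom
    (1 - img), so the perturbation depends on the input and touches
    every pixel.
    """
    _, w, _ = img.shape
    ramp = np.linspace(0.0, 1.0, w).reshape(1, w, 1)
    out = img + strength * (0.5 + 0.5 * ramp) * (1.0 - img)
    return np.clip(out, 0.0, 1.0)


def changed_fraction(original, triggered, tol=1e-6):
    """Fraction of pixel locations where any channel was modified."""
    changed = np.any(np.abs(original - triggered) > tol, axis=-1)
    return float(changed.mean())


rng = np.random.default_rng(0)
# Two synthetic 32x32 RGB "images" with values kept away from 0 and 1.
img_a = rng.uniform(0.1, 0.9, size=(32, 32, 3))
img_b = rng.uniform(0.1, 0.9, size=(32, 32, 3))

patch_frac = changed_fraction(img_a, patch_trigger(img_a))
global_frac = changed_fraction(img_a, global_trigger(img_a))

# The patch is identical for every input; the global perturbation is not.
patch_delta_same = np.allclose(
    patch_trigger(img_a)[-4:, -4:], patch_trigger(img_b)[-4:, -4:]
)
global_delta_same = np.allclose(
    global_trigger(img_a) - img_a, global_trigger(img_b) - img_b
)

print(f"patch trigger modifies {patch_frac:.2%} of pixels")
print(f"global trigger modifies {global_frac:.2%} of pixels")
```

In this toy setup the patch alters a 4x4 corner of a 32x32 image, while the tint alters every pixel by an amount that differs from image to image. Defenses that look for a small, fixed, reconstructable pattern are built around the first footprint, which is the detection assumption the abstract says the paper's triggers are designed to sidestep.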
