Extracting deep local features to detect manipulated images of human faces
This work addresses the problem of protecting against malicious use of realistic manipulated face images and represents an incremental improvement in detection methods.
The paper tackles the detection of manipulated face images by proposing that local image features shared across manipulated regions are the key to detection. It introduces a lightweight architecture and a multitask training scheme that achieve state-of-the-art results on the FaceForensics++ dataset with significantly fewer parameters.
Recent developments in computer vision and machine learning have made it possible to create realistic manipulated videos of human faces, raising the issue of ensuring adequate protection against the malevolent effects unlocked by such capabilities. In this paper we propose that local image features shared across manipulated regions are the key element for the automatic detection of manipulated face images. We also design a lightweight architecture with the correct structural biases for extracting such features and derive a multitask training scheme that consistently outperforms image class supervision alone. The trained networks achieve state-of-the-art results on the FaceForensics++ dataset using a significantly reduced number of parameters and are shown to work well in detecting fully generated face images.
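To make the idea of multitask training more concrete, the following minimal sketch combines an image-level classification loss with one auxiliary loss. The abstract does not say what the auxiliary task is. The per-pixel manipulation mask used here, along with the names `bce`, `multitask_loss`, and `mask_weight`, are assumptions chosen for illustration and should not be read as the paper's actual formulation.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one predicted probability p and a target y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def multitask_loss(class_prob, class_label, mask_probs, mask_targets, mask_weight=1.0):
    """Image-level loss plus a weighted mean per-pixel loss.

    mask_probs and mask_targets are flat lists of per-pixel values.
    The mask task and mask_weight are illustrative assumptions,
    not the training scheme described in the paper.
    """
    if not mask_probs or len(mask_probs) != len(mask_targets):
        raise ValueError("mask prediction and target must be non-empty and equal in length")
    class_term = bce(class_prob, class_label)
    mask_term = sum(bce(p, t) for p, t in zip(mask_probs, mask_targets)) / len(mask_probs)
    return class_term + mask_weight * mask_term
```

Setting `mask_weight=0.0` reduces the objective to image class supervision alone, which is the baseline the abstract compares the multitask scheme against.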