Efficient Temporally-Aware DeepFake Detection using H.264 Motion Vectors
This work addresses the need for real-time, temporally-aware DeepFake detection in video calls and streaming. Its contribution is incremental: it builds on existing temporal-analysis ideas and offers a more efficient way to apply them.
The paper targets a limitation of per-frame DeepFake detectors, which ignore temporal inconsistencies between frames. It proposes using H.264 Motion Vectors and Information Masks, and reports that the approach is effective at minimal computational cost compared with per-frame RGB-only methods.
Video DeepFakes are fake media created with Deep Learning (DL) that manipulate a person's expression or identity. Most current DeepFake detection methods analyze each frame independently and therefore miss inconsistencies and unnatural movements between frames. Some newer methods use optical flow models to capture this temporal aspect, but these are computationally expensive. In contrast, we propose using the related but often ignored Motion Vectors (MVs) and Information Masks (IMs) from the H.264 video codec to detect temporal inconsistencies in DeepFakes. Our experiments show that this approach is effective and has minimal computational cost compared with per-frame RGB-only methods. This could lead to new, real-time, temporally-aware DeepFake detection methods for video calls and streaming.
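To make the input representation concrete, the sketch below shows one way block-level motion vectors could be turned into per-pixel motion channels plus a mask. It is an illustration, not the method used in the experiments. The `BlockMotion` record and `rasterize_motion` function are hypothetical names introduced here, and the sketch assumes that an information mask marks the pixels for which the codec supplies a motion vector. In practice the vectors would come from an H.264 decoder (FFmpeg, for instance, can export them as frame side data through its `export_mvs` decoder flag); the sketch starts from already-decoded blocks.

```python
from typing import Iterable, List, NamedTuple, Tuple

Grid = List[List[int]]


class BlockMotion(NamedTuple):
    """Hypothetical record for one motion-compensated block (not a decoder type)."""
    x: int   # left edge of the block in the current frame, in pixels
    y: int   # top edge of the block in the current frame, in pixels
    w: int   # block width in pixels
    h: int   # block height in pixels
    dx: int  # horizontal displacement of the block
    dy: int  # vertical displacement of the block


def rasterize_motion(
    blocks: Iterable[BlockMotion], width: int, height: int
) -> Tuple[Grid, Grid, Grid]:
    """Spread block-level vectors over a height x width pixel grid.

    Returns (mv_x, mv_y, mask). Assumption: mask is 1 where a block
    supplied a vector and 0 elsewhere (e.g. blocks coded without motion).
    """
    mv_x = [[0] * width for _ in range(height)]
    mv_y = [[0] * width for _ in range(height)]
    mask = [[0] * width for _ in range(height)]
    for b in blocks:
        # Clip the block to the frame so partially outside blocks are safe.
        for row in range(max(0, b.y), min(height, b.y + b.h)):
            for col in range(max(0, b.x), min(width, b.x + b.w)):
                mv_x[row][col] = b.dx
                mv_y[row][col] = b.dy
                mask[row][col] = 1
    return mv_x, mv_y, mask
```

Under these assumptions, a block with a zero vector yields zeros in both motion channels but a mask value of 1, whereas a region with no vector at all yields a mask value of 0. Keeping the mask as a separate channel is what lets a detector tell "no motion" apart from "no motion information". The three grids could then be stacked with the RGB channels of the frame as additional network inputs.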