Quantitative Metrics for Evaluating Explanations of Video DeepFake Detectors
This work addresses the need for better explanation tools in content moderation, though its contribution is incremental: it focuses on evaluation metrics rather than new detection methods.
The paper tackles the problem of evaluating explanations for video DeepFake detectors. It proposes a set of metrics to assess the visual quality and informativeness of explanations from a human-centric perspective, and uses them to compare common approaches on the DFDC and DFD datasets.
The proliferation of DeepFake technology is a rising challenge in today's society, owing to more powerful and accessible generation methods. To counter this, the research community has developed detectors of ever-increasing accuracy. However, the ability to explain the decisions of such models to users is lagging behind and is considered an accessory in large-scale benchmarks, despite being a crucial requirement for the correct deployment of automated tools for content moderation. We attribute the issue to the reliance on qualitative comparisons and the lack of established metrics. We describe a simple set of metrics to evaluate the visual quality and informativeness of explanations of video DeepFake classifiers from a human-centric perspective. With these metrics, we compare common approaches to improving explanation quality and discuss their effect on both classification and explanation performance on the recent DFDC and DFD datasets.