CV · Oct 8, 2022

Multi-Scale Wavelet Transformer for Face Forgery Detection

arXiv:2210.03899v1 · 15 citations · h-index: 49
Originality: Incremental advance
AI Analysis

This work addresses face forgery detection, a domain-specific problem, with incremental improvements in feature aggregation.

The paper targets the limited expressive ability of face forgery detectors that rely on a single level of frequency information. It proposes a multi-scale wavelet transformer framework that the authors report to be efficient and effective in both within-dataset and cross-dataset scenarios.

Currently, many face forgery detection methods aggregate spatial and frequency features to enhance generalization ability and achieve promising performance in the cross-dataset scenario. However, these methods leverage only a single level of frequency information, which limits their expressive ability. To overcome this limitation, we propose a multi-scale wavelet transformer framework for face forgery detection. Specifically, to take full advantage of the multi-scale and multi-frequency wavelet representation, we gradually aggregate the multi-scale wavelet representation at different stages of the backbone network. To better fuse the frequency features with the spatial features, frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces. Meanwhile, cross-modality attention is proposed to fuse the frequency features with the spatial features. These two attention modules are calculated through a unified transformer block for efficiency. A wide variety of experiments demonstrate that the proposed method is efficient and effective in both within-dataset and cross-dataset settings.
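To make the idea of a multi-scale wavelet representation fused with spatial features more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not name a wavelet family, so the Haar wavelet is an assumption, and the function names (`haar_dwt2`, `wavelet_pyramid`, `cross_attention`) are hypothetical. The attention step omits the learned projections a real transformer block would have and only illustrates spatial tokens querying frequency tokens.

```python
import numpy as np


def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform on an (H, W) array.

    H and W must be even. Returns the low-frequency band and a tuple of
    three detail bands, each of shape (H/2, W/2).
    """
    a = x[0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0
    row_diff = (a + b - c - d) / 2.0  # change between the two rows
    col_diff = (a - b + c - d) / 2.0  # change between the two columns
    diag = (a - b - c + d) / 2.0      # diagonal change
    return low, (row_diff, col_diff, diag)


def wavelet_pyramid(x, levels=3):
    """Apply haar_dwt2 repeatedly to the low band.

    Returns a list with one (4, H/2^k, W/2^k) array per level k = 1..levels,
    stacking the low band and the three detail bands.
    """
    pyramid = []
    low = np.asarray(x, dtype=np.float64)
    for _ in range(levels):
        low, details = haar_dwt2(low)
        pyramid.append(np.stack((low,) + details, axis=0))
    return pyramid


def cross_attention(spatial, freq):
    """Scaled dot-product attention without learned projections (assumption).

    spatial: (N, d) query tokens; freq: (M, d) key/value tokens.
    Returns the fused (N, d) tokens and the (N, M) attention weights.
    """
    scores = spatial @ freq.T / np.sqrt(spatial.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ freq, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((64, 64))          # stand-in for a grayscale face crop
    pyramid = wavelet_pyramid(image, levels=3)

    # Turn the coarsest level into tokens: one 4-dim token per spatial position.
    freq_tokens = pyramid[-1].reshape(4, -1).T        # (64, 4)
    spatial_tokens = rng.random(freq_tokens.shape)    # placeholder backbone features
    fused, weights = cross_attention(spatial_tokens, freq_tokens)
    print([level.shape for level in pyramid], fused.shape)
```

In this sketch each pyramid level halves the resolution, which is what would let wavelet features be injected at successive backbone stages with matching feature-map sizes. How the paper actually aligns and projects these features is not stated in the abstract.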

