Image Manipulation Detection by Multi-View Multi-Scale Supervision
This work addresses the detection of manipulated images for security and forensics applications. It is an incremental improvement that balances sensitivity to manipulations against specificity on authentic images.
The paper tackles the challenge of image manipulation detection by proposing MVSS-Net, which uses multi-view feature learning and multi-scale supervision to improve both sensitivity to manipulations and specificity to authentic images, achieving strong performance on five benchmark datasets.
The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data while remaining specific enough to prevent false alarms on authentic images. Current research emphasizes sensitivity and overlooks specificity. In this paper we address both aspects through multi-view feature learning and multi-scale supervision. By exploiting the noise distribution and the boundary artifacts surrounding tampered regions, the former aims to learn semantic-agnostic and thus more generalizable features. The latter allows us to learn from authentic images, which current methods based on semantic segmentation networks cannot easily take into account. We realize these ideas in a new network, which we term MVSS-Net. Extensive experiments on five benchmark sets support the viability of MVSS-Net for both pixel-level and image-level manipulation detection.
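To make the two quantities in the abstract concrete, the following minimal sketch computes image-level sensitivity and specificity from binary decisions. It is an illustration only, not the paper's evaluation code: the function name `sensitivity_specificity`, the label convention (1 = manipulated, 0 = authentic), and the toy data are all assumptions introduced here.

```python
def sensitivity_specificity(labels, preds):
    """Return (sensitivity, specificity) for binary image-level decisions.

    Assumed convention (not from the paper): 1 = manipulated, 0 = authentic.
    """
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # manipulated, flagged
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # manipulated, missed
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # authentic, passed
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # authentic, false alarm

    # Guard against empty classes so the sketch never divides by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data: three manipulated images followed by four authentic ones.
labels = [1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(labels, preds)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

A detector evaluated only on manipulated images exposes just the first number. Specificity becomes measurable only once authentic images are part of the test set, which is why the abstract treats them as a separate concern.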