CV Aug 10, 2023

TrainFors: A Large Benchmark Training Dataset for Image Manipulation Detection and Localization

Soumyaroop Nandi, Prem Natarajan, Wael Abd-Almageed

arXiv:2308.05264v16.87 citations h-index: 41

Originality: Synthesis-oriented

AI Analysis

This work addresses unfair comparisons in image manipulation detection and localization (IMDL) research by providing a consistent training benchmark, though the contribution is incremental because it standardizes existing practices.

The authors tackled the lack of a standardized training dataset for image manipulation detection and localization by proposing TrainFors, a benchmark dataset for various forgery types, and they trained state-of-the-art methods on it to report performance under similar conditions.

The evaluation datasets and metrics for image manipulation detection and localization (IMDL) research have been standardized, but the training dataset for the task is still nonstandard. Previous researchers have used differing, nonstandard datasets to train neural networks for detecting image forgeries and localizing pixel maps of manipulated regions. For a fair comparison, the training set, test set, and evaluation metrics should be consistent. Hence, comparing the existing methods may not be fair, as the results depend heavily on the training datasets as well as the model architecture. Moreover, none of the previous works release the synthetic training dataset used for the IMDL task. We propose a standardized benchmark training dataset for image splicing, copy-move forgery, removal forgery, and image enhancement forgery. Furthermore, we identify the problems with the existing IMDL datasets and propose the required modifications. We also train the state-of-the-art IMDL methods on our proposed TrainFors dataset for a fair evaluation and report the actual performance of these methods under similar conditions.
