CV Aug 30, 2021

BioFors: A Large Biomedical Image Forensics Dataset

Ekraam Sabir, Soumyaroop Nandi, Wael AbdAlmageed, Prem Natarajan

arXiv:2108.12961v17.315 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses image manipulation in biomedical research, a problem relevant to the academic and forensic communities, but its contribution is incremental because it primarily provides a new dataset.

The authors tackled the lack of benchmark datasets for biomedical image forensics by introducing BioFors, a dataset of 47,805 images from 1,031 research papers, and showed that existing algorithms perform poorly on it, highlighting the need for specialized research.

Research in media forensics has gained traction to combat the spread of misinformation. However, most of this research has been directed towards content generated on social media. Biomedical image forensics is a related problem, where manipulation or misuse of images reported in biomedical research documents is of serious concern. The problem has failed to gain momentum beyond an academic discussion due to an absence of benchmark datasets and standardized tasks. In this paper we present BioFors -- the first dataset for benchmarking common biomedical image manipulations. BioFors comprises 47,805 images extracted from 1,031 open-source research papers. Images in BioFors are divided into four categories -- Microscopy, Blot/Gel, FACS and Macroscopy. We also propose three tasks for forensic analysis -- external duplication detection, internal duplication detection and cut/sharp-transition detection. We benchmark BioFors on all tasks with suitable state-of-the-art algorithms. Our results and analysis show that existing algorithms developed on common computer vision datasets are not robust when applied to biomedical images, validating that more research is required to address the unique challenges of biomedical image forensics.
