An Analytical Workflow for Clustering Forensic Images
This work is an incremental improvement aimed at forensic researchers who need to organize image datasets.
The paper tackles the problem of curating large collections of forensic images by presenting an unsupervised clustering workflow that uses deep features and domain data, achieving a purity of 89% in manual evaluation.
Large collections of images, if curated, contribute substantially to the quality of research in many domains. Unsupervised clustering is an intuitive yet effective step towards curating such datasets. In this work, we present a workflow for unsupervised clustering of a large collection of forensic images. The workflow applies classic clustering to deep feature representations of the images, combined with domain-related data, to group them. Our manual evaluation shows a purity of 89% for the resulting clusters.
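To make the reported metric concrete, the following is a minimal sketch of how cluster purity is commonly computed: each cluster is credited with its most frequent label, and the credits are summed and divided by the number of images. The function name `cluster_purity`, the toy cluster assignments, and the label names are illustrative assumptions; the abstract does not specify how the manual evaluation was carried out in detail.

```python
from collections import Counter, defaultdict


def cluster_purity(cluster_ids, true_labels):
    """Return the purity of a clustering under the standard definition.

    Purity = (1 / N) * sum over clusters of the size of the largest
    single-label group inside that cluster.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if len(cluster_ids) == 0:
        raise ValueError("purity is undefined for an empty clustering")

    # Count how often each label occurs inside each cluster.
    label_counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, true_labels):
        label_counts[cluster][label] += 1

    # Credit each cluster with its majority label count.
    majority_total = sum(max(counts.values()) for counts in label_counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: 8 images, 3 clusters, labels assigned by a human rater.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
labels = ["shoe", "shoe", "tire", "tire", "tire", "tire", "glass", "shoe"]

print(cluster_purity(clusters, labels))
```

In the toy data the majority counts are 2, 3, and 1, so the purity is 6/8. Purity tends to rise as the number of clusters grows, reaching 1.0 when every image is its own cluster, so it is best read alongside the cluster count.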