Automated Artefact Relevancy Determination from Artefact Metadata and Associated Timeline Events
This system aims to aid law enforcement investigators in the discovery and prioritisation of digital evidence, as an incremental improvement to existing forensic workflows.
This paper addresses the problem of digital forensic evidence backlogs by proposing an automated system for determining file artefact relevancy. It classifies newly discovered files based on previously encountered pertinent files, generating a relevancy score using filesystem metadata and associated timeline events.
Case-hindering, multi-year digital forensic evidence backlogs have become commonplace in law enforcement agencies throughout the world. This is due to an ever-growing number of cases requiring digital forensic investigation, coupled with the growing volume of data to be processed per case. Previously processed digital forensic cases, together with the relevancy classifications of their component artefacts, provide an opportunity to train automated, artificial-intelligence-based evidence processing systems. These can significantly aid investigators in the discovery and prioritisation of evidence. This paper presents one approach to file artefact relevancy determination, building on the growing trend towards a centralised Digital Forensics as a Service (DFaaS) paradigm. This approach enables the use of previously encountered pertinent files to classify newly discovered files in an investigation. Trained models can aid in the detection of these files during the acquisition stage, i.e., during their upload to a DFaaS system. The technique generates a relevancy score for file similarity using each artefact's filesystem metadata and associated timeline events. The approach presented is validated against three experimental usage scenarios.
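To make the idea of a metadata-based relevancy score concrete, the following is a minimal sketch only, not the trained model described in the paper. It assumes a hypothetical `Artefact` record holding a path, a size, and a tuple of timeline event labels, and it substitutes a plain Jaccard similarity over derived features for the learned classifier. All names, feature choices, and the size bucketing are illustrative assumptions.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Artefact:
    """Hypothetical record of a file's filesystem metadata and timeline events."""
    path: str
    size_bytes: int
    timeline_events: tuple[str, ...] = ()


def features(artefact: Artefact) -> set[str]:
    """Derive a set of categorical features from metadata (assumed feature set)."""
    p = PurePosixPath(artefact.path)
    feats = {
        f"ext:{p.suffix.lower()}",
        # Coarse power-of-two size bucket, so similar sizes share a feature.
        f"size_bucket:{artefact.size_bytes.bit_length()}",
    }
    feats.update(f"dir:{part.lower()}" for part in p.parts[:-1])
    feats.update(f"event:{event}" for event in artefact.timeline_events)
    return feats


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two feature sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevancy_score(candidate: Artefact, known_relevant: list[Artefact]) -> float:
    """Score a new file by its best match against previously pertinent files."""
    cand = features(candidate)
    return max((jaccard(cand, features(k)) for k in known_relevant), default=0.0)
```

In this sketch, a file uploaded during acquisition would be scored against the pool of files marked pertinent in earlier cases, and higher-scoring files would be surfaced to the investigator first. A trained model, as the paper proposes, would replace the fixed similarity measure with one learned from prior relevancy classifications.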