CR · AI · Feb 22

Evaluating the Reliability of Digital Forensic Evidence Discovered by Large Language Model: A Case Study

arXiv:2602.20202v1 · COMPSAC
Originality: Incremental advance
AI Analysis

This paper addresses reliability concerns for forensic investigators who use AI, though the advance is incremental: it builds on existing methods such as knowledge graphs and LLMs.

This paper tackles the problem of unreliable AI-identified digital evidence in forensic investigations by proposing a structured framework that automates artifact extraction, refines data using LLMs, and validates results with a knowledge graph. Evaluated on a 13 GB dataset, it achieved over 95% accuracy in artifact extraction and supported chain-of-custody adherence.

The growing reliance on AI-identified digital evidence raises significant concerns about its reliability, particularly as large language models (LLMs) are increasingly integrated into forensic investigations. This paper proposes a structured framework that automates forensic artifact extraction, refines data through LLM-driven analysis, and validates results using a Digital Forensic Knowledge Graph (DFKG). Evaluated on a 13 GB forensic image dataset containing 61 applications, 2,864 databases, and 5,870 tables, the framework ensures artifact traceability and evidentiary consistency through deterministic Unique Identifiers (UIDs) and forensic cross-referencing. We propose this methodology to address challenges in ensuring the credibility and forensic integrity of AI-identified evidence, reducing classification errors, and advancing scalable, auditable methodologies. A comprehensive case study on this dataset demonstrates the framework's effectiveness, achieving over 95 percent accuracy in artifact extraction, strong support of chain-of-custody adherence, and robust contextual consistency in forensic relationships. Key results validate the framework's ability to enhance reliability, reduce errors, and establish a legally sound paradigm for AI-assisted digital forensics.
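The abstract does not say how the deterministic UIDs are computed or how cross-referencing is checked. As a minimal sketch of the general idea, the example below assumes a UID is a SHA-256 hash over an artifact's provenance fields (application, database, table, row key), and that cross-referencing means every relationship in the graph must point at a known UID. The function names, the field choice, and the sample values are all hypothetical and are not taken from the paper.

```python
import hashlib


def artifact_uid(app: str, database: str, table: str, row_key: str) -> str:
    """Derive a deterministic UID from provenance fields (assumed scheme)."""
    parts = [app, database, table, row_key]
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    canonical = "".join(f"{len(p.encode('utf-8'))}:{p}" for p in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def dangling_references(nodes, edges):
    """Return relationships whose endpoints are not known artifact UIDs."""
    return [(src, dst) for src, dst in edges if src not in nodes or dst not in nodes]


# Hypothetical artifacts; the same inputs always yield the same UID.
msg = artifact_uid("com.example.chat", "messages.db", "message", "42")
contact = artifact_uid("com.example.chat", "contacts.db", "contact", "7")

nodes = {msg, contact}
edges = [(msg, contact), (msg, "unknown-uid")]

# The second edge is reported because its target is not a known artifact.
print(dangling_references(nodes, edges))
```

Because the UID depends only on the artifact's provenance, re-running extraction on the same image reproduces the same identifiers, which is what makes an audit trail repeatable under these assumptions.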
