AI · LG · Mar 22

Explainable AML Triage with LLMs: Evidence Retrieval and Counterfactual Checks

arXiv:2604.19755 · 48.5

Predicted impact: top 74% in AI · last 90 days · Originality: Incremental advance

AI Analysis

The paper addresses the need for practical, compliant decision support in AML triage for financial investigators, though it appears incremental because it builds on existing LLM and retrieval-augmented generation (RAG) techniques.

The paper tackles the problem of anti-money laundering (AML) transaction monitoring, where large volumes of alerts must be triaged under strict audit constraints, by proposing an explainable framework that combines evidence retrieval, structured LLM outputs, and counterfactual checks. The results show improved auditability and performance, with metrics such as PR-AUC of 0.75 and citation validity of 0.98.

Anti-money laundering (AML) transaction monitoring generates large volumes of alerts that must be rapidly triaged by investigators under strict audit and governance constraints. While large language models (LLMs) can summarize heterogeneous evidence and draft rationales, unconstrained generation is risky in regulated workflows due to hallucinations, weak provenance, and explanations that are not faithful to the underlying decision. We propose an explainable AML triage framework that treats triage as an evidence-constrained decision process. Our method combines (i) retrieval-augmented evidence bundling from policy/typology guidance, customer context, alert triggers, and transaction subgraphs, (ii) a structured LLM output contract that requires explicit citations and separates supporting from contradicting or missing evidence, and (iii) counterfactual checks that validate whether minimal, plausible perturbations lead to coherent changes in both the triage recommendation and its rationale. We evaluate on public synthetic AML benchmarks and simulators and compare against rules, tabular and graph machine-learning baselines, and LLM-only/RAG-only variants. Results show that evidence grounding substantially improves auditability and reduces numerical and policy hallucination errors, while counterfactual validation further increases decision-linked explainability and robustness, yielding the best overall triage performance (PR-AUC 0.75; Escalate F1 0.62) and strong provenance and faithfulness metrics (citation validity 0.98; evidence support 0.88; counterfactual faithfulness 0.76). These findings indicate that governed, verifiable LLM systems can provide practical decision support for AML triage without sacrificing compliance requirements for traceability and defensibility.
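
The structured output contract and the two checks described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the `TriageOutput` fields, the evidence IDs, the `citation_validity` definition (the share of cited IDs that exist in the retrieved bundle), and the coherence rule in `counterfactual_coherent` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TriageOutput:
    """Hypothetical output contract: a decision plus explicitly cited evidence IDs."""
    recommendation: str                                 # e.g. "escalate" or "close"
    supporting: list = field(default_factory=list)      # evidence IDs cited in support
    contradicting: list = field(default_factory=list)   # evidence IDs cited against
    missing: list = field(default_factory=list)         # notes on evidence that is absent


def citation_validity(output, bundle):
    """Fraction of cited evidence IDs that actually exist in the retrieved bundle."""
    cited = output.supporting + output.contradicting
    if not cited:
        return 0.0  # assumption: a rationale with no citations counts as invalid
    return sum(1 for eid in cited if eid in bundle) / len(cited)


def counterfactual_coherent(original, perturbed, removed_id):
    """Assumed coherence rule for a perturbation that removes one evidence item."""
    cited_after = perturbed.supporting + perturbed.contradicting
    if removed_id in cited_after:
        return False  # the rationale still cites evidence that was removed
    decision_changed = original.recommendation != perturbed.recommendation
    rationale_changed = set(original.supporting) != set(perturbed.supporting)
    # A changed decision should be accompanied by a changed supporting set.
    return rationale_changed or not decision_changed


# Illustrative evidence bundle; IDs and contents are invented for this example.
bundle = {
    "policy:structuring": "Typology guidance on structuring",
    "txn:1042": "Cash deposit just under the reporting threshold",
    "kyc:occupation": "Customer runs a cash-intensive business",
}
out = TriageOutput(
    "escalate",
    supporting=["policy:structuring", "txn:1042"],
    contradicting=["kyc:occupation"],
)
bad = TriageOutput("escalate", supporting=["txn:9999", "txn:1042"])

# Counterfactual: remove txn:1042 from the bundle and re-run triage (outputs mocked here).
perturbed_ok = TriageOutput("close", contradicting=["kyc:occupation"])
perturbed_stale = TriageOutput("close", supporting=["policy:structuring", "txn:1042"])

print(citation_validity(out, bundle))
print(citation_validity(bad, bundle))
print(counterfactual_coherent(out, perturbed_ok, "txn:1042"))
print(counterfactual_coherent(out, perturbed_stale, "txn:1042"))
```

In this sketch the contract keeps supporting and contradicting citations in separate fields so that each can be checked against the bundle independently. A real system would likely apply richer perturbations (for example, changing a transaction amount) and compare rationale text as well as citation sets.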
