CR | AI | CL | Jan 30

FraudShield: Knowledge Graph Empowered Defense for LLMs against Fraud Attacks

arXiv:2601.22485v1 | 3 citations | h-index: 16
Originality: Incremental advance
AI Analysis

This paper addresses fraud attacks on LLMs in applications such as contract review and job application processing. It offers an incremental improvement over prior defenses, with better interpretability and generalizability.

The paper tackles the vulnerability of LLMs to fraudulent information in critical workflows. It introduces FraudShield, a framework that uses a fraud tactic-keyword knowledge graph to harden the model's responses. FraudShield consistently outperforms state-of-the-art defenses across multiple LLMs and fraud types.

Large language models (LLMs) have been widely integrated into critical automated workflows, including contract review and job application processes. However, LLMs are susceptible to manipulation by fraudulent information, which can lead to harmful outcomes. Although advanced defense methods have been developed to address this issue, they often exhibit limitations in effectiveness, interpretability, and generalizability, particularly when applied to LLM-based applications. To address these challenges, we introduce FraudShield, a novel framework designed to protect LLMs from fraudulent content by leveraging a comprehensive analysis of fraud tactics. Specifically, FraudShield constructs and refines a fraud tactic-keyword knowledge graph to capture high-confidence associations between suspicious text and fraud techniques. The structured knowledge graph augments the original input by highlighting keywords and providing supporting evidence, guiding the LLM toward more secure responses. Extensive experiments show that FraudShield consistently outperforms state-of-the-art defenses across four mainstream LLMs and five representative fraud types, while also offering interpretable clues for the model's generations.
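
The augmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the graph contents, the confidence threshold, the `[[...]]` highlight markers, and the names `Edge`, `TOY_GRAPH`, `refine`, and `augment` are all hypothetical, chosen only to show how a tactic-keyword graph might be filtered to high-confidence edges and then used to highlight keywords and attach supporting evidence to an input.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One keyword-to-tactic association with a confidence score (hypothetical schema)."""
    keyword: str
    tactic: str
    confidence: float


# Toy, hand-written graph for illustration only; the paper builds its graph from data.
TOY_GRAPH = [
    Edge("wire transfer", "urgent payment pressure", 0.90),
    Edge("act now", "urgent payment pressure", 0.80),
    Edge("guaranteed return", "unrealistic promise", 0.85),
    Edge("processing fee", "advance-fee request", 0.40),
]


def refine(graph, min_confidence=0.5):
    """Keep only high-confidence edges (the threshold is an assumed value)."""
    return [e for e in graph if e.confidence >= min_confidence]


def augment(text, graph):
    """Highlight matched keywords and append the matching edges as evidence."""
    highlighted = text
    hits = []
    for edge in graph:
        pattern = re.compile(re.escape(edge.keyword), re.IGNORECASE)
        if pattern.search(text):
            # Wrap each occurrence in [[...]] so the LLM can see what was flagged.
            highlighted = pattern.sub(lambda m: f"[[{m.group(0)}]]", highlighted)
            hits.append(edge)
    if not hits:
        return text  # nothing suspicious matched; pass the input through unchanged
    evidence = [
        f'- "{e.keyword}" is associated with tactic "{e.tactic}" '
        f"(confidence {e.confidence:.2f})"
        for e in hits
    ]
    return highlighted + "\n\nPossible fraud indicators:\n" + "\n".join(evidence)


if __name__ == "__main__":
    graph = refine(TOY_GRAPH)
    print(augment("Please act now and send a wire transfer plus a processing fee.", graph))
```

In this sketch the augmented text, rather than the raw input, would be passed to the LLM, and the evidence lines double as the interpretable clues the abstract mentions. Plain substring matching is a simplification; how the paper actually extracts keywords and scores associations is not specified here.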

