CL | AI | Jun 16, 2024

KGPA: Robustness Evaluation for Large Language Models via Cross-Domain Knowledge Graphs

arXiv:2406.10802v1 | 2 citations
Originality: Incremental advance
AI Analysis

The work addresses the need for more systematic and cost-effective robustness evaluation of LLMs, particularly in professional domains. The advance is incremental because it builds on existing adversarial attack methods.

The paper evaluates the robustness of large language models (LLMs) under adversarial attacks. It proposes a framework that generates prompts from knowledge graphs and then poisons them to create adversarial inputs. The authors find that robustness varies across models and is influenced by the professional domain, with GPT-4-turbo ranking highest within the ChatGPT family.

Existing frameworks for assessing robustness of large language models (LLMs) overly depend on specific benchmarks, increasing costs and failing to evaluate performance of LLMs in professional domains due to dataset limitations. This paper proposes a framework that systematically evaluates the robustness of LLMs under adversarial attack scenarios by leveraging knowledge graphs (KGs). Our framework generates original prompts from the triplets of knowledge graphs and creates adversarial prompts by poisoning, assessing the robustness of LLMs through the results of these adversarial attacks. We systematically evaluate the effectiveness of this framework and its modules. Experiments show that adversarial robustness of the ChatGPT family ranks as GPT-4-turbo > GPT-4o > GPT-3.5-turbo, and the robustness of large language models is influenced by the professional domains in which they operate.
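The pipeline described in the abstract (triplet to original prompt, poisoning to adversarial prompt, scoring from attack results) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and is not taken from the paper or its repository: the prompt template, the adjacent-character-swap perturbation, the `attack_success_rate` metric definition, and all function names are hypothetical. The `model` argument is any callable that maps a prompt string to an answer string.

```python
import random
from typing import Callable, List, Tuple

Triplet = Tuple[str, str, str]  # (subject, predicate, object)


def triplet_to_prompt(triplet: Triplet) -> str:
    """Turn a KG triplet into a true/false question (hypothetical template)."""
    subject, predicate, obj = triplet
    return f'Is the following statement true or false? "{subject} {predicate} {obj}."'


def poison_prompt(prompt: str, rng: random.Random) -> str:
    """Toy perturbation: swap two adjacent, differing inner characters of one word.

    This stands in for whatever poisoning method the framework actually uses.
    """
    words = prompt.split(" ")
    # Every (word index, char index) where an inner swap changes the word.
    swaps = [
        (i, j)
        for i, w in enumerate(words)
        if w.isalpha()
        for j in range(1, len(w) - 2)
        if w[j] != w[j + 1]
    ]
    if not swaps:
        return prompt  # nothing to perturb
    i, j = rng.choice(swaps)
    w = words[i]
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def attack_success_rate(
    model: Callable[[str], str],
    cases: List[Tuple[Triplet, str]],
    rng: random.Random,
) -> float:
    """Among prompts answered correctly, the fraction flipped by poisoning (assumed metric)."""
    attacked = 0
    succeeded = 0
    for triplet, label in cases:
        prompt = triplet_to_prompt(triplet)
        if model(prompt) != label:
            continue  # only attack prompts the model gets right
        attacked += 1
        if model(poison_prompt(prompt, rng)) != label:
            succeeded += 1
    return succeeded / attacked if attacked else 0.0
```

Under this assumed metric, a lower attack success rate indicates a more robust model. Swapping in triplets from a domain-specific knowledge graph would give a per-domain score, which is the kind of comparison the abstract reports.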

Code Implementations: 1 repo
