CR AI CY · May 7, 2025

A Proposal for Evaluating the Operational Risk for ChatBots based on Large Language Models

Pedro Pinacho-Davidson, Fernando Gutierrez, Pablo Zapata, Rodolfo Vergara, Pablo Aqueveque

arXiv:2505.04784v1 · 1 citation · h-index: 2 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses operational risk assessment for chatbots, which is crucial for organizations deploying AI-driven conversational systems, though it appears incremental as it builds on existing testing frameworks.

The authors address operational risk in chatbots based on large language models by proposing a risk-assessment metric that evaluates threats to organizations, users, and third parties. They validate the metric with an enhanced version of the Garak framework in a retrieval-augmented generation scenario and use the resulting scores to guide mitigation and improvements.

The emergence of Generative AI (Gen AI) and Large Language Models (LLMs) has enabled more advanced chatbots capable of human-like interactions. However, these conversational agents introduce a broader set of operational risks that extend beyond traditional cybersecurity considerations. In this work, we propose a novel, instrumented risk-assessment metric that simultaneously evaluates potential threats to three key stakeholders: the service-providing organization, end users, and third parties. Our approach incorporates the technical complexity required to induce erroneous behaviors in the chatbot--ranging from non-induced failures to advanced prompt-injection attacks--as well as contextual factors such as the target industry, user age range, and vulnerability severity. To validate our metric, we leverage Garak, an open-source framework for LLM vulnerability testing. We further enhance Garak to capture a variety of threat vectors (e.g., misinformation, code hallucinations, social engineering, and malicious code generation). Our methodology is demonstrated in a scenario involving chatbots that employ retrieval-augmented generation (RAG), showing how the aggregated risk scores guide both short-term mitigation and longer-term improvements in model design and deployment. The results underscore the importance of multi-dimensional risk assessments in operationalizing secure, reliable AI-driven conversational systems.
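The abstract does not state the metric's formula, so the following is only a minimal sketch of how a multi-stakeholder risk score of this general kind could be aggregated. Every name, scale, and weight in it (`Finding`, `COMPLEXITY_FACTOR`, `context_multiplier`, the 0 to 1 ranges, and the worst-case aggregation) is a hypothetical choice made for illustration, not the authors' definition and not part of Garak's API.

```python
from dataclasses import dataclass

# NOTE: illustrative assumptions only; the paper's actual metric is not
# reproduced here.

# The three stakeholders named in the abstract.
STAKEHOLDERS = ("organization", "users", "third_parties")

# Hypothetical ease-of-attack factor: a failure that needs less technical
# effort to trigger is weighted more heavily.
COMPLEXITY_FACTOR = {
    "non_induced": 1.0,                # chatbot fails without any attack
    "simple_prompt": 0.8,              # basic adversarial prompting
    "advanced_prompt_injection": 0.5,  # requires a sophisticated attack
}


@dataclass(frozen=True)
class Finding:
    """One observed failure mode, e.g. taken from a vulnerability scan."""
    threat: str          # e.g. "misinformation", "malicious_code"
    stakeholder: str     # one of STAKEHOLDERS
    severity: float      # assumed scale: 0.0 (harmless) to 1.0 (critical)
    failure_rate: float  # fraction of attempts that produced the failure
    complexity: str      # key into COMPLEXITY_FACTOR


def finding_score(finding: Finding, context_multiplier: float = 1.0) -> float:
    """Score a single finding, capped at 1.0.

    context_multiplier is a hypothetical stand-in for contextual factors
    such as target industry or user age range.
    """
    raw = (
        finding.severity
        * finding.failure_rate
        * COMPLEXITY_FACTOR[finding.complexity]
        * context_multiplier
    )
    return min(raw, 1.0)


def aggregate_risk(findings, context_multiplier: float = 1.0) -> dict:
    """Return the worst-case (maximum) score per stakeholder."""
    scores = {name: 0.0 for name in STAKEHOLDERS}
    for finding in findings:
        score = finding_score(finding, context_multiplier)
        scores[finding.stakeholder] = max(scores[finding.stakeholder], score)
    return scores


if __name__ == "__main__":
    example = [
        Finding("misinformation", "users", 0.8, 0.5, "non_induced"),
        Finding("malicious_code", "third_parties", 1.0, 0.2,
                "advanced_prompt_injection"),
    ]
    print(aggregate_risk(example))
```

Taking the maximum per stakeholder is one possible design choice: it surfaces the single worst exposure rather than averaging it away. A weighted sum would instead reward fixing many small issues.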
