Breaking the Loop: Detecting and Mitigating Denial-of-Service Vulnerabilities in Large Language Models
This work addresses latency vulnerabilities in LLMs deployed across many domains, but its contribution is incremental, since it builds on existing reliability improvements.
The paper tackles recurrent generation, which increases latency and exposes Large Language Models to Denial-of-Service vulnerabilities, and proposes a generator that finds such cases and a detector that reaches 95.24% accuracy.
Large Language Models (LLMs) have significantly advanced text understanding and generation, becoming integral to applications across education, software development, healthcare, entertainment, and legal services. Despite considerable progress in improving model reliability, latency remains under-explored, particularly latency caused by recurrent generation, where a model repeatedly produces similar or identical outputs, increasing response time and creating potential Denial-of-Service (DoS) vulnerabilities. We propose RecurrentGenerator, a black-box evolutionary algorithm that efficiently identifies recurrent generation scenarios in prominent LLMs such as Llama-3 and GPT-4o. Additionally, we introduce RecurrentDetector, a lightweight real-time classifier trained on activation patterns, which achieves 95.24% accuracy and an F1 score of 0.87 in detecting recurrent loops. Our methods provide practical solutions to mitigate latency-related vulnerabilities, and we publicly share our tools and data to support further research.
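To make the idea of a black-box evolutionary search for recurrent generation concrete, the following is a minimal sketch and not the paper's RecurrentGenerator. It assumes a hypothetical `toy_model` standing in for a real LLM, a hypothetical n-gram `repetition_score` as the fitness signal, and illustrative choices for population size, crossover, and mutation rate; none of these names or settings come from the source.

```python
import random
from typing import Callable, List, Sequence


def repetition_score(tokens: Sequence[str], n: int = 3) -> float:
    """Fraction of n-grams that duplicate an earlier n-gram (0.0 means no repeats).

    Hypothetical fitness signal, chosen for illustration only.
    """
    if len(tokens) < n:
        return 0.0
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return 1.0 - len(set(grams)) / len(grams)


def toy_model(prompt: List[str], max_tokens: int = 40) -> List[str]:
    """Hypothetical stand-in for an LLM queried as a black box.

    It falls into a loop whenever the prompt contains "echo" at least twice.
    """
    if prompt.count("echo") >= 2:
        return ["echo", "again"] * (max_tokens // 2)
    return [f"tok{i}" for i in range(8)]


def evolve_prompts(
    generate: Callable[[List[str]], List[str]],
    vocab: Sequence[str],
    pop_size: int = 20,
    prompt_len: int = 6,
    generations: int = 30,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> List[str]:
    """Search for a prompt whose output is highly repetitive, using only model outputs."""
    rng = random.Random(seed)
    population = [
        [rng.choice(vocab) for _ in range(prompt_len)] for _ in range(pop_size)
    ]

    def fitness(prompt: List[str]) -> float:
        return repetition_score(generate(prompt))

    for _ in range(generations):
        # Selection: keep the most repetitive half unchanged (elitism).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, prompt_len)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a token with a random one.
            child = [
                rng.choice(vocab) if rng.random() < mutation_rate else tok
                for tok in child
            ]
            children.append(child)
        population = parents + children

    return max(population, key=fitness)


if __name__ == "__main__":
    vocab = ["echo", "hello", "world", "sum", "list", "repeat"]
    best = evolve_prompts(toy_model, vocab)
    print(best, repetition_score(toy_model(best)))
```

Against a real model, each fitness evaluation would be a costly query, so a practical version would likely cache scores per prompt and cap generation length. Those details are left out here to keep the sketch short.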