CL · AI · Jun 28, 2024

AnomaLLMy -- Detecting anomalous tokens in black-box LLMs through low-confidence single-token predictions

arXiv:2406.19840v1 · 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses anomalous tokens, a concern for developers and researchers working on LLM robustness and tokenizer assessment. The contribution is incremental because it builds on existing anomaly detection methods.

The paper tackles anomalous tokens, which degrade the quality and reliability of black-box LLMs, by introducing AnomaLLMy, a method that uses low-confidence single-token predictions to detect irregularities. On cl100k_base, the token set of GPT-4, it identified 413 major and 65 minor anomalies at a cost of $24.39 in API credits.

This paper introduces AnomaLLMy, a novel technique for the automatic detection of anomalous tokens in black-box Large Language Models (LLMs) with API-only access. Utilizing low-confidence single-token predictions as a cost-effective indicator, AnomaLLMy identifies irregularities in model behavior, addressing the issue of anomalous tokens degrading the quality and reliability of models. Validated on the cl100k_base dataset, the token set of GPT-4, AnomaLLMy detected 413 major and 65 minor anomalies, demonstrating the method's efficiency with just $24.39 spent in API credits. The insights from this research are expected to be beneficial for enhancing the robustness and accuracy of LLMs, particularly in the development and assessment of tokenizers.
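The core idea in the abstract, treating a low-confidence single-token prediction as a signal that a token may be anomalous, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the prompt template, the 0.5 threshold, and the names `flag_low_confidence`, `top_probability`, and `fake_predict` are all assumptions introduced here. The `predict` callable stands in for an API call that returns log-probabilities for the next token.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical cutoff: the abstract does not state the threshold the authors use.
LOW_CONFIDENCE_THRESHOLD = 0.5

# A predictor maps a prompt to {candidate_token: log_probability} for the
# single next token. In practice it would wrap a black-box API call that
# exposes top log-probabilities; here it is an injected callable (assumption).
Predictor = Callable[[str], Dict[str, float]]


def top_probability(logprobs: Dict[str, float]) -> float:
    """Return the probability of the most likely next token (0.0 if empty)."""
    if not logprobs:
        return 0.0
    return math.exp(max(logprobs.values()))


def flag_low_confidence(
    tokens: List[str],
    predict: Predictor,
    threshold: float = LOW_CONFIDENCE_THRESHOLD,
) -> List[Tuple[str, float]]:
    """Return (token, confidence) pairs whose top prediction is below threshold."""
    flagged = []
    for token in tokens:
        # Hypothetical prompt template, chosen for illustration only.
        prompt = f"Repeat this string exactly: {token}"
        confidence = top_probability(predict(prompt))
        if confidence < threshold:
            flagged.append((token, confidence))
    return flagged


def fake_predict(prompt: str) -> Dict[str, float]:
    """Offline stand-in for a model API, so the sketch runs without network access."""
    if prompt.endswith("glitchy"):
        # Probability mass spread thinly: the model is unsure what comes next.
        return {"a": math.log(0.2), "b": math.log(0.15)}
    return {"hello": math.log(0.95), "x": math.log(0.01)}


if __name__ == "__main__":
    for token, confidence in flag_low_confidence(["hello", "glitchy"], fake_predict):
        print(f"{token!r}: top-token probability {confidence:.2f}")
```

Each candidate token costs one single-token completion in this sketch, which is consistent with the abstract's emphasis on low API cost. How the paper separates major from minor anomalies is not described in the abstract, so the sketch uses a single cutoff.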

Code Implementations: 1 repo
