CAP: Data Contamination Detection via Consistency Amplification
CAP addresses the reliability of LLM evaluations, a concern for both researchers and practitioners, though its contribution is incremental because it builds on existing contamination detection methods.
The paper tackles data contamination in large language model evaluations by proposing CAP, a framework that uses consistency amplification to detect dataset leakage. It validates the framework on seven LLMs and four benchmarks, and finds that composite benchmarks are especially prone to contamination.
Large language models (LLMs) are widely used, but concerns about data contamination challenge the reliability of LLM evaluations. Existing contamination detection methods are often task-specific or require extra prerequisites, limiting their practicality. We propose a novel framework, Consistency Amplification-based Data Contamination Detection (CAP), which introduces the Performance Consistency Ratio (PCR) to measure dataset leakage by leveraging LM consistency. To the best of our knowledge, this is the first method to explicitly differentiate between fine-tuning and contamination, which is crucial for detecting contamination in domain-specific models. Additionally, CAP is applicable to various benchmarks and works for both white-box and black-box models. We validate CAP's effectiveness through experiments on seven LLMs and four domain-specific benchmarks. Our findings also show that composite benchmarks assembled from multiple dataset sources are particularly prone to unintentional contamination. Code will be made publicly available soon.