Tansheng Zhu

h-index3
2papers

2 Papers

CLNov 10, 2025Code
When Bias Pretends to Be Truth: How Spurious Correlations Undermine Hallucination Detection in LLMs

Shaowen Wang, Yiqi Dong, Ruinian Chang et al.

Despite substantial advances, large language models (LLMs) continue to exhibit hallucinations, generating plausible yet incorrect responses. In this paper, we highlight a critical yet previously underexplored class of hallucinations driven by spurious correlations -- superficial but statistically prominent associations between features (e.g., surnames) and attributes (e.g., nationality) present in the training data. We demonstrate that these spurious correlations induce hallucinations that are confidently generated, immune to model scaling, evade current detection methods, and persist even after refusal fine-tuning. Through systematically controlled synthetic experiments and empirical evaluations on state-of-the-art open-source and proprietary LLMs (including GPT-5), we show that existing hallucination detection methods, such as confidence-based filtering and inner-state probing, fundamentally fail in the presence of spurious correlations. Our theoretical analysis further elucidates why these statistical biases intrinsically undermine confidence-based detection techniques. Our findings thus emphasize the urgent need for new approaches explicitly designed to address hallucinations caused by spurious correlations.

OCFeb 10, 2025
Bayesian Optimization by Kernel Regression and Density-based Exploration

Tansheng Zhu, Hongyu Zhou, Ke Jin et al.

Bayesian optimization is highly effective for optimizing expensive-to-evaluate black-box functions, but it faces significant computational challenges due to the high computational complexity of Gaussian processes, which results in a total time complexity that is quartic with respect to the number of iterations. To address this limitation, we propose the Bayesian Optimization by Kernel regression and density-based Exploration (BOKE) algorithm. BOKE uses kernel regression for efficient function approximation, kernel density for exploration, and integrates them into the confidence bound criteria to guide the optimization process, thus reducing computational costs to quadratic. Our theoretical analysis rigorously establishes the global convergence of BOKE and ensures its robustness in noisy settings. Through extensive numerical experiments on both synthetic and real-world optimization tasks, we demonstrate that BOKE not only performs competitively compared to Gaussian process-based methods and several other baseline methods but also exhibits superior computational efficiency. These results highlight BOKE's effectiveness in resource-constrained environments, providing a practical approach for optimization problems in engineering applications.