Jin Wen

h-index: 44
2 papers

2 Papers

SE, Apr 14, 2024
Evaluation and Improvement of Fault Detection for Large Language Models

Qiang Hu, Jin Wen, Maxime Cordy et al.

Large language models (LLMs) have recently achieved significant success across various application domains and have drawn substantial attention from different communities. Unfortunately, even the best LLMs still exhibit many faults, that is, inputs on which the model fails to predict correctly. Such faults harm the usability of LLMs in general and could introduce safety issues in reliability-critical systems such as autonomous driving systems. Quickly revealing these faults in the real-world datasets an LLM may face is important but challenging, mainly because ground truth is required and labeling data is costly in time and human effort. To address this problem, the conventional deep learning testing field has proposed test selection methods that evaluate deep learning models efficiently by prioritizing faults. However, despite their importance, the usefulness of these methods on LLMs is unclear and underexplored. In this paper, we conduct the first empirical study of the effectiveness of existing fault detection methods for LLMs. Experimental results on four different tasks (including both code tasks and natural language processing tasks) and four LLMs (e.g., LLaMA3 and GPT4) demonstrate that simple methods such as Margin perform well on LLMs, but there is still considerable room for improvement. Based on the study, we further propose MuCS, a prompt Mutation-based prediction Confidence Smoothing framework that boosts the fault detection capability of existing methods. Concretely, multiple prompt mutation techniques are proposed to help collect more diverse outputs for confidence smoothing. The results show that the proposed framework significantly enhances existing methods, improving test relative coverage by up to 70.53%.
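
As a rough illustration of the ideas named in this abstract, the sketch below ranks unlabeled inputs by a Margin score (the gap between the two largest predicted class probabilities) after averaging the probability vectors obtained from the original prompt and its mutated variants. The averaging rule and the function names `margin_score`, `smooth_confidence`, and `prioritize` are assumptions made for illustration; the abstract does not specify how MuCS performs the smoothing.

```python
from typing import List, Sequence


def margin_score(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = low confidence)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def smooth_confidence(prob_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the probability vectors from the original and mutated prompts.

    Plain averaging is an assumption here, not the documented MuCS rule.
    """
    n = len(prob_vectors)
    return [sum(column) / n for column in zip(*prob_vectors)]


def prioritize(samples: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Return sample indices ordered from most to least suspicious."""
    scores = [margin_score(smooth_confidence(vectors)) for vectors in samples]
    # Smallest margin first: these inputs are the ones to label first.
    return sorted(range(len(samples)), key=lambda i: scores[i])


if __name__ == "__main__":
    # Each sample holds one probability vector per prompt variant (made-up values).
    samples = [
        [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]],
        [[0.5, 0.4, 0.1], [0.3, 0.6, 0.1]],
        [[0.6, 0.3, 0.1], [0.7, 0.2, 0.1]],
    ]
    print(prioritize(samples))
```

In this toy input the second sample flips its top class between prompt variants, so its smoothed margin is the smallest and it is ranked first for labeling.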

NA, Sep 30, 2015
An inexact Picard iteration method for absolute value equation

Shu-Xin Miao, Xiang-Tuan Xiong, Jin Wen

Recently, a class of inexact Picard iteration methods for solving the absolute value equation $Ax - |x| = b$ has been proposed in [Optim Lett 8:2191-2202, 2014]. To further improve the performance of the Picard iteration method, a new inexact Picard iteration method is proposed for solving the absolute value equation. Sufficient conditions for the convergence of the proposed method are given. Numerical experiments demonstrate the effectiveness of the new method.
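
For readers unfamiliar with the setting, the sketch below shows the general shape of an inexact Picard iteration for $Ax - |x| = b$: each outer step freezes $|x^k|$ on the right-hand side and solves the linear system $Ax^{k+1} = |x^k| + b$ only approximately. The inner solver used here (a few Jacobi sweeps) is a stand-in chosen for brevity, not the inner solver of this paper or of the cited reference, and the names `jacobi_sweeps`, `residual_norm`, and `inexact_picard` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def jacobi_sweeps(A: Matrix, rhs: Vector, x0: Vector, sweeps: int) -> Vector:
    """Approximately solve A x = rhs with a fixed number of Jacobi sweeps."""
    n = len(A)
    x = list(x0)
    for _ in range(sweeps):
        x = [
            (rhs[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
    return x


def residual_norm(A: Matrix, x: Vector, b: Vector) -> float:
    """Infinity norm of the residual A x - |x| - b."""
    n = len(A)
    return max(
        abs(sum(A[i][j] * x[j] for j in range(n)) - abs(x[i]) - b[i])
        for i in range(n)
    )


def inexact_picard(A: Matrix, b: Vector, inner_sweeps: int = 3,
                   tol: float = 1e-10, max_outer: int = 200) -> Vector:
    """Outer Picard loop with an inexact inner linear solve."""
    x = [0.0] * len(b)
    for _ in range(max_outer):
        if residual_norm(A, x, b) < tol:
            break
        # Freeze |x| on the right-hand side, then solve A x_new = |x| + b roughly.
        rhs = [abs(xi) + bi for xi, bi in zip(x, b)]
        x = jacobi_sweeps(A, rhs, x, inner_sweeps)
    return x


if __name__ == "__main__":
    # Small diagonally dominant test problem with known solution [1, -2, 3].
    A = [[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]]
    b = [1.0, -6.0, 7.0]  # b = A x* - |x*| for x* = [1, -2, 3]
    print(inexact_picard(A, b))
```

The test matrix is strictly diagonally dominant with large diagonal entries so that both the inner Jacobi sweeps and the outer iteration contract; for a general $A$ neither is guaranteed.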