ML · Feb 9
A Statistical Framework for Alignment with Biased AI Feedback
Xintao Xia, Zhiqiu Xia, Linjun Zhang et al.
Modern alignment pipelines are increasingly replacing expensive human preference labels with evaluations from large language models (LLM-as-Judge). However, AI labels can be systematically biased compared to high-quality human feedback datasets. In this paper, we develop two debiased alignment methods within a general framework that accommodates heterogeneous prompt-response distributions and external human feedback sources. Debiased Direct Preference Optimization (DDPO) augments standard DPO with a residual-based correction and density-ratio reweighting to mitigate systematic bias, while retaining DPO's computational efficiency. Debiased Identity Preference Optimization (DIPO) directly estimates human preference probabilities without imposing a parametric reward model. We provide theoretical guarantees for both methods: DDPO offers a practical and computationally efficient solution for large-scale alignment, whereas DIPO serves as a robust, statistically optimal alternative that attains the semiparametric efficiency bound. Empirical studies on sentiment generation, summarization, and single-turn dialogue demonstrate that the proposed methods substantially improve alignment efficiency and recover performance close to that of an oracle trained on fully human-labeled data.
CL · Feb 28, 2025
A Survey of Uncertainty Estimation Methods on Large Language Models
Zhiqiu Xia, Jinxuan Xu, Yuqian Zhang et al.
Large language models (LLMs) have demonstrated remarkable capabilities across various tasks. However, these models can offer biased, hallucinated, or non-factual responses camouflaged by their fluency and realistic appearance. Uncertainty estimation is the key method to address this challenge. While research efforts in uncertainty estimation are ramping up, there is a lack of comprehensive and dedicated surveys on LLM uncertainty estimation. This survey presents four major avenues of LLM uncertainty estimation. Furthermore, we perform extensive experimental evaluations across multiple methods and datasets. Finally, we provide critical and promising future directions for LLM uncertainty estimation.
CV · Apr 24
Breaking Watermarks in the Frequency Domain: A Modulated Diffusion Attack Framework
Chunpeng Wang, Binyan Qu, Xiaoyu Wang et al.
Digital image watermarking has advanced rapidly for copyright protection of generative AI, yet the comparatively limited progress in watermark attack techniques has broken the attack-defense balance and hindered further advances in the field. In this paper, we propose FMDiffWA, a frequency-domain modulated diffusion framework for watermark attacks. Specifically, we introduce a frequency-domain watermark modulation (FWM) module and incorporate it into the sampling stages of both the forward and reverse diffusion processes. This mechanism enables selective modulation of watermark-related frequency components, thereby allowing FMDiffWA to effectively neutralize the invisible watermark signals while preserving the perceptual quality of the attacked watermarked images. To achieve a better trade-off between attack efficacy and visual fidelity, we reformulate the training strategy of conventional diffusion models by augmenting the canonical noise estimation objective with an auxiliary refinement constraint. Comprehensive experiments demonstrate that FMDiffWA achieves superior visual fidelity compared to existing watermark attacks, while exhibiting strong generalization across diverse watermarking schemes.
DL · Apr 11, 2025
Analyzing 16,193 LLM Papers for Fun and Profits
Zhiqiu Xia, Lang Zhu, Bingzhe Li et al.
Large Language Models (LLMs) are reshaping the landscape of computer science research, driving significant shifts in research priorities across diverse conferences and fields. This study provides a comprehensive analysis of publication trends in LLM-related papers at 77 top-tier computer science conferences over the past six years (2019-2024). We approach this analysis from four distinct perspectives: (1) We investigate how LLM research is driving topic shifts within major conferences. (2) We adopt a topic modeling approach to identify various areas of LLM-related topic growth and reveal the topics of concern at different conferences. (3) We explore distinct contribution patterns of academic and industrial institutions. (4) We study the influence of national origins on LLM development trajectories. Synthesizing the findings from these diverse analytical angles, we derive ten key insights that illuminate the dynamics and evolution of the LLM research ecosystem.
CV · May 6, 2024
Elevator, Escalator, or Neither? Classifying Conveyor State Using Smartphone under Arbitrary Pedestrian Behavior
Tianlang He, Zhiqiu Xia, S.-H. Gary Chan
Knowing a pedestrian's conveyor state of "elevator," "escalator," or "neither" is fundamental to many applications such as indoor navigation and people flow management. Previous studies on classifying the conveyor state often rely on specially designed body-worn sensors or make strong assumptions about pedestrian behaviors, which greatly limits their deployability. To overcome this, we study the classification problem under arbitrary pedestrian behaviors using the inertial navigation system (INS) of commonly available smartphones (including accelerometer, gyroscope, and magnetometer). This problem is challenging because the INS signals of the conveyor states are entangled with arbitrary and diverse pedestrian behaviors. We propose ELESON, a novel and lightweight deep-learning approach that uses phone INS to classify a pedestrian's conveyor state as elevator, escalator, or neither. Using causal decomposition and adversarial learning, ELESON extracts the motion and magnetic features of conveyor state independent of pedestrian behavior, based on which it estimates the state confidence by means of an evidential classifier. We curate a large and diverse dataset with 36,420 instances of pedestrians randomly taking elevators and escalators under arbitrary unknown behaviors. Our extensive experiments show that ELESON is robust against pedestrian behavior, achieving a high accuracy of over 0.9 in F1 score, strong confidence discriminability of 0.81 in AUROC (Area Under the Receiver Operating Characteristic curve), and low computational and memory requirements fit for common smartphone deployment.