CLDec 10, 2025
Targeting Misalignment: A Conflict-Aware Framework for Reward-Model-based LLM AlignmentZixuan Liu, Siavash H. Khajavi, Guangkai Jiang et al.
Reward-model-based fine-tuning is a central paradigm in aligning Large Language Models with human preferences. However, such approaches critically rely on the assumption that proxy reward models accurately reflect intended supervision, a condition often violated due to annotation noise, bias, or limited coverage. This misalignment can lead to undesirable behaviors, where models optimize for flawed signals rather than true human values. In this paper, we investigate a novel framework to identify and mitigate such misalignment by treating the fine-tuning process as a form of knowledge integration. We focus on detecting instances of proxy-policy conflicts, cases where the base model strongly disagrees with the proxy. We argue that such conflicts often signify areas of shared ignorance, where neither the policy nor the reward model possesses sufficient knowledge, making them especially susceptible to misalignment. To this end, we propose two complementary metrics for identifying these conflicts: a localized Proxy-Policy Alignment Conflict Score (PACS) and a global Kendall-Tau Distance measure. Building on this insight, we design an algorithm named Selective Human-in-the-loop Feedback via Conflict-Aware Sampling (SHF-CAS) that targets high-conflict QA pairs for additional feedback, refining both the reward model and policy efficiently. Experiments on two alignment tasks demonstrate that our approach enhances general alignment performance, even when trained with a biased proxy reward. Our work provides a new lens for interpreting alignment failures and offers a principled pathway for targeted refinement in LLM training.
MLNov 10, 2023
A statistical perspective on algorithm unrolling models for inverse problemsYves Atchade, Xinru Liu, Qiuyun Zhu
We consider inverse problems where the conditional distribution of the observation ${\bf y}$ given the latent variable of interest ${\bf x}$ (also known as the forward model) is known, and we have access to a data set in which multiple instances of ${\bf x}$ and ${\bf y}$ are both observed. In this context, algorithm unrolling has become a very popular approach for designing state-of-the-art deep neural network architectures that effectively exploit the forward model. We analyze the statistical complexity of the gradient descent network (GDN), an algorithm unrolling architecture driven by proximal gradient descent. We show that the unrolling depth needed for the optimal statistical performance of GDNs is of order $\log(n)/\log(\varrho_n^{-1})$, where $n$ is the sample size, and $\varrho_n$ is the convergence rate of the corresponding gradient descent algorithm. We also show that when the negative log-density of the latent variable ${\bf x}$ has a simple proximal operator, then a GDN unrolled at depth $D'$ can solve the inverse problem at the parametric rate $O(D'/\sqrt{n})$. Our results thus also suggest that algorithm unrolling models are prone to overfitting as the unrolling depth $D'$ increases. We provide several examples to illustrate these results.
CVFeb 6, 2024
Deep Frequency-Aware Functional Maps for Robust Shape MatchingFeifan Luo, Qinsong Li, Ling Hu et al.
Deep functional map frameworks are widely employed for 3D shape matching. However, most existing deep functional map methods cannot adaptively capture important frequency information for functional map estimation in specific matching scenarios, i.e., lacking \textit{frequency awareness}, resulting in poor performance when dealing with large deformable shape matching. To this end, we propose a novel unsupervised learning-based framework called Deep Frequency-Aware Functional Maps, which can gracefully cope with various shape matching scenarios. We first introduce a general constraint called Spectral Filter Operator Preservation to compute desirable functional maps, where the spectral filter operator encodes informative frequency information and can promote frequency awareness for deep functional map frameworks by learning a set of filter functions. Then, we directly utilize the proposed constraint as a loss function to supervise functional maps, pointwise maps, and filter functions simultaneously, where the filter functions are derived from the orthonormal Jacobi basis, and the coefficients of the basis are learnable parameters. Finally, we develop an effective refinement strategy to improve the final pointwise map, which incorporates our constraint and learned filter functions, leading to more robust and accurate correspondences during the inference process. Extensive experimental results on various datasets demonstrate that our approach outperforms the existing state-of-the-art methods, especially in challenging settings like datasets with non-isometric deformation and inconsistent topology.
CVJul 26, 2018
Learning to predict crisp boundariesRuoxi Deng, Chunhua Shen, Shengjun Liu et al.
Recent methods for boundary or edge detection built on Deep Convolutional Neural Networks (CNNs) typically suffer from the issue of predicted edges being thick and need post-processing to obtain crisp boundaries. Highly imbalanced categories of boundary versus background in training data is one of main reasons for the above problem. In this work, the aim is to make CNNs produce sharp boundaries without post-processing. We introduce a novel loss for boundary detection, which is very effective for classifying imbalanced data and allows CNNs to produce crisp boundaries. Moreover, we propose an end-to-end network which adopts the bottom-up/top-down architecture to tackle the task. The proposed network effectively leverages hierarchical features and produces pixel-accurate boundary mask, which is critical to reconstruct the edge map. Our experiments illustrate that directly making crisp prediction not only promotes the visual results of CNNs, but also achieves better results against the state-of-the-art on the BSDS500 dataset (ODS F-score of .815) and the NYU Depth dataset (ODS F-score of .762).