Yue Song

h-index: 25

Papers: 5

Citations: 81

Novelty: 56%

AI Score: 53

Ranked #11,719 of 194,257 authors (top 6%); #3,036 in LG (top 8%)

5 Papers

3.8 | LG | Nov 23, 2023 | Code

RankFeat&RankWeight: Rank-1 Feature/Weight Removal for Out-of-distribution Detection

Yue Song, Wei Wang, Nicu Sebe

The task of out-of-distribution (OOD) detection is crucial for deploying machine learning models in real-world settings. In this paper, we observe that the singular value distributions of the in-distribution (ID) and OOD features are quite different: the OOD feature matrix tends to have a larger dominant singular value than the ID feature, and the class predictions of OOD samples are largely determined by it. This observation motivates us to propose RankFeat, a simple yet effective post hoc approach that detects OOD samples by removing the rank-1 matrix composed of the largest singular value and the associated singular vectors from the high-level feature. RankFeat achieves state-of-the-art performance and reduces the average false positive rate (FPR95) by 17.90% compared with the previous best method. The success of RankFeat motivates us to investigate whether a similar phenomenon exists in the parameter matrices of neural networks. We thus propose RankWeight, which removes the rank-1 weight from the parameter matrices of a single deep layer. RankWeight is also post hoc and only requires computing the rank-1 matrix once. As a standalone approach, RankWeight is very competitive with other methods across various backbones. Moreover, RankWeight is compatible with a wide range of OOD detection methods. The combination of RankWeight and RankFeat sets a new state of the art, achieving an FPR95 as low as 16.13% on the ImageNet-1k benchmark. Extensive ablation studies and comprehensive theoretical analyses are presented to support the empirical results. Code is publicly available at https://github.com/KingJamesSong/RankFeat.
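The rank-1 removal described in this abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (see the linked repository for that): the function names are hypothetical, the feature matrix is a toy list of lists, and the dominant singular triplet is estimated with plain power iteration, which is an assumption made here to avoid external libraries.

```python
import math


def dominant_singular_triplet(X, iters=200):
    """Estimate (sigma, u, v) for the largest singular value of X.

    Uses power iteration on X^T X. Hypothetical helper for illustration;
    assumes the starting vector is not orthogonal to the dominant direction.
    """
    rows, cols = len(X), len(X[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        # u = X v, then w = X^T u, i.e. one multiplication by X^T X
        u = [sum(X[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(X[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(X[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v


def remove_rank1(X):
    """Return X minus the rank-1 matrix sigma * u * v^T."""
    sigma, u, v = dominant_singular_triplet(X)
    return [
        [X[i][j] - sigma * u[i] * v[j] for j in range(len(X[0]))]
        for i in range(len(X))
    ]


# Toy "feature matrix": the dominant singular value is 3, along the first axis.
features = [[3.0, 0.0], [0.0, 1.0]]
reduced = remove_rank1(features)
```

In this toy case the first row and column are zeroed out while the weaker direction is left intact. How the reduced feature is then scored for OOD detection is not stated in the abstract and is not shown here.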

5.1 | DG | Jul 2, 2024 | Code

Fast and Stable Riemannian Metrics on SPD Manifolds via Cholesky Product Geometry

Ziheng Chen, Yue Song, Xiao-Jun Wu et al.

Recent advances in Symmetric Positive Definite (SPD) matrix learning show that Riemannian metrics are fundamental to effective SPD neural networks. Motivated by this, we revisit the geometry of the Cholesky factors and uncover a simple product structure that enables convenient metric design. Building on this insight, we propose two fast and stable SPD metrics, Power-Cholesky Metric (PCM) and Bures-Wasserstein-Cholesky Metric (BWCM), derived via Cholesky decomposition. Compared with existing SPD metrics, the proposed metrics provide closed-form operators, computational efficiency, and improved numerical stability. We further apply our metrics to construct Riemannian Multinomial Logistic Regression (MLR) classifiers and residual blocks for SPD neural networks. Experiments on SPD deep learning, numerical stability analyses, and tensor interpolation demonstrate the effectiveness, efficiency, and robustness of our metrics. The code is available at https://github.com/GitZH-Chen/PCM_BWCM.

9.4 | SY | May 24

Consensus Tracking of Perturbed Open Multi-Agent Systems with Repelling Antagonistic Interactions

Mengqi Xue, Yuchao Xiong, Yue Song

An open multi-agent system (OMAS) features migrating agents which produce a flexible network that is naturally switching and size-varying. Meanwhile, agent migrations also make an OMAS prone to environmental adversities. In this work, we investigate the consensus tracking problem of OMASs suffering migration-induced adversities, including non-vanishing agent dynamics/state perturbations and repelling antagonistic interactions among agents, over an intermittently disconnected signed digraph. The OMAS is modeled as a perturbed multi-mode multi-dimensional ($M^3D$) system in which unstable subsystems are created when repelling interactions dominate the cooperative ones in the network, regardless of its connectivity. To handle the destabilizing effect of repelling interactions and non-vanishing perturbations, we extend the stability theory for $M^3D$ systems and apply it to the OMAS to show that ultimately bounded consensus tracking can be achieved if the network switching satisfies the piecewise average dwell time and activation time ratio conditions. In particular, for vanishing perturbations, asymptotic tracking can be ensured under weaker switching conditions.

14.2 | LG | Mar 17, 2024 | Code

A Lie Group Approach to Riemannian Batch Normalization

Ziheng Chen, Yue Song, Yunmei Liu et al.

Manifold-valued measurements exist in numerous applications within computer vision and machine learning. Recent studies have extended Deep Neural Networks (DNNs) to manifolds, and concomitantly, normalization techniques have also been adapted to several manifolds, referred to as Riemannian normalization. Nonetheless, most of the existing Riemannian normalization methods have been derived in an ad hoc manner and only apply to specific manifolds. This paper establishes a unified framework for Riemannian Batch Normalization (RBN) techniques on Lie groups. Our framework offers the theoretical guarantee of controlling both the Riemannian mean and variance. Empirically, we focus on Symmetric Positive Definite (SPD) manifolds, which possess three distinct types of Lie group structures. Using the deformation concept, we generalize the existing Lie groups on SPD manifolds into three families of parameterized Lie groups. Specific normalization layers induced by these Lie groups are then proposed for SPD neural networks. We demonstrate the effectiveness of our approach through three sets of experiments: radar recognition, human action recognition, and electroencephalography (EEG) classification. The code is available at https://github.com/GitZH-Chen/LieBN.git.

28.1 | AI | Jun 10, 2024 | Code

Aligning Large Language Models with Representation Editing: A Control Perspective

Lingkai Kong, Haorui Wang, Wenhao Mu et al.

Aligning large language models (LLMs) with human objectives is crucial for real-world applications. However, fine-tuning LLMs for alignment often suffers from unstable training and requires substantial computing resources. Test-time alignment techniques, such as prompting and guided decoding, do not modify the underlying model, and their performance remains dependent on the original model's capabilities. To address these challenges, we propose aligning LLMs through representation editing. The core of our method is to view a pre-trained autoregressive LLM as a discrete-time stochastic dynamical system. To achieve alignment for specific objectives, we introduce external control signals into the state space of this language dynamical system. We train a value function directly on the hidden states according to the Bellman equation, enabling gradient-based optimization to obtain the optimal control signals at test time. Our experiments demonstrate that our method outperforms existing test-time alignment techniques while requiring significantly fewer resources compared to fine-tuning methods. Our code is available at https://github.com/Lingkai-Kong/RE-Control.
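The test-time step described in this abstract, gradient-based optimization of a control signal added to a hidden state, can be pictured with a toy sketch. Everything below is a hypothetical stand-in rather than the RE-Control code: the value function is a fixed quadratic instead of a network trained with the Bellman equation, the hidden state is a short list of floats, and the names `value`, `value_grad`, and `steer` are made up for illustration.

```python
def value(h, target):
    """Hypothetical stand-in for a trained value function on hidden states.

    Higher is better; it peaks when h equals the (made-up) target state.
    """
    return -sum((x - t) ** 2 for x, t in zip(h, target))


def value_grad(h, target):
    """Gradient of the toy value function with respect to h."""
    return [-2.0 * (x - t) for x, t in zip(h, target)]


def steer(h, target, lr=0.1, steps=50):
    """Gradient ascent on a control signal u so that value(h + u) increases.

    The model's own hidden state h is left unchanged; only the additive
    control signal u is optimized.
    """
    u = [0.0] * len(h)
    for _ in range(steps):
        edited = [x + c for x, c in zip(h, u)]
        g = value_grad(edited, target)
        u = [c + lr * gi for c, gi in zip(u, g)]
    return [x + c for x, c in zip(h, u)], u


hidden = [0.0, 0.0]      # toy hidden state
preferred = [1.0, 2.0]   # toy optimum of the value function
edited, control = steer(hidden, preferred)
```

The sketch only shows the control loop: the underlying model weights are never touched, which is the property the abstract contrasts with fine-tuning.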