CRApr 21
SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability DetectionZhengyang Shan, Xu Qian, Jiayun Xin et al.
Software vulnerabilities are a primary threat to modern infrastructure. While static analysis and Graph Neural Networks have long served as the foundation for vulnerability detection, the emergence of Large Language Models (LLMs) has introduced a transformative paradigm driven by superior semantic reasoning and cross-environment generalization. However, in the context of LLM-based vulnerability detection, we identify a fundamental bottleneck in these models termed \textbf{Signal Submersion}: a state where features related to vulnerability are activated internally but numerically overwhelmed by dominant functional semantics. To address this, we propose \textbf{SAGE} (\textbf{S}ignal-\textbf{A}mplified \textbf{G}uided \textbf{E}mbeddings), a framework that shifts from passive signal submersion to active signal recovery. SAGE integrates task-conditional Sparse Autoencoders (SAEs) to isolate and amplify these faint vulnerability signals. Extensive evaluations on BigVul, PrimeVul, and PreciseBugs demonstrate that SAGE achieves state-of-the-art performance. Notably, SAGE mitigates Signal Submersion by increasing the internal Signal-to-Noise Ratio (SNR) by 12.7$\times$ via sparse manifold projection. This mechanistic intervention enables a 7B model to achieve up to 318\% Matthews Correlation Coefficient (MCC) gains on unseen distributions and a 319\% gain on classic datasets. By maintaining robust performance across 13 programming languages and outperforming 34B baselines, SAGE establishes a more efficient and scalable path to software security than simple parameter scaling.
CVAug 19, 2020
Channel-wise Hessian Aware trace-Weighted Quantization of Neural NetworksXu Qian, Victor Li, Crews Darren
Second-order information has proven to be very effective in determining the redundancy of neural network weights and activations. Recent paper proposes to use Hessian traces of weights and activations for mixed-precision quantization and achieves state-of-the-art results. However, prior works only focus on selecting bits for each layer while the redundancy of different channels within a layer also differ a lot. This is mainly because the complexity of determining bits for each channel is too high for original methods. Here, we introduce Channel-wise Hessian Aware trace-Weighted Quantization (CW-HAWQ). CW-HAWQ uses Hessian trace to determine the relative sensitivity order of different channels of activations and weights. What's more, CW-HAWQ proposes to use deep Reinforcement learning (DRL) Deep Deterministic Policy Gradient (DDPG)-based agent to find the optimal ratios of different quantization bits and assign bits to channels according to the Hessian trace order. The number of states in CW-HAWQ is much smaller compared with traditional AutoML based mix-precision methods since we only need to search ratios for the quantization bits. Compare CW-HAWQ with state-of-the-art shows that we can achieve better results for multiple networks.
LGJan 24, 2019
SAGA with Arbitrary SamplingXu Qian, Zheng Qu, Peter Richtárik
We study the problem of minimizing the average of a very large number of smooth functions, which is of key importance in training supervised learning models. One of the most celebrated methods in this context is the SAGA algorithm. Despite years of research on the topic, a general-purpose version of SAGA---one that would include arbitrary importance sampling and minibatching schemes---does not exist. We remedy this situation and propose a general and flexible variant of SAGA following the {\em arbitrary sampling} paradigm. We perform an iteration complexity analysis of the method, largely possible due to the construction of new stochastic Lyapunov functions. We establish linear convergence rates in the smooth and strongly convex regime, and under a quadratic functional growth condition (i.e., in a regime not assuming strong convexity). Our rates match those of the primal-dual method Quartz for which an arbitrary sampling analysis is available, which makes a significant step towards closing the gap in our understanding of complexity of primal and dual methods for finite sum problems.