LG AI, Oct 28, 2025

Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT

Da Chang, Peng Xue, Yu Li, Yongxiang Liu, Pengxiang Xu, Shixun Zhang

arXiv:2511.00051v211.43 citations, h-index: 2, Has Code

Originality: Incremental advance

AI Analysis

This work targets the computational cost of adapting large pre-trained models, a bottleneck for both researchers and practitioners, and offers an incremental improvement over existing PEFT methods.

The paper tackles two weaknesses of DoRA in parameter-efficient fine-tuning, its computational overhead and its unclear mechanism, by reformulating DoRA into an equivalent, more efficient matrix form and proposing a unified framework for designing PEFT methods. It introduces two methods, Pre-Diag and SORA, which the authors report outperform LoRA and DoRA in both task performance and efficiency on natural language understanding and generation tasks.
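To make the "matrix form" idea concrete, here is a minimal NumPy sketch. It is not the paper's code: it assumes DoRA's published merge rule, W' = m * (W0 + BA) / ||W0 + BA||_c with a column-wise norm, and checks that this equals right-multiplying W0 + BA by a diagonal matrix. The function names, shapes, and random test values are illustrative assumptions.

```python
import numpy as np

def dora_weight(W0, B, A, m):
    """DoRA-style merge: rescale each column of W0 + BA to magnitude m[j]."""
    V = W0 + B @ A                          # pre-trained weight plus low-rank update
    col_norms = np.linalg.norm(V, axis=0)   # one L2 norm per column, shape (k,)
    return m * (V / col_norms)              # broadcasting scales column j by m[j] / norm[j]

def dora_weight_matrix_form(W0, B, A, m):
    """The same merge written as (W0 + BA) @ D with D diagonal."""
    V = W0 + B @ A
    D = np.diag(m / np.linalg.norm(V, axis=0))  # diagonal conditioning matrix
    return V @ D

# Hypothetical small shapes, chosen only for illustration.
rng = np.random.default_rng(0)
d, k, r = 6, 4, 2
W0 = rng.standard_normal((d, k))
B = rng.standard_normal((d, r))
A = rng.standard_normal((r, k))
m = rng.uniform(0.5, 2.0, size=k)           # learnable magnitudes in DoRA

W_a = dora_weight(W0, B, A, m)
W_b = dora_weight_matrix_form(W0, B, A, m)
print("max abs difference:", np.abs(W_a - W_b).max())
```

Seen this way, the magnitude/direction decomposition is a diagonal rescaling of the adapted weight, which is the sense in which the abstract describes DoRA as a learnable weight conditioning method. How the paper's own formulation reduces training cost is not shown on this page.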

Parameter-Efficient Fine-Tuning (PEFT) methods are crucial for adapting large pre-trained models. Among these, LoRA is considered a foundational approach. Building on this, the influential DoRA method enhances performance by decomposing weight updates into magnitude and direction. However, its underlying mechanism remains unclear, and it introduces significant computational overhead. In this work, we first identify that DoRA's success stems from its capacity to increase the singular value entropy of the weight update matrix, which promotes a more uniform update distribution akin to full fine-tuning. We then reformulate DoRA into a mathematically equivalent and more efficient matrix form, revealing it as a learnable weight conditioning method. Based on this insight, we propose a unified framework for designing advanced PEFT methods by exploring two orthogonal dimensions: the architectural placement and the transformation type of the conditioning matrix. Within this framework, we introduce two novel methods: (1) Pre-Diag, which applies a diagonal conditioning matrix before the LoRA update to efficiently calibrate the pre-trained weights, thereby enhancing performance while reducing training time; and (2) Skewed Orthogonal Rotation Adaptation (SORA), which employs a parameter-efficient orthogonal rotation to perform a more powerful, norm-preserving transformation of the feature space. Extensive experiments on natural language understanding and generation tasks demonstrate that our proposed methods achieve superior performance and efficiency compared to both LoRA and DoRA. The code is available at https://github.com/MaeChd/SORA.
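The abstract's central diagnostic, singular value entropy, can be sketched as follows. The paper's exact definition is not given on this page, so this assumes the Shannon entropy of the singular values normalized to sum to 1. The helper name `singular_value_entropy` and the test matrices are hypothetical.

```python
import numpy as np

def singular_value_entropy(M, eps=1e-12):
    """Shannon entropy of the normalized singular value distribution of M.

    Assumed definition: p_i = sigma_i / sum(sigma), H = -sum(p_i * log(p_i)).
    """
    s = np.linalg.svd(M, compute_uv=False)  # singular values, non-negative
    p = s / (s.sum() + eps)                 # normalize to a probability vector
    p = p[p > eps]                          # drop numerically zero entries
    return float(-(p * np.log(p)).sum())

rng = np.random.default_rng(0)
n, r = 8, 2

# A rank-1 update concentrates all mass on one singular value: entropy near 0.
rank_one = np.outer(rng.standard_normal(n), rng.standard_normal(n))

# A rank-r LoRA-style update B @ A has at most r nonzero singular values,
# so its entropy is bounded above by log(r).
low_rank = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))

# The identity spreads mass uniformly over n singular values: entropy log(n).
uniform = np.eye(n)

h_rank_one = singular_value_entropy(rank_one)
h_low_rank = singular_value_entropy(low_rank)
h_uniform = singular_value_entropy(uniform)
print(h_rank_one, h_low_rank, h_uniform)
```

Under this definition, higher entropy means the update is spread across more singular directions, which matches the abstract's description of a "more uniform update distribution akin to full fine-tuning".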
