AIApr 18

Understanding and Enforcing Weight Disentanglement in Task Arithmetic

arXiv:2604.1707859.9h-index: 7Has Code
AI Analysis

For researchers and practitioners using task arithmetic for model editing, this work provides a theoretical foundation and a practical method to enhance performance, though it is an incremental improvement over existing techniques.

The paper introduces Task-Feature Specialization (TFS) as the fundamental principle behind weight disentanglement in task arithmetic, proving that TFS leads to both disentanglement and weight vector orthogonality. They propose OrthoReg, a regularization method that enforces orthogonality on weight updates during fine-tuning, consistently improving task arithmetic performance across various methods.

Task arithmetic provides an efficient, training-free way to edit pre-trained models, yet lacks a fundamental theoretical explanation for its success. The existing concept of ``weight disentanglement" describes the ideal outcome of non-interfering task composition but does not reveal its underlying cause. Crucially, what intrinsic properties of the pre-trained model ($θ_0$) or the task vectors ($τ_t$) enable this disentanglement remains underexplored. In this paper, we introduce Task-Feature Specialization (TFS), a model's ability to allocate distinct internal features to different tasks, as the fundamental principle. We first prove that TFS is a sufficient condition for weight disentanglement. More importantly, we find that TFS also gives rise to an observable geometric consequence: weight vector orthogonality. This positions TFS as the common cause for both the desired functional outcome (disentanglement) and a measurable geometric property (orthogonality). This relationship provides the key insight for our method: since the abstract TFS property is intractable to enforce directly, we can instead promote weight disentanglement by shaping its concrete geometric consequence, orthogonality. Therefore, we propose OrthoReg, a simple and effective regularization method that actively enforces an internal orthogonal structure on weight updates ($ΔW$) that constitute $τ_t$ during fine-tuning. And we theoretically prove that OrthoReg promotes disentanglement. Extensive experiments demonstrate that OrthoReg consistently and significantly enhances the performance of various task arithmetic methods. Code is available at \href{https://github.com/RL-MIND/OrthoReg}{https://github.com/RL-MIND/OrthoReg}.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes