Curvature Tuning: Provable Training-free Model Steering From a Single Parameter
This work addresses the need for more interpretable and parameter-efficient finetuning methods, offering an approach that complements existing weight-based techniques.
The paper proposes Curvature Tuning (CT), a finetuning method that injects a single hyperparameter into a pretrained model's activation functions to steer its decision boundaries. CT improves both accuracy and robustness: it boosts ResNet-50/152 accuracy by up to 8.46% over linear probing and raises robust accuracy on the RobustBench $\ell_\infty$ benchmark by over 1000%.
The scaling of model and data sizes has reshaped the AI landscape, establishing finetuning pretrained models as the standard paradigm for solving downstream tasks. However, dominant finetuning methods typically rely on weight adaptation, often lack interpretability, and depend on heuristically chosen hyperparameters. In this paper, we take a different perspective and shift the focus from weights to activation functions, viewing them through the lens of spline operators. We propose Curvature Tuning (CT), an interpretable and principled steering method that modulates a model's decision boundary by injecting a single hyperparameter into its activation functions. We show that CT provably adjusts model decision boundary curvature and, more fundamentally, projects a model onto a space of smooth functions, thereby complementing current finetuning methods, whose effect lies primarily in feature adaptation. Making this hyperparameter trainable gives rise to a novel and highly parameter-efficient finetuning method. Empirically, CT improves both generalization and robustness. For example, it boosts downstream accuracy of ResNet-50/152 by 7.14%/8.46% over linear probing and 4.64%/1.70% over LoRA across 12 datasets, and improves robust accuracy on the $\ell_\infty$ benchmark from RobustBench by 1032.64%/1494.46%. Our code is available at https://github.com/Leon-Leyang/curvature-tuning.
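To make the idea of steering a model through one activation parameter concrete, the following is a minimal sketch, not the paper's formulation. It assumes a temperature-scaled softplus as a stand-in for the smoothed activation. The names `smooth_relu`, `max_gap`, and `beta`, and the convention that `beta = 1` recovers ReLU exactly, are illustrative assumptions; the abstract does not specify the activation CT uses.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: piecewise linear with a kink at zero."""
    return max(x, 0.0)


def smooth_relu(x: float, beta: float) -> float:
    """Hypothetical smoothed ReLU controlled by one parameter (assumption).

    Uses a softplus with temperature t = 1 - beta:
        f(x) = t * log(1 + exp(x / t))
    written in the numerically stable form
        f(x) = max(x, 0) + t * log1p(exp(-|x| / t)).
    beta = 1 gives ReLU exactly; smaller beta rounds off the kink.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    t = 1.0 - beta
    if t == 0.0:
        return relu(x)
    return relu(x) + t * math.log1p(math.exp(-abs(x) / t))


def max_gap(beta: float, xs: list[float]) -> float:
    """Largest absolute difference between the smoothed activation and ReLU."""
    return max(abs(smooth_relu(x, beta) - relu(x)) for x in xs)


# Evaluate on a grid over [-3, 3] that includes x = 0.
xs = [i / 10.0 for i in range(-30, 31)]
for beta in (0.5, 0.9, 0.99, 1.0):
    print(f"beta={beta:<4}  max |smooth - relu| = {max_gap(beta, xs):.4f}")
```

In this sketch the gap to ReLU is largest at x = 0, where it equals (1 - beta) ln 2, so a single scalar controls how sharply the activation bends. Swapping such a function in for every ReLU of a frozen network changes the curvature of the learned function without touching any weights, which is the general kind of mechanism the abstract describes.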