LG Sep 29, 2025

Efficient Hyperparameter Tuning via Trajectory Invariance Principle

Bingrui Li, Jiaxin Wen, Zhanpeng Zhou, Jun Zhu, Jianfei Chen

arXiv:2509.25049v1

7.11 citations

h-index: 31

Originality: Incremental advance

AI Analysis

This work addresses efficient hyperparameter tuning for machine learning practitioners, offering incremental improvements by refining scaling laws and challenging existing viewpoints.

The paper tackles the problem of costly hyperparameter tuning at scale by identifying a trajectory invariance phenomenon that reduces the tuning space from two dimensions to one, enabling an efficient tuning rule.

As hyperparameter tuning becomes increasingly costly at scale, efficient tuning methods are essential. Yet principles for guiding hyperparameter tuning remain limited. In this work, we seek to establish such principles by considering a broad range of hyperparameters, including batch size, learning rate, and weight decay. We identify a phenomenon we call trajectory invariance, where pre-training loss curves, gradient noise, and gradient norm exhibit invariance--closely overlapping--with respect to a quantity that combines learning rate and weight decay. This phenomenon effectively reduces the original two-dimensional hyperparameter space to one dimension, yielding an efficient tuning rule: follow the salient direction revealed by trajectory invariance. Furthermore, we refine previous scaling laws and challenge several existing viewpoints. Overall, our work proposes new principles for efficient tuning and inspires future research on scaling laws.
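The reduction from two dimensions to one can be illustrated with a small sketch. The abstract does not state the exact form of the quantity that combines learning rate and weight decay, so the sketch assumes, purely for illustration, that it is the product of the two. The names `combined_quantity` and `group_by_combined` are hypothetical and do not come from the paper. Under that assumption, grid points sharing the same product would be expected to yield closely overlapping trajectories, so one run per group would suffice.

```python
from collections import defaultdict
from math import log10


def combined_quantity(lr: float, wd: float) -> float:
    # ASSUMPTION: the combined quantity is the product lr * wd.
    # The abstract only says it "combines learning rate and weight decay".
    return lr * wd


def group_by_combined(lrs, wds, ndigits=6):
    """Group every (lr, wd) grid point by log10 of the assumed combined quantity.

    Rounding the log absorbs floating-point noise so that, for example,
    1e-4 * 1e-1 and 1e-3 * 1e-2 land in the same group.
    """
    groups = defaultdict(list)
    for lr in lrs:
        for wd in wds:
            key = round(log10(combined_quantity(lr, wd)), ndigits)
            groups[key].append((lr, wd))
    return dict(groups)


# A 3 x 3 grid: nine (lr, wd) configurations in the two-dimensional space.
lrs = [1e-4, 1e-3, 1e-2]
wds = [1e-2, 1e-1, 1.0]
groups = group_by_combined(lrs, wds)

for key in sorted(groups):
    print(f"log10(lr * wd) = {key:+.1f}: {groups[key]}")
```

In this toy grid the nine configurations collapse into five distinct values of the assumed quantity, which is the sense in which the search becomes one-dimensional.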
