LG · Sep 29, 2025

Efficient Hyperparameter Tuning via Trajectory Invariance Principle

arXiv:2509.25049v1 · 1 citation · h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses efficient hyperparameter tuning for machine learning practitioners, offering incremental improvements by refining scaling laws and challenging existing viewpoints.

The paper tackles the problem of costly hyperparameter tuning at scale by identifying a trajectory invariance phenomenon that reduces the tuning space from two dimensions to one, enabling an efficient tuning rule.

As hyperparameter tuning becomes increasingly costly at scale, efficient tuning methods are essential. Yet principles for guiding hyperparameter tuning remain limited. In this work, we seek to establish such principles by considering a broad range of hyperparameters, including batch size, learning rate, and weight decay. We identify a phenomenon we call trajectory invariance, where pre-training loss curves, gradient noise, and gradient norm exhibit invariance (that is, they closely overlap) with respect to a quantity that combines learning rate and weight decay. This phenomenon effectively reduces the original two-dimensional hyperparameter space to one dimension, yielding an efficient tuning rule: follow the salient direction revealed by trajectory invariance. Furthermore, we refine previous scaling laws and challenge several existing viewpoints. Overall, our work proposes new principles for efficient tuning and inspires future research on scaling laws.
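To make the two-dimensions-to-one reduction concrete, here is a minimal sketch in Python. It rests on an assumption the abstract does not confirm: the combined quantity is taken to be the product of learning rate and weight decay, purely for illustration. The function names (`combined`, `sweep_1d`) and the grid values are hypothetical and are not taken from the paper.

```python
from itertools import product


def combined(lr, wd):
    # ASSUMPTION: the product form is used only for illustration;
    # the abstract does not state how the two hyperparameters are combined.
    return lr * wd


def sweep_1d(base_lr, targets):
    # Hold the learning rate fixed and pick the weight decay that
    # reaches each target value of the combined quantity.
    return [(base_lr, t / base_lr) for t in targets]


# Hypothetical search grid over (learning rate, weight decay).
lrs = [1e-3, 2e-3, 4e-3]
wds = [0.05, 0.1, 0.2]

# Conventional 2D grid: one run per (lr, wd) pair.
full = list(product(lrs, wds))

# Distinct values of the combined quantity covered by that grid
# (rounded to absorb floating-point noise).
targets = sorted({round(combined(lr, wd), 12) for lr, wd in full})

# 1D sweep: one run per distinct combined value.
line = sweep_1d(base_lr=2e-3, targets=targets)

print(len(full), "grid runs vs", len(line), "runs along the combined axis")
```

If runs that share the same combined value really do produce overlapping trajectories, as the abstract claims, then the shorter sweep would cover the same range of behaviour as the full grid. Whether that holds for a given model and optimizer would have to be checked empirically.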
