LG · Jan 15

Unit-Consistent (UC) Adjoint for GSD and Backprop in Deep Learning Applications

arXiv:2601.10873v1

Originality: Incremental advance

AI Analysis

The work targets an optimization inefficiency in deep learning that is relevant to both researchers and practitioners, but the advance is incremental because it builds on prior rescaling-invariant schemes.

The paper tackles the problem of gradient descent not being equivariant to gauge symmetries in deep neural networks, which causes optimization to depend on arbitrary parameterizations, and introduces a Unit-Consistent adjoint to derive gauge-consistent steepest descent and backpropagation.

Deep neural networks constructed from linear maps and positively homogeneous nonlinearities (e.g., ReLU) possess a fundamental gauge symmetry: the network function is invariant to node-wise diagonal rescalings. However, standard gradient descent is not equivariant to this symmetry, causing optimization trajectories to depend heavily on arbitrary parameterizations. Prior work has proposed rescaling-invariant optimization schemes for positively homogeneous networks (e.g., path-based or path-space updates). Our contribution is complementary: we formulate the invariance requirement at the level of the backward adjoint/optimization geometry, which provides a simple, operator-level recipe that can be applied uniformly across network components and optimizer state. By replacing the Euclidean transpose with a Unit-Consistent (UC) adjoint, we derive UC gauge-consistent steepest descent and backpropagation.
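The symmetry and the failure of equivariance described in the abstract can be checked numerically on a toy case. The following is a minimal sketch, not the paper's method: it assumes a two-layer ReLU network without biases, a squared-error loss, and hand-picked weights and scales (all of these, and the names `forward`, `grads`, and `gd_step`, are illustrative choices not taken from the paper). It rescales the hidden nodes by a positive diagonal matrix D, confirms that the network output is unchanged, and then shows that one ordinary gradient step using the Euclidean transpose sends the two equivalent parameterizations to different functions. It does not implement the UC adjoint itself.

```python
import numpy as np

def forward(W1, W2, x):
    """Two-layer ReLU network without biases: f(x) = W2 @ relu(W1 @ x)."""
    return W2 @ np.maximum(W1 @ x, 0.0)

def grads(W1, W2, x, y):
    """Euclidean gradients of 0.5 * ||f(x) - y||^2 via ordinary backprop."""
    z = W1 @ x                      # hidden pre-activations
    h = np.maximum(z, 0.0)          # hidden activations
    r = W2 @ h - y                  # output residual
    gW2 = np.outer(r, h)
    gz = (W2.T @ r) * (z > 0)       # backward pass uses the plain transpose
    gW1 = np.outer(gz, x)
    return gW1, gW2

def gd_step(W1, W2, x, y, lr):
    """One step of standard gradient descent on both weight matrices."""
    gW1, gW2 = grads(W1, W2, x, y)
    return W1 - lr * gW1, W2 - lr * gW2

# Hand-picked toy values (assumptions for illustration only).
W1 = np.array([[1.0, 0.0], [0.0, 1.0]])
W2 = np.array([[1.0, 1.0]])
x = np.array([1.0, 2.0])
y = np.array([0.0])

# Node-wise rescaling of the hidden layer: W1 -> D W1, W2 -> W2 D^{-1}, D > 0.
d = np.array([2.0, 0.5])
W1s = d[:, None] * W1
W2s = W2 / d[None, :]

# Gauge symmetry: both parameterizations compute the same function.
same_function = np.allclose(forward(W1, W2, x), forward(W1s, W2s, x))

# Non-equivariance: after one gradient step the two functions differ.
A1, A2 = gd_step(W1, W2, x, y, lr=0.01)
B1, B2 = gd_step(W1s, W2s, x, y, lr=0.01)
same_after_step = np.allclose(forward(A1, A2, x), forward(B1, B2, x))

print("same function before the step:", same_function)
print("same function after one GD step:", same_after_step)
```

In this toy setting the first check passes and the second fails, which is the dependence on arbitrary parameterization that a gauge-consistent update rule is meant to remove.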
