Frobenius-Type Norms and Inner Products of Matrices and Linear Maps with Applications to Neural Network Training
This work provides a theoretical extension that could improve optimization in neural network training, but it appears incremental as it builds on existing norm concepts without introducing a new paradigm.
The paper generalizes the Frobenius norm and inner product for matrices and linear maps, showing that the classical version is a special case within a broader family, and demonstrates that this additional flexibility can be used to precondition neural network training.
The Frobenius norm is a frequent choice of norm for matrices. In particular, the underlying Frobenius inner product is typically used to evaluate the gradient of an objective with respect to a matrix variable, such as those occurring in the training of neural networks. We provide a broader view of the Frobenius norm and inner product for linear maps or matrices, and establish their dependence on inner products in the domain and co-domain spaces. This shows that the classical Frobenius norm is merely one special element of a family of more general Frobenius-type norms. The significant extra freedom furnished by this realization can be used, among other things, to precondition neural network training.
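To make the dependence on the domain and co-domain inner products concrete, the following is a minimal NumPy sketch. It assumes the standard Hilbert-Schmidt construction, which may differ in notation or detail from the paper's own: the domain R^n and co-domain R^m carry inner products given by symmetric positive definite matrices, here called M_dom and M_cod, and the inner product of two m-by-n matrices A and B is trace(M_dom^{-1} A^T M_cod B). All function and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np


def frobenius_type_inner(A, B, M_dom, M_cod):
    """Weighted inner product trace(M_dom^{-1} A^T M_cod B).

    A, B  : (m, n) matrices, viewed as linear maps R^n -> R^m.
    M_dom : (n, n) SPD matrix defining the inner product on the domain.
    M_cod : (m, m) SPD matrix defining the inner product on the co-domain.
    (Assumed Hilbert-Schmidt form, not taken verbatim from the paper.)
    """
    return np.trace(np.linalg.solve(M_dom, A.T @ M_cod @ B))


def frobenius_type_norm(A, M_dom, M_cod):
    """Norm induced by frobenius_type_inner."""
    return np.sqrt(frobenius_type_inner(A, A, M_dom, M_cod))


def riesz_gradient(G, M_dom, M_cod):
    """Map the Euclidean gradient G to the gradient w.r.t. the weighted product.

    The result G_w = M_cod^{-1} G M_dom satisfies
    frobenius_type_inner(G_w, D, M_dom, M_cod) == trace(G^T D) for every D,
    so both gradients represent the same directional derivative.
    """
    return np.linalg.solve(M_cod, G @ M_dom)


def random_spd(n, rng):
    """Helper: a well-conditioned random SPD matrix of size n."""
    Q = rng.standard_normal((n, n))
    return Q @ Q.T + n * np.eye(n)
```

With M_dom and M_cod both set to identity matrices, frobenius_type_inner reduces to the classical Frobenius inner product trace(A^T B), and riesz_gradient returns G unchanged. For other choices, a gradient step along riesz_gradient(G, M_dom, M_cod) instead of G acts as a preconditioned step, which is one way to read the abstract's remark on preconditioning under the assumptions above.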