Cauchy-Schwarz Regularizers
This work addresses the need for adaptable regularization in optimization problems, particularly in machine learning and large-scale applications. It appears incremental in that it builds on existing regularization concepts.
The authors address the design of versatile regularization functions for optimization by introducing Cauchy-Schwarz regularizers, which promote properties such as discrete-valued vectors and orthogonal matrices. They demonstrate the regularizers' efficacy in two applications: solving underdetermined systems of linear equations and quantizing neural network weights.
We introduce a novel class of regularization functions, called Cauchy-Schwarz (CS) regularizers, which can be designed to induce a wide range of properties in solution vectors of optimization problems. To demonstrate the versatility of CS regularizers, we derive regularization functions that promote discrete-valued vectors, eigenvectors of a given matrix, and orthogonal matrices. The resulting CS regularizers are simple, differentiable, and can be free of spurious stationary points, making them suitable for gradient-based solvers and large-scale optimization problems. In addition, CS regularizers automatically adapt to the appropriate scale, which is, for example, beneficial when discretizing the weights of neural networks. To demonstrate the efficacy of CS regularizers, we provide results for solving underdetermined systems of linear equations and weight quantization in neural networks. Furthermore, we discuss specializations, variations, and generalizations, which lead to an even broader class of new and possibly more powerful regularizers.
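As a rough illustration of the idea, the following Python sketch builds one regularizer of this kind from the Cauchy-Schwarz inequality. The abstract does not state a formula, so the construction here is an assumption: it applies the inequality to the entrywise square of x and the all-ones vector, and the function names cs_binary_regularizer and cs_binary_gradient are hypothetical. The resulting gap is nonnegative and vanishes exactly when all entries of x share the same magnitude, that is, when x lies in {-a, +a}^N for some scale a >= 0. This hints at how such a function can promote discrete values without fixing the scale in advance.

```python
import numpy as np


def cs_binary_regularizer(x):
    """Cauchy-Schwarz gap between u = x**2 (entrywise) and the all-ones vector.

    Assumed illustrative form, not taken from the source text:
        r(x) = ||u||^2 * ||1||^2 - <u, 1>^2 = N * sum(x^4) - (sum(x^2))^2
    By Cauchy-Schwarz, r(x) >= 0, with equality iff u is a multiple of 1.
    """
    x = np.asarray(x, dtype=float)
    u = x * x
    n = x.size
    return n * np.sum(u * u) - np.sum(u) ** 2


def cs_binary_gradient(x):
    """Closed-form gradient of cs_binary_regularizer.

    d/dx_i [N * sum(x^4) - (sum(x^2))^2] = 4 N x_i^3 - 4 (sum(x^2)) x_i
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    return 4.0 * n * x**3 - 4.0 * np.sum(x * x) * x


if __name__ == "__main__":
    # Equal-magnitude vectors at two different scales: the gap is (numerically) zero.
    print(cs_binary_regularizer([0.3, -0.3, 0.3]))
    print(cs_binary_regularizer([3.0, -3.0, 3.0, 3.0]))
    # Entries with unequal magnitudes: the gap is strictly positive.
    print(cs_binary_regularizer([1.0, 2.0, 3.0]))
```

Because this function is a polynomial in x, its gradient is available in closed form, so it could be added to a data-fidelity term and passed to a gradient-based solver.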