LG · OC · ML · Jul 7, 2023

Smoothing the Edges: Smooth Optimization for Sparse Regularization using Hadamard Overparametrization

arXiv:2307.03571v3 · 9 citations · h-index: 15
Originality: Incremental advance
AI Analysis

The method offers a general, differentiable approach to sparse regularization that applies across machine learning settings, from high-dimensional regression to neural networks. Its contribution is an incremental advance in optimization technique.

The paper addresses non-smooth, sparsity-regularized objectives by introducing a smooth surrogate based on Hadamard overparametrization. The surrogate is compatible with gradient descent, and the authors prove that it has the same global and local minima as the original problem, so it introduces no spurious solutions.
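As an illustration of why a smooth penalty can induce a non-smooth one, the simplest instance of this idea can be written out for an L1 penalty. The notation here is an assumption for the sketch and not taken from the listing: a loss \mathcal{L}, a base parameter \beta = u \odot v (elementwise product), and a penalty weight \lambda > 0. The paper's framework is more general (structured sparsity, other penalties), so this should be read as a minimal worked case only.

```latex
\begin{align*}
\tfrac{1}{2}\left(u_j^2 + v_j^2\right) &\ge |u_j v_j| = |\beta_j|
  \quad \text{with equality iff } |u_j| = |v_j|, \\
\min_{u \odot v = \beta} \tfrac{\lambda}{2}\left(\|u\|_2^2 + \|v\|_2^2\right)
  &= \lambda \|\beta\|_1, \\
\min_{u, v}\; \mathcal{L}(u \odot v) + \tfrac{\lambda}{2}\left(\|u\|_2^2 + \|v\|_2^2\right)
  &= \min_{\beta}\; \mathcal{L}(\beta) + \lambda \|\beta\|_1 .
\end{align*}
```

The first line is the arithmetic-geometric mean inequality applied per coordinate; the second follows by choosing balanced factors; the third says the smooth problem in (u, v) and the non-smooth problem in \beta share the same optimal value.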

We present a framework for smooth optimization of explicitly regularized objectives for (structured) sparsity. These non-smooth and possibly non-convex problems typically rely on solvers tailored to specific models and regularizers. In contrast, our method enables fully differentiable and approximation-free optimization and is thus compatible with the ubiquitous gradient descent paradigm in deep learning. The proposed optimization transfer comprises an overparametrization of selected parameters and a change of penalties. In the overparametrized problem, smooth surrogate regularization induces non-smooth, sparse regularization in the base parametrization. We prove that the surrogate objective is equivalent in the sense that it not only has identical global minima but also matching local minima, thereby avoiding the introduction of spurious solutions. Additionally, our theory establishes results of independent interest regarding matching local minima for arbitrary, potentially unregularized, objectives. We comprehensively review sparsity-inducing parametrizations across different fields that are covered by our general theory, extend their scope, and propose improvements in several aspects. Numerical experiments further demonstrate the correctness and effectiveness of our approach on several sparse learning problems ranging from high-dimensional regression to sparse neural network training.
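The overparametrization described in the abstract can be tried on a toy lasso problem. The following is a minimal sketch, not the authors' implementation: it assumes a squared-error loss, a two-factor parametrization beta = u * v, and a plain L2 penalty on both factors. The function names `hadamard_lasso` and `soft_threshold`, the step size, and the iteration count are all hypothetical choices made for this illustration. The design matrix is chosen so that X.T @ X / n is the identity, in which case the lasso solution is known in closed form (soft-thresholding) and can serve as a reference.

```python
import numpy as np


def soft_threshold(z, lam):
    """Closed-form lasso solution when X.T @ X / n is the identity."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)


def hadamard_lasso(X, y, lam, lr=0.05, n_steps=5000, seed=0):
    """Plain gradient descent on the smooth surrogate

        1/(2n) * ||y - X (u * v)||^2 + lam/2 * (||u||^2 + ||v||^2),

    returning beta = u * v. Hyperparameters are illustrative assumptions.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    # Small random start; an exact zero start would be a stationary point.
    u = rng.normal(scale=0.1, size=p)
    v = rng.normal(scale=0.1, size=p)
    for _ in range(n_steps):
        beta = u * v
        grad_beta = X.T @ (X @ beta - y) / n  # gradient of the loss in beta
        grad_u = grad_beta * v + lam * u      # chain rule through u * v
        grad_v = grad_beta * u + lam * v
        u = u - lr * grad_u
        v = v - lr * grad_v
    return u * v


# Toy check on an orthogonal design (X.T @ X / n equals the identity).
n = 5
X = np.sqrt(n) * np.eye(n)
beta_true = np.array([3.0, -2.0, 0.05, 0.0, 0.0])
y = X @ beta_true
lam = 0.5

beta_gd = hadamard_lasso(X, y, lam)
beta_ref = soft_threshold(X.T @ y / n, lam)  # soft-thresholding reference

print(np.round(beta_gd, 4))
print(beta_ref)
```

In this toy setting the gradient descent iterates approach the soft-thresholded reference, with the small and zero coefficients shrinking toward zero even though the optimized objective is smooth. Entries reach zero only up to numerical tolerance, so in practice a small cutoff is often applied afterwards; that post-processing step is an assumption here, not something stated in the abstract.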

