OCLGDec 14, 2020

On the Treatment of Optimization Problems with L1 Penalty Terms via Multiobjective Continuation

arXiv:2012.07483v218 citations
AI Analysis

This work provides a more complete understanding of sparsity for researchers and practitioners working with L1-regularized optimization problems, particularly in fields like machine learning and signal processing, by addressing the limitations of traditional weighting methods for non-convex problems.

This paper introduces a novel algorithm to analyze the effects of sparsity in linear and nonlinear optimization problems by directly solving the multiobjective optimization problem (MOP) that minimizes the main objective and the L1-norm simultaneously. This approach provides a more comprehensive understanding of sparsity compared to traditional weighted penalty methods, especially for non-convex nonlinear objectives where weighting methods may fail to find all optimal compromises.

We present a novel algorithm that allows us to gain detailed insight into the effects of sparsity in linear and nonlinear optimization, which is of great importance in many scientific areas such as image and signal processing, medical imaging, compressed sensing, and machine learning (e.g., for the training of neural networks). Sparsity is an important feature to ensure robustness against noisy data, but also to find models that are interpretable and easy to analyze due to the small number of relevant terms. It is common practice to enforce sparsity by adding the $\ell_1$-norm as a weighted penalty term. In order to gain a better understanding and to allow for an informed model selection, we directly solve the corresponding multiobjective optimization problem (MOP) that arises when we minimize the main objective and the $\ell_1$-norm simultaneously. As this MOP is in general non-convex for nonlinear objectives, the weighting method will fail to provide all optimal compromises. To avoid this issue, we present a continuation method which is specifically tailored to MOPs with two objective functions one of which is the $\ell_1$-norm. Our method can be seen as a generalization of well-known homotopy methods for linear regression problems to the nonlinear case. Several numerical examples - including neural network training - demonstrate our theoretical findings and the additional insight that can be gained by this multiobjective approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes