Moreau Envelope Based Difference-of-weakly-Convex Reformulation and Algorithm for Bilevel Programs
This work addresses hyperparameter tuning in machine learning, offering a more flexible bilevel optimization approach, though it is incremental as it builds on existing difference-of-convex methods.
The paper tackles bilevel programming for hyperparameter selection by introducing a Moreau envelope-based reformulation that requires convexity only in the lower-level variables, expanding applicability beyond prior methods, and demonstrates effectiveness through numerical experiments on kernel support vector machines with simulated data.
Bilevel programming has emerged as a valuable tool for hyperparameter selection, a central concern in machine learning. In a recent study by Ye et al. (2023), a value function-based difference of convex algorithm was introduced to address bilevel programs. This approach proves particularly powerful when dealing with scenarios where the lower-level problem exhibits convexity in both the upper-level and lower-level variables. Examples of such scenarios include support vector machines and $\ell_1$ and $\ell_2$ regularized regression. In this paper, we significantly expand the range of applications, now requiring convexity only in the lower-level variables of the lower-level program. We present an innovative single-level difference of weakly convex reformulation based on the Moreau envelope of the lower-level problem. We further develop a sequentially convergent Inexact Proximal Difference of Weakly Convex Algorithm (iP-DwCA). To evaluate the effectiveness of the proposed iP-DwCA, we conduct numerical experiments focused on tuning hyperparameters for kernel support vector machines on simulated data.