OC LGJun 13, 2022

Value Function Based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Lucy Gao, Jane J. Ye, Haian Yin, Shangzhi Zeng, Jin Zhang

arXiv:2206.05976v117.236 citationsh-index: 16Has Code

Originality Incremental advance

AI Analysis

This addresses hyperparameter selection for machine learning practitioners by providing a more broadly applicable optimization method, though it is incremental as it builds on existing bilevel programming approaches.

The paper tackles the problem of hyperparameter tuning in machine learning when theoretical convergence conditions are not satisfied, proposing a Value Function based Difference-of-Convex Algorithm (VF-iDCA) that achieves stationary solutions without strong convexity and smoothness assumptions, with experiments showing superior performance.

Gradient-based optimization methods for hyperparameter tuning guarantee theoretical convergence to stationary solutions when for fixed upper-level variable values, the lower level of the bilevel program is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this algorithm achieves stationary solutions without LLSC and LLS assumptions for bilevel programs from a broad class of hyperparameter tuning applications. Our extensive experiments confirm our theoretical findings and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.

View on arXiv PDF Code

Similar