Line Search for Convex Minimization
This work addresses a gap in optimization algorithms for researchers and practitioners who need efficient line search in convex minimization. It is incremental in that it builds on the existing bisection and golden-section methods.
The paper tackles the lack of principled exact line search algorithms for general convex functions. It proposes two new algorithms, Δ-Bisection and Δ-Secant, which use convexity to speed up convergence; in experiments they are often more than twice as fast as existing quasiconvex methods.
Golden-section search and bisection search are the two main principled algorithms for one-dimensional minimization of quasiconvex (unimodal) functions. The former uses only function queries, while the latter also uses gradient queries. Other algorithms, such as Newton's method, exist under much stronger assumptions. However, to the best of our knowledge, there is no principled exact line search algorithm for general convex functions (including piecewise-linear functions and max-compositions of convex functions) that takes advantage of convexity. We propose two such algorithms: $Δ$-Bisection is a variant of bisection search that uses (sub)gradient information and convexity to speed up convergence, while $Δ$-Secant is a variant of golden-section search that uses only function queries. While bisection search reduces the $x$ interval by a factor of 2 at every iteration, $Δ$-Bisection reduces the (sometimes much) smaller $x^*$-gap $Δ^x$ (the $x$ coordinates of $Δ$) by at least a factor of 2 at every iteration. Similarly, $Δ$-Secant reduces the $x^*$-gap by at least a factor of 2 every second function query. Moreover, the $y^*$-gap $Δ^y$ (the $y$ coordinates of $Δ$) provides a refined stopping criterion, which can also be used with other algorithms. Experiments on a few convex functions confirm that our algorithms are always faster than their quasiconvex counterparts, often by more than a factor of 2. We further design a quasi-exact line search algorithm based on $Δ$-Secant. It can be used with gradient descent as a replacement for backtracking line search, whose parameters can be finicky to tune; we provide examples to this effect on strongly convex and smooth functions. We provide convergence guarantees, and confirm the efficiency of quasi-exact line search on a few single- and multivariate convex functions.
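For readers less familiar with the two quasiconvex baselines named above, the following is a minimal textbook sketch of bisection search (which queries the sign of a gradient or subgradient) and golden-section search (which queries only function values). It is not the proposed $Δ$-Bisection or $Δ$-Secant, whose details are not given in this abstract. The function names, the tolerance `tol`, and the piecewise-linear test function are illustrative assumptions.

```python
import math


def bisection_search(grad, lo, hi, tol=1e-8):
    """Classic bisection: halve [lo, hi] using the sign of a (sub)gradient."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g = grad(mid)
        if g > 0:        # function increasing at mid: minimizer lies to the left
            hi = mid
        elif g < 0:      # function decreasing at mid: minimizer lies to the right
            lo = mid
        else:            # zero (sub)gradient: mid is a minimizer
            return mid
    return 0.5 * (lo + hi)


def golden_section_search(f, lo, hi, tol=1e-8):
    """Classic golden-section search: one new function query per iteration."""
    invphi = (math.sqrt(5) - 1) / 2          # 1 / golden ratio, about 0.618
    x1 = hi - invphi * (hi - lo)
    x2 = lo + invphi * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 <= f2:
            # A minimizer lies in [lo, x2]; the old x1 becomes the new x2.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - invphi * (hi - lo)
            f1 = f(x1)
        else:
            # A minimizer lies in [x1, hi]; the old x2 becomes the new x1.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + invphi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)


# Hypothetical test function: a piecewise-linear convex function, minimized at x = 1.
def f(x):
    return max(-x, 2.0 * x - 3.0)


def subgrad(x):
    # A valid subgradient of f: slope of the active linear piece.
    return -1.0 if -x > 2.0 * x - 3.0 else 2.0


x_bisect = bisection_search(subgrad, -4.0, 5.0)
x_golden = golden_section_search(f, -4.0, 5.0)
```

Both routines shrink only the bracketing $x$ interval and stop on its width. Neither exploits convexity, which is the gap the abstract says $Δ$-Bisection and $Δ$-Secant are designed to fill.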