OC Jun 29, 2016
A new convergence analysis and perturbation resilience of some accelerated proximal forward-backward algorithms with errors
Daniel Reem, Alvaro De Pierro
Many problems in science and engineering involve, as part of their solution process, the consideration of a separable function which is the sum of two convex functions, one of them possibly non-smooth. Recently, a few works have discussed inexact versions of several accelerated proximal methods aiming at solving this minimization problem. This paper shows that inexact versions of a method of Beck and Teboulle (FISTA) preserve, in a Hilbert space setting, the same (non-asymptotic) rate of convergence under some assumptions on the decay rate of the error terms. The notion of inexactness discussed here seems to be rather simple, but, interestingly, when compared with related works, closely related decay rates of the error terms yield closely related convergence rates. The derivation sheds some light on the somewhat mysterious origin of some parameters which appear in various accelerated methods. A consequence of the analysis is that the accelerated method is perturbation resilient, making it suitable, in principle, for the superiorization methodology. By taking this into account, we re-examine the superiorization methodology and significantly extend its scope.
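For orientation, here is a minimal finite-dimensional sketch of the kind of iteration the abstract refers to: the FISTA scheme of Beck and Teboulle with a constant step size, applied to a toy problem. The toy objective (a diagonal quadratic plus an l1 term), the function names, and the additive perturbation `noise_scale / (k + 1) ** 3` are all assumptions made for illustration; in particular, the perturbation is a hypothetical error model and is not claimed to match the paper's notion of inexactness or its Hilbert space setting.

```python
import math


def soft_threshold(v, tau):
    """Proximal map of tau * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def objective(x, a, b, lam):
    """F(x) = 0.5 * sum_i a_i * (x_i - b_i)^2 + lam * ||x||_1."""
    smooth = 0.5 * sum(ai * (xi - bi) ** 2 for ai, xi, bi in zip(a, x, b))
    return smooth + lam * sum(abs(xi) for xi in x)


def fista(a, b, lam, iterations, noise_scale=0.0):
    """FISTA with constant step 1/L, started at the origin.

    noise_scale > 0 adds a decaying perturbation to every iterate
    (a hypothetical error model, used only for illustration).
    """
    L = max(a)  # Lipschitz constant of the gradient of the smooth term
    n = len(b)
    x_prev = [0.0] * n
    y = [0.0] * n
    t = 1.0
    for k in range(iterations):
        err = noise_scale / (k + 1) ** 3  # assumed summable error decay
        # Forward (gradient) step on the smooth term, backward (prox) step on the l1 term.
        x = [soft_threshold(yi - ai * (yi - bi) / L, lam / L) + err
             for ai, yi, bi in zip(a, y, b)]
        # Momentum parameters of Beck and Teboulle.
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        beta = (t - 1.0) / t_next
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev, t = x, t_next
    return x_prev


a, b, lam = [1.0, 4.0, 10.0], [3.0, -2.0, 0.5], 1.0
# The toy problem is separable, so its minimizer is known in closed form.
x_star = [soft_threshold(bi, lam / ai) for ai, bi in zip(a, b)]
x_exact = fista(a, b, lam, 300)
x_noisy = fista(a, b, lam, 300, noise_scale=1e-3)
print(objective(x_exact, a, b, lam) - objective(x_star, a, b, lam))
print(objective(x_noisy, a, b, lam) - objective(x_star, a, b, lam))
```

The two printed values are the objective gaps of the exact and the perturbed runs; comparing them for different decay exponents is one simple way to experiment with the error-decay assumptions the abstract mentions.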
OC Apr 19, 2018
A telescoping Bregmanian proximal gradient method without the global Lipschitz continuity assumption
Daniel Reem, Simeon Reich, Alvaro De Pierro
The problem of minimizing the sum of two convex functions has various theoretical and real-world applications. One of the popular methods for solving this problem is the proximal gradient method (proximal forward-backward algorithm). A very common assumption in the use of this method is that the gradient of the smooth term is globally Lipschitz continuous. However, this assumption is not always satisfied in practice, which limits the applicability of the method. In this paper, we discuss, in a wide class of finite and infinite-dimensional spaces, a new variant of the proximal gradient method which does not impose the above-mentioned global Lipschitz continuity assumption. A key contribution of the method is the dependence of the iterative steps on a certain telescopic decomposition of the constraint set into subsets. Moreover, we use a Bregman divergence in the proximal forward-backward operation. Under certain practical conditions, a non-asymptotic rate of convergence (that is, in the function values) is established, as well as the weak convergence of the whole sequence to a minimizer. We also obtain a few auxiliary results of independent interest.
OC Mar 1, 2018
Re-examination of Bregman functions and new properties of their divergences
Daniel Reem, Simeon Reich, Alvaro De Pierro
The Bregman divergence (Bregman distance, Bregman measure of distance) is a certain useful substitute for a distance, obtained from a well-chosen function (the "Bregman function"). Bregman functions and divergences have been extensively investigated over the last decades and have found applications in optimization, operations research, information theory, nonlinear analysis, machine learning and more. This paper re-examines various aspects related to the theory of Bregman functions and divergences. In particular, it presents many sufficient conditions which allow the construction of Bregman functions in a general setting and introduces new Bregman functions (such as a negative iterated log entropy). Moreover, it sheds new light on several known Bregman functions such as quadratic entropies, the negative Havrda-Charvát-Tsallis entropy, and the negative Boltzmann-Gibbs-Shannon entropy, and it shows that the negative Burg entropy, which is not a Bregman function according to the classical theory but nevertheless is known to have "Bregmanian properties", can, by our re-examination of the theory, be considered as a Bregman function. Our analysis yields several by-products of independent interest such as the introduction of the concept of relative uniform convexity (a certain generalization of uniform convexity), new properties of uniformly and strongly convex functions, and results in Banach space theory.
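As a small illustration of how a divergence is "obtained from" a function, the sketch below uses the standard finite-dimensional definition D_f(x, y) = f(x) - f(y) - <grad f(y), x - y> for a differentiable convex f. The helper names are hypothetical, and the sketch ignores the domain and regularity conditions that the paper's general setting is concerned with. It checks two well-known cases: half the squared Euclidean norm, whose divergence is half the squared distance, and the negative Boltzmann-Gibbs-Shannon entropy on positive vectors, whose divergence is the generalized Kullback-Leibler divergence.

```python
import math


def bregman_divergence(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y> for vectors given as lists."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_f(y), x, y))
    return f(x) - f(y) - inner


def half_sq_norm(x):
    """f(x) = 0.5 * ||x||^2."""
    return 0.5 * sum(xi * xi for xi in x)


def half_sq_norm_grad(x):
    return list(x)


def neg_shannon_entropy(x):
    """f(x) = sum_i x_i * log(x_i); requires strictly positive entries."""
    return sum(xi * math.log(xi) for xi in x)


def neg_shannon_entropy_grad(x):
    return [math.log(xi) + 1.0 for xi in x]


x, y = [0.2, 0.8], [0.5, 0.5]

d_quad = bregman_divergence(half_sq_norm, half_sq_norm_grad, x, y)
d_ent = bregman_divergence(neg_shannon_entropy, neg_shannon_entropy_grad, x, y)

# Closed forms used for comparison.
half_sq_dist = 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
gen_kl = sum(xi * math.log(xi / yi) - xi + yi for xi, yi in zip(x, y))

print(math.isclose(d_quad, half_sq_dist), math.isclose(d_ent, gen_kl))
```

Note that the divergence is generally not symmetric in its two arguments, which is one reason it is described as a substitute for a distance rather than a metric.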