Automatic differentiation of nonsmooth iterative algorithms
This work provides foundational insights for differentiable programming in machine learning, particularly for optimization problems involving nonsmooth functions, though it is incremental in extending existing smooth AD theory.
The paper addresses the lack of theory for automatic differentiation (AD) of nonsmooth iterative algorithms. It characterizes the attractor set of the derivative iterations as a set-valued fixed point within the framework of conservative derivatives, which implies almost everywhere convergence of classical derivatives.
Differentiation along algorithms, i.e., piggyback propagation of derivatives, is now routinely used to differentiate iterative solvers in differentiable programming. The asymptotics are well understood for many smooth problems, but the nondifferentiable case has hardly been considered. Is there a limiting object for nonsmooth piggyback automatic differentiation (AD)? Does it have any variational meaning, and can it be used effectively in machine learning? Is there a connection with classical derivatives? All these questions are addressed under appropriate nonexpansivity conditions in the framework of conservative derivatives, which has proved useful in understanding nonsmooth AD. We characterize the attractor set of nonsmooth piggyback iterations as a set-valued fixed point which remains in the conservative framework. This has various consequences, in particular almost everywhere convergence of classical derivatives. Our results are illustrated on parametric convex optimization problems with the forward-backward, Douglas-Rachford and Alternating Direction Method of Multipliers (ADMM) algorithms, as well as the Heavy-Ball method.
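As a rough illustration of piggyback propagation through a nonsmooth solver, the sketch below runs forward-backward iterations on a scalar lasso-type problem, min over x of 0.5*a*x^2 - theta*x + lam*|x|, and propagates the derivative of the iterate with respect to theta alongside the iterate itself. The problem, the function names, the step size gamma, and the choice of 0 as the derivative of soft-thresholding at its kinks are all assumptions made for this example and are not taken from the paper; the kink value is only one possible selection of a generalized derivative.

```python
def soft_threshold(u, tau):
    """Proximal operator of tau * |x| (nonsmooth at u = +/- tau)."""
    if u > tau:
        return u - tau
    if u < -tau:
        return u + tau
    return 0.0


def soft_threshold_deriv(u, tau):
    """One selection of a generalized derivative of soft_threshold.

    Assumption for this sketch: we pick 0 at the kinks |u| == tau.
    """
    return 1.0 if abs(u) > tau else 0.0


def piggyback_forward_backward(theta, a=2.0, lam=1.0, gamma=0.25, iters=200):
    """Forward-backward on 0.5*a*x^2 - theta*x + lam*|x|, with the
    derivative J = dx/dtheta propagated alongside the iterate x.

    gamma < 2/a keeps the gradient step nonexpansive (illustrative values).
    """
    x, J = 0.0, 0.0
    for _ in range(iters):
        # Gradient (forward) step on the smooth part and its derivative.
        u = x - gamma * (a * x - theta)
        du = (1.0 - gamma * a) * J + gamma
        # Proximal (backward) step; chain rule through the nonsmooth map.
        x = soft_threshold(u, gamma * lam)
        J = soft_threshold_deriv(u, gamma * lam) * du
    return x, J


# Active regime (|theta| > lam): x* = (theta - lam) / a, derivative 1 / a.
x_active, J_active = piggyback_forward_backward(3.0)
# Inactive regime (|theta| < lam): x* = 0, derivative 0.
x_zero, J_zero = piggyback_forward_backward(0.5)
```

In this toy setting the closed-form solution is soft_threshold(theta, lam) / a, so the propagated derivative can be compared with 1/a when |theta| > lam and with 0 when |theta| < lam. At |theta| = lam the solution map is not differentiable, which is the kind of point excluded by an almost everywhere statement.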