LG OC Nov 7, 2024

Solving Hidden Monotone Variational Inequalities with Surrogate Losses

Ryan D'Orazio, Danilo Vucetic, Zichu Liu, Junhyung Lyle Kim, Ioannis Mitliagkas, Gauthier Gidel

arXiv:2411.05228v37.92 citations h-index: 31 ICLR

Originality: Incremental advance

AI Analysis

This work addresses practical challenges in applying deep learning to problems that are not loss minimization, and it offers a unifying method for domains such as reinforcement learning, though it appears incremental in that it builds on existing VI techniques.

The paper tackles the challenge of solving variational inequality (VI) problems, which arise in applications like min-max optimization and minimizing projected Bellman error, by proposing a surrogate-based approach that guarantees convergence under practical assumptions and is compatible with deep learning optimizers like ADAM.
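For reference, the textbook statement of a VI problem can be written as follows. The symbols $\mathcal{Z}$ (feasible set), $F$ (operator), and $f$ (min-max objective) are notation assumed here for illustration and are not taken from this page.

```latex
% Variational inequality: find a point z^* in the feasible set such that
\text{find } z^\ast \in \mathcal{Z} \quad \text{such that} \quad
\langle F(z^\ast),\, z - z^\ast \rangle \ge 0 \quad \text{for all } z \in \mathcal{Z}.

% Min-max optimization \min_x \max_y f(x, y) fits this template with the operator
F(x, y) = \bigl( \nabla_x f(x, y),\; -\nabla_y f(x, y) \bigr).
```

When $F$ is the gradient of a scalar loss, this reduces to the first-order optimality condition of loss minimization; for min-max problems, $F$ is in general not the gradient of any single scalar function.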

Deep learning has proven to be effective in a wide variety of loss minimization problems. However, many applications of interest, like minimizing projected Bellman error and min-max optimization, cannot be modelled as minimizing a scalar loss function but instead correspond to solving a variational inequality (VI) problem. This difference in setting has caused many practical challenges, as naive gradient-based approaches from supervised learning tend to diverge and cycle in the VI case. In this work, we propose a principled surrogate-based approach compatible with deep learning to solve VIs. We show that our surrogate-based approach has three main benefits: (1) under assumptions that are realistic in practice (hidden monotone structure, interpolation, and sufficient optimization of the surrogates), it guarantees convergence, (2) it provides a unifying perspective of existing methods, and (3) it is amenable to existing deep learning optimizers like ADAM. Experimentally, we demonstrate that our surrogate-based approach is effective in min-max optimization and minimizing projected Bellman error. Furthermore, in the deep reinforcement learning case, we propose a novel variant of TD(0) which is more compute- and sample-efficient.
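The divergence of naive gradient methods mentioned in the abstract can be seen on a toy problem. The sketch below is not from the paper: it assumes the bilinear game min over x, max over y of f(x, y) = x * y, whose solution is (0, 0), and runs plain simultaneous gradient descent-ascent. The function name `gda_distances`, the step size, and the starting point are hypothetical choices made for illustration.

```python
import math


def gda_distances(step_size=0.1, num_steps=100, x0=1.0, y0=1.0):
    """Run simultaneous gradient descent-ascent on f(x, y) = x * y.

    Illustrative toy example (an assumption, not the paper's experiment).
    Returns the distance of the iterate from the solution (0, 0),
    recorded before the first step and after every step.
    """
    x, y = x0, y0
    distances = [math.hypot(x, y)]
    for _ in range(num_steps):
        grad_x = y  # df/dx for f(x, y) = x * y
        grad_y = x  # df/dy for f(x, y) = x * y
        # x descends on f while y ascends on f, both using the old iterate.
        x, y = x - step_size * grad_x, y + step_size * grad_y
        distances.append(math.hypot(x, y))
    return distances


distances = gda_distances()
print(f"start: {distances[0]:.3f}, end: {distances[-1]:.3f}")
```

On this game each update multiplies the squared distance to the solution by exactly 1 + step_size**2, so the iterates rotate around (0, 0) and spiral outward for any positive step size, rather than converging as gradient descent would on a convex loss.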
