LG, ML | Sep 16, 2025

Reversible Deep Equilibrium Models

Sam McCallum, Kamran Arora, James Foster

arXiv:2509.12917v1 | 13.06 citations | h-index: 2

Originality: Highly original

AI Analysis

This work addresses a key bottleneck in implicit models, offering machine learning researchers and practitioners a more stable and efficient alternative to existing DEQs.

The paper tackles approximate gradient calculation in Deep Equilibrium Models (DEQs), which causes unstable training and high computational cost. It introduces Reversible Deep Equilibrium Models (RevDEQs), which enable exact gradient calculation, remove the need for regularization, and reduce the number of function evaluations. The authors report state-of-the-art performance on language modeling and image classification tasks against comparable implicit and explicit models.

Deep Equilibrium Models (DEQs) are an interesting class of implicit model where the model output is implicitly defined as the fixed point of a learned function. These models have been shown to outperform explicit (fixed-depth) models in large-scale tasks by trading many deep layers for a single layer that is iterated many times. However, gradient calculation through DEQs is approximate. This often leads to unstable training dynamics and requires regularisation or many function evaluations to fix. Here, we introduce Reversible Deep Equilibrium Models (RevDEQs) that allow for exact gradient calculation, no regularisation and far fewer function evaluations than DEQs. We show that RevDEQs achieve state-of-the-art performance on language modelling and image classification tasks against comparable implicit and explicit models.
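
To make the fixed-point idea in the abstract concrete, here is a minimal sketch of a forward pass that iterates a single layer until its output stops changing. It is an illustration only, not the authors' RevDEQ method: the one-unit `layer`, its weight `w`, and the function name `deq_forward` are hypothetical, and the naive iteration converges here only because `|w| < 1` makes the layer a contraction.

```python
import math

def layer(z, x, w=0.5):
    """Hypothetical one-unit layer; contractive in z because |w| < 1."""
    return math.tanh(w * z + x)

def deq_forward(x, tol=1e-10, max_steps=1000):
    """Iterate z <- layer(z, x) from z = 0 until successive iterates differ by less than tol."""
    z = 0.0
    for step in range(1, max_steps + 1):
        z_next = layer(z, x)
        if abs(z_next - z) < tol:
            return z_next, step
        z = z_next
    return z, max_steps

# The output is defined implicitly: z_star satisfies z_star = layer(z_star, x) up to tol.
z_star, steps = deq_forward(x=0.3)
residual = abs(layer(z_star, 0.3) - z_star)
```

The number of iterations plays the role of depth: a tighter tolerance costs more function evaluations, which is the trade-off the abstract refers to.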
