LG · Oct 30, 2024

Rethinking Deep Thinking: Stable Learning of Algorithms using Lipschitz Constraints

Jay Bear, Adam Prügel-Bennett, Jonathon Hare

arXiv:2410.23451v1 · 13.49 citations · h-index: 3 · NIPS

Originality: Incremental advance

AI Analysis

This paper addresses unstable training and unreliable solutions in learned iterative algorithms, a problem relevant to researchers in machine learning and optimization. The advance is incremental, since it builds on existing Deep Thinking methods.

The paper tackles the instability and lack of convergence guarantees in Deep Thinking networks for learning iterative algorithms. The resulting DT-L models use fewer parameters, give more reliable solutions, come with a guarantee of convergence to a unique solution at inference time, and extrapolate robustly to problems harder than those in the training set. They are also benchmarked on the traveling salesperson problem, an NP-hard problem where DT fails to learn.

Iterative algorithms solve problems by taking steps until a solution is reached. Models in the form of Deep Thinking (DT) networks have been demonstrated to learn iterative algorithms in a way that can scale to problems of different sizes at inference time using recurrent computation and convolutions. However, they are often unstable during training and offer no guarantee of convergence or termination at the solution. This paper addresses the problem of instability by analyzing the growth in intermediate representations, allowing us to build models, referred to as Deep Thinking with Lipschitz Constraints (DT-L), that have many fewer parameters and provide more reliable solutions. Additionally, our DT-L formulation provides guarantees of convergence of the learned iterative procedure to a unique solution at inference time. We demonstrate that DT-L is capable of robustly learning algorithms that extrapolate to harder problems than those in the training set. We benchmark on the traveling salesperson problem to evaluate the capabilities of the modified system on an NP-hard problem where DT fails to learn.
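
One standard way a convergence guarantee of this kind can arise is through a contraction argument: if the recurrent update has a Lipschitz constant below 1, repeated application converges to a single fixed point regardless of the starting state. The sketch below illustrates that idea only. It is not the authors' DT-L architecture. The toy update `tanh(W h + x)`, the use of the infinity norm, the bound of 0.9, and all function names are assumptions made for illustration.

```python
import math
import random


def inf_norm(W):
    """Induced infinity norm of a matrix: the largest absolute row sum."""
    return max(sum(abs(v) for v in row) for row in W)


def rescale_to_contraction(W, bound=0.9):
    """Scale W so that its infinity norm is at most `bound` (assumed < 1)."""
    norm = inf_norm(W)
    if norm <= bound:
        return [row[:] for row in W]
    scale = bound / norm
    return [[v * scale for v in row] for row in W]


def step(W, h, x):
    """One toy recurrent update h <- tanh(W h + x).

    tanh is 1-Lipschitz elementwise, so the whole update inherits
    the Lipschitz constant of W in the infinity norm.
    """
    return [
        math.tanh(sum(w * hj for w, hj in zip(row, h)) + xi)
        for row, xi in zip(W, x)
    ]


def iterate(W, h0, x, tol=1e-10, max_iters=1000):
    """Apply `step` until successive states differ by less than `tol`."""
    h = h0
    for k in range(1, max_iters + 1):
        h_next = step(W, h, x)
        if max(abs(a - b) for a, b in zip(h_next, h)) < tol:
            return h_next, k
        h = h_next
    return h, max_iters


random.seed(0)
n = 4
W = rescale_to_contraction(
    [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
)
x = [random.uniform(-1, 1) for _ in range(n)]

# Two very different initial states should reach the same fixed point.
h_a, iters_a = iterate(W, [0.0] * n, x)
h_b, iters_b = iterate(W, [5.0, -5.0, 5.0, -5.0], x)
gap = max(abs(a - b) for a, b in zip(h_a, h_b))

print("iterations:", iters_a, iters_b)
print("largest difference between the two fixed points:", gap)
```

Because the update shrinks distances by a factor of at most 0.9 per step, both runs terminate well within the iteration budget and agree up to the tolerance. Removing the rescaling step removes that guarantee, which loosely mirrors the instability the abstract attributes to unconstrained DT networks.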
