LG OC, Feb 23

Understanding the Curse of Unrolling

Sheheryar Mehmood, Florian Knoll, Peter Ochs

arXiv:2602.19733v11.4 h-index: 37

Originality: Incremental advance

AI Analysis

This paper addresses a specific bottleneck in hyperparameter optimization and meta-learning, namely unrolled derivative estimates that initially drift away from the true Jacobian, and offers machine learning practitioners a practical remedy.

The paper tackles the curse of unrolling, in which the derivative iterates produced by algorithm unrolling initially diverge from the true Jacobian. It shows that truncating the early iterations of the derivative computation mitigates the issue while reducing memory usage, and it supports these findings with numerical experiments.

Algorithm unrolling is ubiquitous in machine learning, particularly in hyperparameter optimization and meta-learning, where Jacobians of solution mappings are computed by differentiating through iterative algorithms. Although unrolling is known to yield asymptotically correct Jacobians under suitable conditions, recent work has shown that the derivative iterates may initially diverge from the true Jacobian, a phenomenon known as the curse of unrolling. In this work, we provide a non-asymptotic analysis that explains the origin of this behavior and identifies the algorithmic factors that govern it. We show that truncating early iterations of the derivative computation mitigates the curse while simultaneously reducing memory requirements. Finally, we demonstrate that warm-starting in bilevel optimization naturally induces an implicit form of truncation, providing a practical remedy. Our theoretical findings are supported by numerical experiments on representative examples.
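The unrolling and truncation ideas in the abstract can be illustrated with a minimal sketch. It is not the paper's code or experimental setup. It assumes a toy ridge-type quadratic, 0.5*x^T H x - g^T x + 0.5*theta*||x||^2, solved by gradient descent, with the derivative iterate J_k = dx_k/dtheta propagated alongside x_k. The function name `unroll_with_derivative`, the `skip` parameter (the number of early iterations left out of the derivative recursion), and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def unroll_with_derivative(H, g, theta, x0, step, n_iters, skip=0):
    """Gradient descent on 0.5*x^T H x - g^T x + 0.5*theta*||x||^2.

    Propagates J_k = d x_k / d theta together with x_k (forward-mode
    unrolling). The first `skip` iterations are treated as constant in
    theta, which is one simple (assumed) form of truncation.
    Returns the final x, the final J, and the list of J after each step.
    """
    M = H + theta * np.eye(len(x0))      # Hessian of the objective in x
    x = np.asarray(x0, dtype=float).copy()
    J = np.zeros_like(x)
    history = []
    for k in range(n_iters):
        if k >= skip:
            # Differentiate x_{k+1} = x_k - step * (M x_k - g) w.r.t. theta;
            # dM/dtheta = I contributes the extra term `x` (the old iterate).
            J = J - step * (M @ J + x)
        x = x - step * (M @ x - g)
        history.append(J.copy())
    return x, J, history

# Toy problem (illustrative values only).
H = np.diag([1.0, 10.0])
g = np.array([1.0, 1.0])
theta = 0.1
x0 = np.array([5.0, 5.0])
step = 0.09                               # below 2 / (largest eigenvalue of M)

# Closed-form solution and Jacobian: M x* = g, so dx*/dtheta = -M^{-1} x*.
M = H + theta * np.eye(2)
x_star = np.linalg.solve(M, g)
J_star = -np.linalg.solve(M, x_star)

# Compare the derivative error along the iterations with and without truncation.
for skip in (0, 20):
    _, _, hist = unroll_with_derivative(H, g, theta, x0, step, 60, skip)
    errs = [np.linalg.norm(J - J_star) for J in hist]
    print(f"skip={skip}:", np.round(errs[::10], 4))
```

Printing the error sequences for several `skip` values gives a rough view of how the early iterates, which are still far from the solution, feed into the derivative recursion. In a reverse-mode implementation, the skipped iterations would also not need to be stored, which is where the memory saving mentioned in the abstract would come from.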
