LG | OC | Dec 26, 2025

Why Smooth Stability Assumptions Fail for ReLU Learning

arXiv:2512.22055v1
AI Analysis

This work addresses a foundational issue in stability analysis for deep learning, highlighting the limitations of smooth approximations for ReLU networks and motivating nonsmooth-aware frameworks.

The paper demonstrates that smoothness assumptions, such as gradient Lipschitzness, fail globally for ReLU networks, even in empirically stable settings, by providing a concrete counterexample and identifying a minimal condition to restore stability.

Stability analyses of modern learning systems are frequently derived under smoothness assumptions that are violated by ReLU-type nonlinearities. In this note, we isolate a minimal obstruction by showing that no uniform smoothness-based stability proxy, such as gradient Lipschitzness or Hessian control, can hold globally for ReLU networks, even in simple settings where training trajectories appear empirically stable. We give a concrete counterexample demonstrating the failure of classical stability bounds and identify a minimal generalized derivative condition under which stability statements can be meaningfully restored. The result clarifies why smooth approximations of ReLU can be misleading and motivates nonsmooth-aware stability frameworks.
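The failure of gradient Lipschitzness can be seen in a one-parameter toy problem. The sketch below is an illustrative assumption, not the counterexample from the paper: it uses a hypothetical single-neuron loss f(w) = 0.5 * (relu(w * x) - y)^2 with x = 1 and y = 1, and the helper names (`loss`, `grad`, `lipschitz_ratio`) are invented for this example. For w > 0 the derivative is w - 1, while for w < 0 it is 0, so the derivative jumps by about 1 across the kink at w = 0. A Lipschitz gradient would need |f'(a) - f'(b)| <= L * |a - b| for one fixed L, but the ratio computed here grows without bound as the two points approach 0 from opposite sides.

```python
def relu(z):
    """Standard ReLU nonlinearity."""
    return max(z, 0.0)


def loss(w, x=1.0, y=1.0):
    """Squared loss of a single ReLU neuron with scalar weight w (toy setup)."""
    return 0.5 * (relu(w * x) - y) ** 2


def grad(w, x=1.0, y=1.0):
    """Derivative of loss with respect to w away from the kink.

    At w * x == 0 the loss is not differentiable; returning 0.0 there is a
    convention chosen for this sketch, not a claim about the true derivative.
    """
    if w * x > 0:
        return (w * x - y) * x
    return 0.0


def lipschitz_ratio(eps):
    """|f'(eps) - f'(-eps)| / |eps - (-eps)| for two points straddling w = 0."""
    return abs(grad(eps) - grad(-eps)) / (2.0 * eps)


# Shrink the gap between the two evaluation points by a factor of 10 each step.
ratios = [lipschitz_ratio(10.0 ** -k) for k in range(1, 7)]

if __name__ == "__main__":
    for k, r in enumerate(ratios, start=1):
        print(f"eps = 1e-{k}: ratio = {r:.1f}")
```

The ratio equals (1 - eps) / (2 * eps) in this setup, so no finite constant L bounds it near w = 0, even though the loss itself is continuous and gradient descent started at any w > 0 behaves well. This mirrors, in miniature, the gap between empirical stability and global smoothness that the abstract describes.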

