LG · AI · Jun 17, 2025

ResNets Are Deeper Than You Think

arXiv:2506.14386v1 · 2 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This work targets researchers seeking a fundamental understanding of residual connections in neural networks, and suggests that these connections provide a deeper inductive bias rather than only optimization benefits.

The paper investigates whether residual networks (ResNets) offer advantages beyond improved trainability by comparing them to feedforward networks in a controlled post-training setting. It finds that ResNet-like variable-depth architectures consistently outperform fixed-depth networks, even when optimization is unlikely to make a difference.

Residual connections remain ubiquitous in modern neural network architectures nearly a decade after their introduction. Their widespread adoption is often credited to their dramatically improved trainability: residual networks train faster, more stably, and achieve higher accuracy than their feedforward counterparts. While numerous techniques, ranging from improved initialization to advanced learning rate schedules, have been proposed to close the performance gap between residual and feedforward networks, this gap has persisted. In this work, we propose an alternative explanation: residual networks do not merely reparameterize feedforward networks, but instead inhabit a different function space. We design a controlled post-training comparison to isolate generalization performance from trainability; we find that variable-depth architectures, similar to ResNets, consistently outperform fixed-depth networks, even when optimization is unlikely to make a difference. These results suggest that residual connections confer performance advantages beyond optimization, pointing instead to a deeper inductive bias aligned with the structure of natural data.
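The contrast between "fixed-depth" and "variable-depth" in the abstract can be made concrete with a toy example. The sketch below is an illustration only, not the paper's experimental setup: it assumes purely linear residual branches (so the unrolling is exact), and the function names and the small matrices W1, W2 are hypothetical. With nonlinear branches the sum-over-paths picture is only a loose analogy.

```python
# Toy illustration (assumed setup, not taken from the paper): linear branches only.

def matvec(a, x):
    """Multiply matrix a (list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

def add(u, v):
    """Element-wise sum of two vectors."""
    return [ui + vi for ui, vi in zip(u, v)]

def feedforward(x, weights):
    """Fixed-depth stack: the input passes through every layer in turn."""
    for w in weights:
        x = matvec(w, x)
    return x

def residual(x, weights):
    """Residual stack: each block adds its branch output to the identity path."""
    for w in weights:
        x = add(x, matvec(w, x))
    return x

def path_expansion(x, w1, w2):
    """Two linear residual blocks unrolled: (I + W2)(I + W1) = I + W1 + W2 + W2 W1."""
    depth0 = x                               # skips both branches
    depth1_a = matvec(w1, x)                 # uses only the first branch
    depth1_b = matvec(w2, x)                 # uses only the second branch
    depth2 = matvec(w2, matvec(w1, x))       # uses both branches
    return add(add(depth0, depth1_a), add(depth1_b, depth2))

# Hypothetical integer weights and input, chosen so the arithmetic is exact.
W1 = [[1, 2], [0, 1]]
W2 = [[0, 1], [3, 0]]
X = [1, -1]

ff_out = feedforward(X, [W1, W2])       # only the depth-2 path
res_out = residual(X, [W1, W2])         # mixture of depths 0, 1, and 2
paths_out = path_expansion(X, W1, W2)   # same value as res_out
```

In this linear toy case the feedforward output equals the single depth-2 term of the residual expansion, while the residual output also includes the shallower paths. That is one way to picture the abstract's claim that residual networks compute a different family of functions rather than a reparameterized feedforward network.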

