Data Augmentation vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting
This work addresses a theoretical gap in exploiting symmetry for non-stationary dynamics forecasting. The contribution is incremental in that it builds on prior theories developed for the i.i.d. setting.
The paper studies how data augmentation and equivariant networks improve generalization in deep learning for dynamics forecasting, and derives generalization bounds that characterize both effects in a unified framework.
Exploiting symmetry in dynamical systems is a powerful way to improve the generalization of deep learning. The model learns to be invariant to transformations and hence is more robust to distribution shift. Data augmentation and equivariant networks are two major approaches to injecting symmetry into learning. However, their exact role in improving generalization is not well understood. In this work, we derive generalization bounds for data augmentation and equivariant networks, characterizing their effect on learning in a unified framework. Unlike most prior theories for the i.i.d. setting, we focus on non-stationary dynamics forecasting with complex temporal dependencies.
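
To make the two approaches concrete, here is a minimal NumPy sketch of the distinction, not the paper's construction. It assumes a toy 2D state, the four-element rotation group C4 as the symmetry, and a one-step forecaster mapping a state to the next state. The names `rotation`, `GROUP`, `augment`, and `symmetrize` are hypothetical and introduced only for illustration. Data augmentation enlarges the training set with transformed input/target pairs, while the equivariant variant builds the symmetry into the model itself, here by group averaging (one simple way to obtain an equivariant map, chosen for brevity).

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumed symmetry group: C4, rotations by 0, 90, 180 and 270 degrees.
GROUP = [rotation(k * np.pi / 2) for k in range(4)]

def augment(X, Y):
    """Data augmentation: apply every group element to inputs and targets.

    X, Y have shape (n, 2) with one state per row; the result has 4n rows.
    """
    Xa = np.concatenate([X @ g.T for g in GROUP])
    Ya = np.concatenate([Y @ g.T for g in GROUP])
    return Xa, Ya

def symmetrize(model):
    """Equivariant model by group averaging: mean over g of g^{-1} model(g x).

    For row-vector states, applying g is `X @ g.T` and applying g^{-1} is `@ g`.
    """
    def equivariant_model(X):
        return np.mean([model(X @ g.T) @ g for g in GROUP], axis=0)
    return equivariant_model

# Toy rotation-equivariant dynamics: the next state is a slightly rotated state.
rng = np.random.default_rng(0)
A = rotation(0.1)
X = rng.normal(size=(8, 2))
Y = X @ A.T

# Approach 1: train any model on the augmented pairs (Xa, Ya).
Xa, Ya = augment(X, Y)

# Approach 2: wrap an arbitrary (here random, untrained) model so that it is
# equivariant by construction, whatever its weights are.
W = rng.normal(size=(2, 2))
f_eq = symmetrize(lambda Z: np.tanh(Z @ W))
```

In this sketch the augmented model is only encouraged to respect the symmetry through its training data, whereas `f_eq` satisfies `f_eq(g x) = g f_eq(x)` exactly for every `g` in `GROUP`, regardless of the weights.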