DS / LG / NE, May 12, 2025

The Influence of the Memory Capacity of Neural DDEs on the Universal Approximation Property

arXiv:2505.07244v2 | 1 citation | h-index: 1
Originality: Incremental advance
AI Analysis

This theoretical work, aimed at machine learning researchers, studies the role of memory in neural networks. It gives incremental insight into the conditions under which Neural DDEs can approximate continuous functions.

The paper investigates how the memory capacity, measured as the product of the Lipschitz constant and delay (Kτ), affects the universal approximation property of Neural Delay Differential Equations (Neural DDEs). It shows that non-augmented Neural DDEs lack this property when Kτ is small but achieve it when Kτ is sufficiently large, with augmented architectures expanding the parameter regions for universal approximation.

Neural Ordinary Differential Equations (Neural ODEs), which are the continuous-time analog of Residual Neural Networks (ResNets), have gained significant attention in recent years. Similarly, Neural Delay Differential Equations (Neural DDEs) can be interpreted as an infinite-depth limit of Densely Connected Residual Neural Networks (DenseResNets). In contrast to traditional ResNet architectures, DenseResNets are feed-forward networks that allow for shortcut connections across all layers. These additional connections introduce memory into the network architecture, as is typical in many modern architectures. In this work, we explore how the memory capacity of neural DDEs influences the universal approximation property. The key parameter for studying the memory capacity is the product $Kτ$ of the Lipschitz constant $K$ and the delay $τ$ of the DDE. In the case of non-augmented architectures, where the network width is not larger than the input and output dimensions, neither neural ODEs nor classical feed-forward neural networks have the universal approximation property. We show that if the memory capacity $Kτ$ is sufficiently small, the dynamics of the neural DDE can be approximated by a neural ODE. Consequently, non-augmented neural DDEs with a small memory capacity also lack the universal approximation property. In contrast, if the memory capacity $Kτ$ is sufficiently large, we can establish the universal approximation property of neural DDEs for continuous functions. If the neural DDE architecture is augmented, we can expand the parameter regions in which universal approximation is possible. Overall, our results show that the infinite-dimensional phase space of DDEs with positive delay $τ>0$ is not by itself sufficient to guarantee universal approximation: there is no direct jump transition as soon as the delay becomes positive, and universal approximation holds only once the memory capacity $Kτ$ exceeds a certain threshold.
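
The role of $Kτ$ can be illustrated with a toy scalar example. The sketch below is not the paper's construction. It assumes a DDE of the form $x'(t) = f(x(t), x(t-τ))$ with constant history $x(t) = x_0$ for $t \le 0$, picks the hypothetical linear vector field $f(x, y) = -Ky$ (which is $K$-Lipschitz), and integrates it with forward Euler up to time $T = 1$. The names `euler_dde` and `vector_field` are illustrative. With $τ = 0$ the system is a scalar ODE, whose flow cannot reverse the order of initial values. With $τ = 1$ and $K = 2$, so $Kτ = 2$, the delayed term stays equal to $x_0$ on $[0, T]$ and the time-$T$ map becomes $x_0 \mapsto -x_0$.

```python
def euler_dde(f, x0, tau, T=1.0, n_steps=100):
    """Forward Euler for x'(t) = f(x(t), x(t - tau)).

    Assumptions of this sketch: constant history x(t) = x0 for t <= 0,
    and tau is (close to) an integer multiple of the step size.
    """
    h = T / n_steps
    delay_steps = round(tau / h)  # delay measured in grid steps
    xs = [x0]
    for i in range(n_steps):
        j = i - delay_steps
        # Before time tau has elapsed, the delayed state comes from the history.
        x_delayed = xs[j] if j > 0 else x0
        xs.append(xs[i] + h * f(xs[i], x_delayed))
    return xs[-1]


K = 2.0  # Lipschitz constant of the toy vector field below


def vector_field(x, x_delayed):
    # Hypothetical linear field that depends only on the delayed state.
    return -K * x_delayed


for tau in (0.0, 1.0):
    ends = [euler_dde(vector_field, x0, tau) for x0 in (-1.0, 1.0)]
    print(f"K*tau = {K * tau:.1f}: -1 -> {ends[0]:+.3f}, +1 -> {ends[1]:+.3f}")
```

For $τ = 0$ both end points keep their sign, so the input ordering is preserved. For $Kτ = 2$ the signs swap, which is a map that no non-augmented scalar ODE flow can realize. Each Euler step with a delayed argument reuses the output of an earlier step, which loosely mirrors the shortcut connections across layers in a DenseResNet.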
