LG | CC | AP | NA | Aug 11, 2023

Size Lowerbounds for Deep Operator Networks

arXiv:2308.06338v3 | 7 citations | h-index: 6
Originality: Incremental advance
AI Analysis

This work provides foundational theoretical insights into the scalability of DeepONets for solving PDEs, which is crucial for researchers and practitioners in scientific machine learning, though it is incremental as it builds on existing DeepONet paradigms.

The paper asks how large a Deep Operator Network (DeepONet) must be to achieve low empirical error on noisy data. It establishes a data-dependent lower bound: for n data points, the common output dimension of the branch and trunk nets must scale as Ω(n^(1/4)). It also shows experimentally, on an advection-diffusion-reaction PDE, that at a fixed model size, lowering training error by increasing this dimension may require the training data to grow at least quadratically with it.
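To make the two scaling rates concrete, the following minimal sketch evaluates them numerically. The function names and the constant `c` are hypothetical placeholders for illustration; the source states only the rates Ω(n^(1/4)) and "at least quadratic", not any constants.

```python
import math


def min_common_dim(n, c=1.0):
    """Lower-bound rate c * n**(1/4) on the common output dimension.

    c is a hypothetical constant; only the n**(1/4) rate comes from the source.
    """
    return c * math.sqrt(math.sqrt(n))


def data_for_dim(q, c=1.0):
    """Training-set size c * q**2 under the quadratic scaling suggested experimentally.

    c is again a hypothetical constant chosen for illustration.
    """
    return c * q ** 2


# Growing the data set 16-fold doubles the n**(1/4) rate.
for n in (10_000, 160_000):
    print(n, min_common_dim(n))
```

With `c = 1`, a 16-fold increase in n doubles the n^(1/4) rate, while doubling the common output dimension quadruples the q^2 data requirement.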

Deep Operator Networks are an increasingly popular paradigm for solving regression in infinite dimensions and hence for solving families of PDEs in one shot. In this work, we aim to establish a first-of-its-kind data-dependent lower bound on the size of DeepONets required for them to be able to reduce empirical error on noisy data. In particular, we show that for low training errors to be obtained on $n$ data points, it is necessary that the common output dimension of the branch and the trunk net scale as $\Omega\left(\sqrt[4]{n}\right)$. This inspires our experiments with DeepONets solving the advection-diffusion-reaction PDE, where we demonstrate the possibility that, at a fixed model size, to leverage an increase in this common output dimension and obtain a monotonic lowering of training error, the size of the training data might need to scale at least quadratically with it.
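For readers unfamiliar with the architecture, the "common output dimension" is the shared width q of the branch and trunk outputs, whose inner product gives the prediction. The sketch below is a minimal NumPy illustration of a standard unstacked DeepONet forward pass, not the paper's implementation; the layer widths, the tanh activation, the random initialisation, and all function names are assumptions made for the example.

```python
import numpy as np


def init_mlp(rng, sizes):
    """Random weights for a small fully connected net (illustrative initialisation)."""
    return [
        (rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, k)), np.zeros(k))
        for m, k in zip(sizes[:-1], sizes[1:])
    ]


def mlp(params, x):
    """Apply the net: tanh on hidden layers, linear final layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x


def deeponet(branch, trunk, u_sensors, y):
    """DeepONet prediction G(u)(y) as an inner product of two q-dimensional outputs.

    u_sensors: (batch, m) input function sampled at m sensor points.
    y:         (batch, d) query locations.
    """
    b = mlp(branch, u_sensors)      # (batch, q) branch features
    t = mlp(trunk, y)               # (batch, q) trunk features
    return np.sum(b * t, axis=-1)   # (batch,) one scalar per (u, y) pair


rng = np.random.default_rng(0)
m, d, q, n = 50, 2, 16, 8           # sensors, query dim, common output dim, batch size
branch = init_mlp(rng, [m, 32, q])  # both nets must end in the same width q
trunk = init_mlp(rng, [d, 32, q])
u = rng.normal(size=(n, m))
y = rng.uniform(size=(n, d))
pred = deeponet(branch, trunk, u, y)
print(pred.shape)
```

Here `q` is the quantity the lower bound constrains: both nets must end in width `q` for the inner product to be defined.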

