Classes of ODE solutions: smoothness, covering numbers, implications for noisy function fitting, and the curse of smoothness phenomenon
This work provides a theoretical foundation for numerical methods used in recovering ODE solutions from data, particularly for researchers and practitioners relying on basis or kernel function approximations, by quantifying the inherent complexity and limitations due to solution smoothness.
This paper investigates the smoothness and covering numbers of ODE solution classes, revealing that even with bounded derivatives of the function $f$, the $(k+1)$th derivative of the solution can grow factorially fast in $k$, a phenomenon termed "curse of smoothness." It also establishes upper bounds for covering numbers of solution classes that are larger than standard smooth function classes and provides convergence rates for least squares fitting of noisy ODE solutions, showing a minimax optimal rate of $(\frac{1}{n})^{\frac{2(\beta+2)}{2(\beta+2)+1}}$ under certain conditions.
Many numerical methods for recovering ODE solutions from data rely on approximating the solutions using basis functions or kernel functions under a least square criterion. The accuracy of this approach hinges on the smoothness of the solutions. This paper provides a theoretical foundation for these methods by establishing novel results on the smoothness and covering numbers of ODE solution classes (as a measure of their "size"). Our results provide answers to "how do the degree of smoothness and the "size" of a class of ODEs affect the "size" of the associated class of solutions?" We show that: (1) for $y^{'}=f\left(y\right)$ and $y^{'}=f\left(x,\,y\right)$, if the absolute values of all $k$th ($k\leqβ+1$) order derivatives of $f$ are bounded by $1$, then the solution can end up with the $(k+1)$th derivative whose magnitude grows factorially fast in $k$ -- "a curse of smoothness"; (2) our upper bounds for the covering numbers of the $(β+2)-$degree smooth solution classes are greater than those of the "standard" $(β+2)-$degree smooth class of univariate functions; (3) the mean squared error of least squares fitting for noisy recovery has a convergence rate no larger than $\left(\frac{1}{n}\right)^{\frac{2\left(β+2\right)}{2\left(β+2\right)+1}}$ if $n=Ω\left(\left(β\sqrt{\log\left(β\vee1\right)}\right)^{4β+10}\right)$, and under this condition, the rate $\left(\frac{1}{n}\right)^{\frac{2\left(β+2\right)}{2\left(β+2\right)+1}}$ is minimax optimal in the case of $y^{'}=f\left(x,\,y\right)$; (4) more generally, for the higher order Picard type ODEs, $y^{\left(m\right)}=f\left(x,\,y,\,y^{'},\,...,y^{\left(m-1\right)}\right)$, the covering number of the solution class is bounded from above by the product of the covering number of the class $\mathcal{F}$ that $f$ ranges over and the covering number of the set where initial values lie.