LG OC | Oct 15, 2025

Convergence, design and training of continuous-time dropout as a random batch method

arXiv:2510.13134v14.1h-index: 2

Originality: Incremental advance

AI Analysis

This work addresses regularization and computational cost in continuous-time neural networks by analyzing dropout as a random-batch method, an incremental advance.

The paper frames dropout regularization in continuous-time models as a random-batch method. It establishes trajectory-wise convergence at a linear rate in the interval length h and distribution-level stability with total-variation error of order h^{1/2}. It validates the approach on classification and flow matching tasks, observing the predicted rates, regularization effects, and favorable runtime and memory profiles.

We study dropout regularization in continuous-time models through the lens of random-batch methods -- a family of stochastic sampling schemes originally devised to reduce the computational cost of interacting particle systems. We construct an unbiased, well-posed estimator that mimics dropout by sampling neuron batches over time intervals of length $h$. Trajectory-wise convergence is established with linear rate in $h$ for the expected uniform error. At the distribution level, we establish stability for the associated continuity equation, with total-variation error of order $h^{1/2}$ under mild moment assumptions. During training with fixed batch sampling across epochs, a Pontryagin-based adjoint analysis bounds deviations in the optimal cost and control, as well as in gradient-descent iterates. On the design side, we compare convergence rates for canonical batch sampling schemes, recover standard Bernoulli dropout as a special case, and derive a cost--accuracy trade-off yielding a closed-form optimal $h$. We then specialize to a single-layer neural ODE and validate the theory on classification and flow matching, observing the predicted rates, regularization effects, and favorable runtime and memory profiles.
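The sampling scheme in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single-layer field of the form W tanh(A x + b), a Bernoulli mask with keep probability `q` that is redrawn every interval of length `h`, rescaling by `1/q` so the masked field is unbiased, and forward Euler with step `dt`. All names (`masked_field`, `integrate`, `q`, `dt`) are hypothetical.

```python
import numpy as np


def masked_field(x, W, A, b, mask, q):
    """Single-layer vector field with a neuron mask, rescaled by 1/q.

    Assumed form: W @ (mask / q * tanh(A @ x + b)). For a Bernoulli(q) mask the
    expectation over the mask equals the full field W @ tanh(A @ x + b).
    """
    return W @ ((mask / q) * np.tanh(A @ x + b))


def integrate(x0, W, A, b, T, dt, h, q, rng):
    """Forward-Euler integration; a new mask is drawn every h units of time."""
    n_steps = int(round(T / dt))
    steps_per_batch = max(1, int(round(h / dt)))
    p = A.shape[0]  # number of neurons
    x = np.array(x0, dtype=float)
    mask = np.ones(p)
    for k in range(n_steps):
        if k % steps_per_batch == 0:
            # Bernoulli dropout mask held fixed on the current interval
            mask = (rng.random(p) < q).astype(float)
        x = x + dt * masked_field(x, W, A, b, mask, q)
    return x


def mean_error(h, n_samples=50, seed=0):
    """Monte Carlo estimate of the expected terminal error for a given h."""
    rng = np.random.default_rng(seed)
    d, p = 2, 16
    W = rng.normal(size=(d, p)) / p
    A = rng.normal(size=(p, d))
    b = rng.normal(size=p)
    x0 = np.ones(d)
    # q = 1 keeps every neuron, so this is the full (no-dropout) trajectory
    x_full = integrate(x0, W, A, b, T=1.0, dt=0.01, h=h, q=1.0, rng=rng)
    errs = [
        np.linalg.norm(
            integrate(x0, W, A, b, T=1.0, dt=0.01, h=h, q=0.5, rng=rng) - x_full
        )
        for _ in range(n_samples)
    ]
    return float(np.mean(errs))
```

Calling `mean_error(h)` for a decreasing sequence such as `h = 0.2, 0.1, 0.05` gives a rough empirical view of how the error shrinks with `h`. The sketch measures only the terminal error of sampled trajectories, not the expected uniform error analyzed in the paper, and the Euler step adds its own discretization error.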
