An axiomatized PDE model of deep neural networks
This work provides a mathematical foundation for interpreting deep neural networks, which may benefit researchers in theoretical ML.
The authors address the problem of understanding deep neural networks by formulating them as evolution operators and proving that these operators are determined by convection-diffusion equations. They show that the convection-diffusion model improves robustness and reduces Rademacher complexity, and they use it to design a new ResNet training method that is validated by experiments.
Inspired by the relation between deep neural networks (DNNs) and partial differential equations (PDEs), we study the general form of PDE models of deep neural networks. To this end, we formulate a DNN as an evolution operator acting on a simple base model. Under several reasonable assumptions, we prove that the evolution operator is determined by a convection-diffusion equation. This convection-diffusion model gives a mathematical explanation for several effective networks. Moreover, we show that the convection-diffusion model improves robustness and reduces the Rademacher complexity. Based on the convection-diffusion equation, we design a new training method for ResNets. Experiments validate the performance of the proposed method.
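To make the evolution-operator view concrete, the following is a minimal sketch and not the training method proposed in this work. It assumes the commonly used reading of a residual block as one forward Euler step of a flow (the convection part), with optional Gaussian noise added along the trajectory as the standard stochastic counterpart of a diffusion term. The names `residual_step` and `forward`, the tanh layer function, and the parameters `step_size` and `noise_scale` are all hypothetical choices made for illustration.

```python
import numpy as np


def residual_step(x, weight, step_size, noise_scale=0.0, rng=None):
    """One residual update x + h * f(x), optionally perturbed by Gaussian noise."""
    # Hypothetical layer function f(x) = tanh(x W); any smooth layer would do.
    drift = np.tanh(x @ weight)
    # Forward Euler step along the drift (the convection part).
    x_next = x + step_size * drift
    if noise_scale > 0.0:
        if rng is None:
            raise ValueError("rng is required when noise_scale > 0")
        # Euler-Maruyama noise term, scaled by sqrt(h), standing in for diffusion.
        noise = rng.standard_normal(x.shape)
        x_next = x_next + noise_scale * np.sqrt(step_size) * noise
    return x_next


def forward(x, weights, step_size, noise_scale=0.0, rng=None):
    """Compose residual steps; each weight matrix plays the role of one block."""
    if noise_scale > 0.0 and rng is None:
        rng = np.random.default_rng(0)  # assumed default seed, for reproducibility
    for weight in weights:
        x = residual_step(x, weight, step_size, noise_scale, rng)
    return x
```

With `noise_scale=0.0` the sketch reduces to a plain stack of residual blocks. A positive `noise_scale` perturbs the features at every step, which is one simple way to imitate a diffusion term during training.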