LG, Oct 25, 2022
Proximal Mean Field Learning in Shallow Neural Networks
Alexis Teter, Iman Nodozi, Abhishek Halder
We propose a custom learning algorithm for shallow over-parameterized neural networks, i.e., networks with a single hidden layer of infinite width. The infinite width of the hidden layer serves as an abstraction for the over-parameterization. Building on the recent mean field interpretations of learning dynamics in shallow neural networks, we realize mean field learning as a computational algorithm, rather than as an analytical tool. Specifically, we design a Sinkhorn regularized proximal algorithm to approximate the distributional flow for the learning dynamics over weighted point clouds. In this setting, a contractive fixed point recursion computes the time-varying weights, numerically realizing the interacting Wasserstein gradient flow of the parameter distribution supported over the neuronal ensemble. An appealing aspect of the proposed algorithm is that the measure-valued recursions allow meshless computation. We demonstrate the proposed computational framework of interacting weighted particle evolution on binary and multi-class classification. Our algorithm performs gradient descent of the free energy associated with the risk functional.
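To give a concrete sense of what a Sinkhorn-type fixed point recursion over weighted point clouds looks like, here is a minimal sketch of the textbook entropic optimal transport iteration in NumPy. It is not the authors' proximal algorithm, which additionally involves the free energy of the risk functional. The function name `sinkhorn_plan`, the regularization value `eps`, the squared Euclidean cost, and the random point clouds are all assumptions made for illustration.

```python
import numpy as np

def sinkhorn_plan(cost, mu, nu, eps=0.1, tol=1e-10, max_iter=10_000):
    """Generic entropic-OT Sinkhorn recursion (illustrative, hypothetical helper).

    cost : (m, n) array of pairwise costs between two point clouds
    mu   : (m,) positive weights on the first cloud, summing to 1
    nu   : (n,) positive weights on the second cloud, summing to 1
    """
    K = np.exp(-cost / eps)              # Gibbs kernel
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(max_iter):
        u_new = mu / (K @ v)             # rescale to match row marginals
        v_new = nu / (K.T @ u_new)       # rescale to match column marginals
        change = max(np.abs(u_new - u).max(), np.abs(v_new - v).max())
        u, v = u_new, v_new
        if change < tol:                 # fixed point reached numerically
            break
    return u[:, None] * K * v[None, :]   # coupling between the two clouds

# Two small weighted point clouds in the unit square (meshless: no grid needed).
rng = np.random.default_rng(0)
x = rng.random((5, 2))
y = rng.random((6, 2))
mu = np.full(5, 1 / 5)
nu = np.full(6, 1 / 6)
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
plan = sinkhorn_plan(cost, mu, nu)
```

The recursion only touches the weights attached to the sample points, which is the sense in which such computations can avoid a spatial mesh.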
OC, Mar 22, 2025
On the Hopf-Cole Transform for Control-affine Schrödinger Bridge
Alexis Teter, Abhishek Halder
The purpose of this note is to clarify the importance of the relation $\boldsymbol{gg}^{\top}\propto \boldsymbol{\sigma\sigma}^{\top}$ in solving control-affine Schrödinger bridge problems via the Hopf-Cole transform, where $\boldsymbol{g},\boldsymbol{\sigma}$ are the control and noise coefficients, respectively. We show that the Hopf-Cole transform applied to the conditions of optimality for generic control-affine Schrödinger bridge problems, i.e., without the assumption $\boldsymbol{gg}^{\top}\propto\boldsymbol{\sigma\sigma}^{\top}$, gives a pair of forward-backward PDEs that are neither linear nor equation-level decoupled. We explain how the resulting PDEs can be interpreted as nonlinear forward-backward advection-diffusion-reaction equations, where the nonlinearity stems from additional drift and reaction terms involving the gradient of the log-likelihood, a.k.a. the score. These additional drift and reaction terms vanish when $\boldsymbol{gg}^{\top}\propto\boldsymbol{\sigma\sigma}^{\top}$, and the resulting boundary-coupled system of linear PDEs can then be solved by dynamic Sinkhorn recursions. A key takeaway of our work is that the numerical solution of the generic control-affine Schrödinger bridge requires further algorithmic development, possibly generalizing the dynamic Sinkhorn recursion or otherwise.
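For orientation, the well-behaved special case can be written out explicitly. The following is a sketch under assumptions that are not stated in the abstract: $\boldsymbol{g}=\boldsymbol{\sigma}=\boldsymbol{I}$, controlled dynamics $\mathrm{d}\boldsymbol{x}=(\boldsymbol{f}+\boldsymbol{u})\,\mathrm{d}t+\sqrt{2\varepsilon}\,\mathrm{d}\boldsymbol{w}$ on $t\in[0,1]$, cost $\tfrac{1}{2}\mathbb{E}\int_0^1\|\boldsymbol{u}\|^2\,\mathrm{d}t$, and prescribed endpoint densities $\rho_0,\rho_1$. The symbols $\boldsymbol{f},\varepsilon,\rho,S,\varphi,\hat{\varphi}$ are illustrative choices and may differ from the authors' notation. With optimal control $\boldsymbol{u}=\nabla S$, the conditions of optimality are

$$
\frac{\partial \rho}{\partial t}+\nabla\cdot\big(\rho\,(\boldsymbol{f}+\nabla S)\big)=\varepsilon\Delta\rho,
\qquad
\frac{\partial S}{\partial t}+\frac{1}{2}\|\nabla S\|^{2}+\langle\nabla S,\boldsymbol{f}\rangle=-\varepsilon\Delta S .
$$

The Hopf-Cole transform $(\rho,S)\mapsto(\hat{\varphi},\varphi)$ defined by

$$
\varphi=\exp\!\left(\frac{S}{2\varepsilon}\right),
\qquad
\hat{\varphi}=\rho\,\exp\!\left(-\frac{S}{2\varepsilon}\right)
$$

then yields

$$
\frac{\partial \hat{\varphi}}{\partial t}=-\nabla\cdot(\hat{\varphi}\,\boldsymbol{f})+\varepsilon\Delta\hat{\varphi},
\qquad
\frac{\partial \varphi}{\partial t}=-\langle\nabla\varphi,\boldsymbol{f}\rangle-\varepsilon\Delta\varphi,
\qquad
\hat{\varphi}_0\varphi_0=\rho_0,\quad \hat{\varphi}_1\varphi_1=\rho_1 .
$$

In this special case both PDEs are linear and are coupled only through the endpoint conditions, with $\rho=\hat{\varphi}\varphi$ and $\nabla S=2\varepsilon\nabla\log\varphi$ recovering the optimal density and control.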