DS Jul 24, 2018
Adjoint shadowing directions in hyperbolic systems for sensitivity analysis
Angxiu Ni
For hyperbolic diffeomorphisms, we define the adjoint shadowing direction as a bounded inhomogeneous adjoint solution whose initial condition has zero component in the unstable adjoint direction. For hyperbolic flows, we define the adjoint shadowing direction similarly, with the additional requirement that the average of its inner product with the trajectory direction is zero. In both cases, we show that the adjoint shadowing direction exists and is unique, and show how it can be used for adjoint sensitivity analysis. Our work sets a theoretical foundation for efficient adjoint sensitivity methods, such as NILSAS, for long-time-averaged objectives.
PR Apr 30
Path-Kernel Method for Differentiating Unstable Diffusions
Angxiu Ni
We derive and prove the path-kernel formula for the linear response (parameter-derivative of averaged statistics) of SDEs. The parameter may affect the drift coefficient, the diffusion coefficient, and the initial condition. The formula tempers the instability by gradually moving the derivative from path-perturbation to kernel-differentiation, without assuming hyperbolicity. We prove it by direct comparison of bundles of paths across different parameter values. We also derive a pathwise Monte Carlo algorithm for estimating linear responses and demonstrate it on the 40-dimensional noisy Lorenz-96 system. Our result provides a new computational tool for optimization, and has already led to a follow-up application to data assimilation.
DS Sep 4, 2025
Divergence-Kernel method for linear responses and diffusion models
Angxiu Ni
We derive the divergence-kernel formula for the linear response (parameter-derivative of marginal or stationary distributions) of random dynamical systems, and formally pass to the continuous-time limit. Our formula works for multiplicative and parameterized noise over any period of time; it does not require hyperbolicity. We then derive a pathwise Monte Carlo algorithm for linear responses. With this, we propose a forward-only diffusion generative model and test it on simple problems.
OC May 11, 2019
Linear Range in Gradient Descent
Angxiu Ni, Chaitanya Talnikar
This paper defines linear range as the range of parameter perturbations that lead to approximately linear perturbations in the states of a network. We compute linear range from the difference between actual perturbations in states and the tangent solution. Linear range is a new criterion for estimating the effectiveness of gradients and thus has many possible applications. In particular, we propose that the optimal learning rate at the initial stages of training is such that parameter changes on all minibatches are within linear range. We demonstrate our algorithm on two shallow neural networks and a ResNet.
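
The idea of comparing actual state perturbations with the tangent solution can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy one-layer tanh network, the relative-gap measure `nonlinearity`, the tolerance `tol`, and the bisection search in `linear_range` are all assumptions made here for illustration.

```python
import math

def states(w, x):
    """Hidden states of a toy one-layer tanh network: h_i = tanh(w_i * x)."""
    return [math.tanh(wi * x) for wi in w]

def tangent(w, x, dw):
    """First-order (tangent) change in the states for a parameter perturbation dw."""
    return [(1.0 - math.tanh(wi * x) ** 2) * x * dwi for wi, dwi in zip(w, dw)]

def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def nonlinearity(w, x, dw):
    """Relative gap between the actual state change and the tangent prediction."""
    h0 = states(w, x)
    h1 = states([wi + dwi for wi, dwi in zip(w, dw)], x)
    actual = [b - a for a, b in zip(h0, h1)]
    lin = tangent(w, x, dw)
    gap = [a - l for a, l in zip(actual, lin)]
    return norm(gap) / norm(lin)

def linear_range(w, x, direction, tol=0.1, s_max=10.0, iters=60):
    """Largest step size s in (0, s_max] with nonlinearity(s * direction) <= tol.

    Found by bisection, which assumes (hypothetically) that the relative gap
    grows monotonically with s along the chosen direction.
    """
    if nonlinearity(w, x, [s_max * d for d in direction]) <= tol:
        return s_max
    lo, hi = 0.0, s_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if nonlinearity(w, x, [mid * d for d in direction]) <= tol:
            lo = mid
        else:
            hi = mid
    return lo

# Toy setup (assumed values): two weights, scalar input, perturbation direction.
w = [0.5, 1.0]
x = 1.0
direction = [1.0, 1.0]
r = linear_range(w, x, direction, tol=0.1)
print("estimated linear range along direction:", r)
```

In a training setting, one could take `direction` to be the negative gradient on a minibatch, so that `r` bounds the step size for that minibatch; taking the minimum over minibatches would then give a learning rate for which all parameter changes stay within linear range, in the spirit of the proposal above.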