Fluctuation-dissipation Type Theorem in Stochastic Linear Learning
This paper provides a theoretical framework for analyzing stochastic gradient descent in linear models. The contribution is incremental: it extends classical physics concepts to machine learning.
The paper tackles the problem of understanding stochastic linear learning dynamics by deriving a generalized fluctuation-dissipation theorem (FDT) for these systems and verifying it on datasets such as MNIST, CIFAR-10, and EMNIST.
The fluctuation-dissipation theorem (FDT) is a simple yet powerful consequence of the first-order differential equation governing the dynamics of systems subject simultaneously to dissipative and stochastic forces. The linear learning dynamics, in which the input vector is mapped to the output vector by a matrix whose elements are learned, has a stochastic version that closely mimics Langevin dynamics when full-batch gradient descent is replaced by stochastic gradient descent. We derive a generalized FDT for the stochastic linear learning dynamics and verify its validity on well-known machine learning data sets such as MNIST, CIFAR-10, and EMNIST.
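
The analogy with Langevin dynamics can be illustrated with a minimal sketch. It is not the paper's derivation or its generalized FDT. It only shows that a mini-batch gradient can be split into the full-batch gradient (the dissipative drift) plus a noise term that averages to zero over one pass through the data. The squared-error loss, the synthetic Gaussian data, and the names `full_gradient` and `sgd_noise` are assumptions made for illustration.

```python
import numpy as np

def full_gradient(W, X, Y):
    """Gradient of 0.5 * mean ||W x - y||^2 with respect to W (assumed loss)."""
    residual = W @ X - Y                  # shape (n_out, n_samples)
    return residual @ X.T / X.shape[1]    # shape (n_out, n_in)

def sgd_noise(W, X, Y, batch_size, rng):
    """Mini-batch gradient minus full-batch gradient, for each batch in one epoch."""
    n = X.shape[1]
    perm = rng.permutation(n)             # random partition of the data set
    g_full = full_gradient(W, X, Y)
    noises = []
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        noises.append(full_gradient(W, X[:, idx], Y[:, idx]) - g_full)
    return np.stack(noises)               # shape (n_batches, n_out, n_in)

# Synthetic stand-in for a real data set (hypothetical sizes).
rng = np.random.default_rng(0)
n_in, n_out, n_samples = 5, 3, 120
X = rng.normal(size=(n_in, n_samples))
W_true = rng.normal(size=(n_out, n_in))
Y = W_true @ X + 0.1 * rng.normal(size=(n_out, n_samples))

W = np.zeros((n_out, n_in))               # current weight matrix
noise = sgd_noise(W, X, Y, batch_size=20, rng=rng)
mean_noise = noise.mean(axis=0)           # vanishes when batches have equal size
print(np.abs(mean_noise).max())
```

Because the batches here partition the data into equal-sized groups, the noise terms sum to zero up to floating-point error, while each individual term is nonzero. That nonzero, zero-mean term plays the role of the stochastic force in the Langevin picture.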