S. V. Kozyrev
An explanation of grokking (delayed generalization) in learning is proposed: grokking is modeled by stochastic gradient Langevin dynamics (Brownian motion), and ideas from thermodynamics are applied.
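As a rough illustration of the Langevin dynamics mentioned above, the following minimal sketch applies the standard update theta <- theta - eta * grad L(theta) + sqrt(2 * eta * T) * xi, with xi a standard Gaussian sample. The quadratic toy loss, the function names, and the parameter values are assumptions made for illustration only; the abstract does not specify the model, and the sketch uses the exact gradient of the toy loss rather than a minibatch estimate.

```python
import math
import random


def langevin_step(theta, grad, step_size, temperature, rng):
    """One Langevin update: a gradient step plus Gaussian noise scaled by temperature."""
    noise = rng.gauss(0.0, 1.0)
    return theta - step_size * grad + math.sqrt(2.0 * step_size * temperature) * noise


def toy_loss_grad(theta):
    """Gradient of the hypothetical toy loss L(theta) = 0.5 * theta**2."""
    return theta


def run_langevin(theta0, n_steps, step_size=0.01, temperature=0.1, seed=0):
    """Return the list of parameter values visited, starting from theta0."""
    rng = random.Random(seed)
    trajectory = [float(theta0)]
    for _ in range(n_steps):
        theta = trajectory[-1]
        grad = toy_loss_grad(theta)
        trajectory.append(langevin_step(theta, grad, step_size, temperature, rng))
    return trajectory
```

With `temperature=0.0` the noise term vanishes and the sketch reduces to plain gradient descent; a positive temperature lets the trajectory keep moving around the minimum, which is the Brownian-motion picture the abstract refers to.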
S. V. Kozyrev
We discuss a model of the genome as a program with a functional architecture and treat Darwinian evolution as a learning problem for functional programming. In particular, we introduce a model of learning for a class of functional programs. This approach is related to information geometry: the learning model uses a kind of distance in the information space (the reduction graph of the model). We consider a statistical sum over paths in the reduction graph and discuss the relation of this sum to temperature learning.
S. V. Kozyrev
We introduce a new procedure for training artificial neural networks that approximates an objective function by the arithmetic mean of an ensemble of selected, randomly generated neural networks, and we apply this procedure to the classification (pattern recognition) problem. This approach differs from the standard one, which is based on optimization theory. In particular, an individual neural network from this ensemble need not itself be an approximation of the objective function.
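The idea of averaging selected random networks can be sketched as follows. The abstract does not state the selection rule, so this example assumes a simple greedy one: a randomly generated network is kept only if adding it lowers the mean squared error of the ensemble average. The network shape, the function names, and the acceptance criterion are all hypothetical choices made for illustration, not the author's procedure.

```python
import math
import random


def random_network(rng, n_in, n_hidden):
    """Return a one-hidden-layer tanh network with random, untrained weights."""
    w1 = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.gauss(0.0, 1.0) / math.sqrt(n_hidden) for _ in range(n_hidden)]

    def net(x):
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w1, b1)
        ]
        return sum(v * h for v, h in zip(w2, hidden))

    return net


def ensemble_predict(members, x):
    """Arithmetic mean of the member outputs; 0.0 for an empty ensemble."""
    if not members:
        return 0.0
    return sum(net(x) for net in members) / len(members)


def mean_squared_error(members, xs, ys):
    """Mean squared error of the ensemble average on the data (xs, ys)."""
    return sum((ensemble_predict(members, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def build_ensemble(xs, ys, n_candidates=200, n_hidden=4, seed=0):
    """Greedy selection (an assumption): keep a candidate only if the mean improves."""
    rng = random.Random(seed)
    members = []
    best = mean_squared_error(members, xs, ys)
    for _ in range(n_candidates):
        candidate = random_network(rng, len(xs[0]), n_hidden)
        err = mean_squared_error(members + [candidate], xs, ys)
        if err < best:
            members.append(candidate)
            best = err
    return members, best
```

For a two-class problem with labels +1 and -1, the sign of `ensemble_predict(members, x)` can serve as the predicted class. No weights are optimized here: only the membership of the ensemble is chosen, and a single member may fit the labels poorly even when the average does better.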