Jean Thierry-Mieg

LGDec 18, 2018

XOR_p A maximally intertwined p-classes problem used as a benchmark with built-in truth for neural networks gradient descent optimization

Danielle Thierry-Mieg, Jean Thierry-Mieg

A natural p-classes generalization of the eXclusive OR problem, the subtraction modulo p, where p is prime, is presented and solved using a single fully connected hidden layer with p-neurons. Although the problem is very simple, the landscape is intricate and challenging and represents an interesting benchmark for gradient descent optimization algorithms. Testing 9 optimizers and 9 activation functions up to p = 191, the method converging most often and the fastest to a perfect classification is the Adam optimizer combined with the ELU activation function.

LGNov 1, 2018

Connections between physics, mathematics and deep learning

Jean Thierry-Mieg

Starting from the Fermat's principle of least action, which governs classical and quantum mechanics and from the theory of exterior differential forms, which governs the geometry of curved manifolds, we show how to derive the equations governing neural networks in an intrinsic, coordinate invariant way, where the loss function plays the role of the Hamiltonian. To be covariant, these equations imply a layer metric which is instrumental in pretraining and explains the role of conjugation when using complex numbers. The differential formalism also clarifies the relation of the gradient descent optimizer with Aristotelian and Newtonian mechanics and why large learning steps break the logic of the linearization procedure. We hope that this formal presentation of the differential geometry of neural networks will encourage some physicists to dive into deep learning, and reciprocally, that the specialists of deep learning will better appreciate the close interconnection of their subject with the foundations of classical and quantum field theory.

Jean Thierry-Mieg

2 Papers