The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use
This is an incremental tutorial that simplifies a complex concept for researchers and practitioners in machine learning.
The paper tackles the challenge of understanding and applying the natural gradient for more efficient gradient descent by providing intuitive explanations and practical recipes, without presenting new experimental results or concrete performance numbers.
The natural gradient allows for more efficient gradient descent by removing dependencies and biases inherent in a function's parameterization. Several papers present the topic thoroughly and precisely. It remains, however, a very difficult idea to get your head around. The intent of this note is to provide simple intuition for the natural gradient and its use. We review how an ill-conditioned parameter space can undermine learning, introduce the natural gradient by analogy to the more widely understood concept of signal whitening, and present tricks and specific prescriptions for applying the natural gradient to learning problems.
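
As a rough illustration of why an ill-conditioned parameter space can undermine learning, the sketch below compares plain gradient descent with a metric-rescaled update on a toy quadratic. Everything in it is an assumption made for illustration and is not taken from the note: the objective, the diagonal metric (chosen equal to the curvature of the toy objective), the step sizes, and the helper names `grad` and `descend`. Dividing the gradient by the metric plays the same role that dividing a signal by its per-dimension variance plays in whitening.

```python
def grad(theta, curvature):
    # Gradient of the toy objective f(theta) = 0.5 * sum_i curvature[i] * theta[i]**2.
    return [c * t for c, t in zip(curvature, theta)]


def descend(theta, curvature, step, n_steps, metric=None):
    # Run n_steps of gradient descent. If a diagonal metric is given, each
    # gradient component is divided by the matching metric entry, i.e. the
    # update uses G^{-1} * gradient with G diagonal (an assumed, simplified metric).
    theta = list(theta)
    for _ in range(n_steps):
        g = grad(theta, curvature)
        if metric is not None:
            g = [gi / mi for gi, mi in zip(g, metric)]
        theta = [t - step * gi for t, gi in zip(theta, g)]
    return theta


# Hypothetical ill-conditioned problem: one direction is 100x steeper than the other.
curvature = [100.0, 1.0]
start = [1.0, 1.0]

# Plain descent: the step must stay below 2 / 100 to remain stable in the steep
# direction, which makes progress in the shallow direction slow.
plain = descend(start, curvature, step=0.01, n_steps=50)

# Rescaled descent: after dividing by the metric both directions behave
# identically, so a single larger step size suits both.
natural = descend(start, curvature, step=0.5, n_steps=50, metric=curvature)

print("plain:  ", plain)
print("natural:", natural)
```

In this toy setting the plain update is still far from the minimum along the shallow direction after 50 steps, while the rescaled update has converged in both directions. In practice the metric is generally not diagonal and must be derived or estimated for the model at hand.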