Why to "grow" and "harvest" deep learning models?
This work addresses the need for more transparent and efficient training methods in deep learning. The contribution appears incremental: it builds on existing gradient-based techniques and reframes them through a new analogy.
The paper tackles the problem of improving transparency, convergence rates, and inductive biases in deep learning by proposing a population-dynamics-inspired approach of "growth" and "harvesting". It claims that this method outperforms common adaptive gradient methods on all three requirements, though no specific numbers are given.
Current expectations from training deep learning models with gradient-based methods include: 1) transparency; 2) high convergence rates; 3) high inductive biases. While state-of-the-art methods with adaptive learning rate schedules are fast, they still fail to meet the other two requirements. We suggest reconsidering neural network models in terms of single-species population dynamics, where adaptation arises naturally from open-ended processes of "growth" and "harvesting". We show that stochastic gradient descent (SGD) with two balanced, pre-defined values of the per capita growth and harvesting rates outperforms the most common adaptive gradient methods on all three requirements.
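To make the population-dynamics vocabulary concrete, the sketch below integrates the textbook logistic model with proportional harvesting, dN/dt = r*N*(1 - N/K) - h*N, where r is the per capita growth rate and h is the per capita harvesting rate. This is an assumption chosen for illustration: the text above does not state which population model is used or how the two rates enter the SGD update, so the function names, the carrying capacity K, and the Euler step size are all hypothetical and should not be read as the authors' method.

```python
def simulate_population(n0, growth_rate, harvest_rate,
                        capacity=1.0, dt=0.01, steps=20000):
    """Forward-Euler integration of dN/dt = r*N*(1 - N/K) - h*N.

    Textbook logistic growth with proportional harvesting; an
    illustrative assumption, not the update rule from the paper.
    """
    n = n0
    for _ in range(steps):
        growth = growth_rate * n * (1.0 - n / capacity)  # density-limited growth
        harvest = harvest_rate * n                       # proportional removal
        n += dt * (growth - harvest)
    return n


def equilibrium(growth_rate, harvest_rate, capacity=1.0):
    """Nonzero steady state K*(1 - h/r); it exists only when 0 <= h < r."""
    if not 0.0 <= harvest_rate < growth_rate:
        return 0.0  # harvesting at or above the growth rate drives N to zero
    return capacity * (1.0 - harvest_rate / growth_rate)


if __name__ == "__main__":
    r, h = 1.0, 0.25  # hypothetical "balanced" rates with h < r
    print(simulate_population(0.1, r, h), equilibrium(r, h))
```

In this toy model the two rates are "balanced" in the sense that h < r yields a stable positive population, while h >= r leads to extinction. Whether the same notion of balance is what the authors intend for SGD is not stated in the text above.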