Training morphological neural networks with gradient descent: some theoretical insights
This work addresses the difficulty of training deep morphological networks for image processing, offering incremental theoretical guidance to improve optimization in this domain-specific area.
The paper investigates why deep morphological neural networks are difficult to train with gradient descent. It analyzes the potential and limitations of differentiation and back-propagation through the non-smooth optimization concept of the Bouligand derivative, and provides theoretical insights and first guidelines for initialization and learning rates.
Morphological neural networks, or layers, can be a powerful tool to advance mathematical morphology, both on theoretical aspects such as the representation of complete lattice operators and in the development of image processing pipelines. However, these architectures prove difficult to train once they contain more than a few morphological layers, at least within popular machine learning frameworks that rely on gradient-descent-based optimization algorithms. In this paper, we investigate the potential and limitations of differentiation-based approaches and back-propagation applied to morphological networks, in light of the Bouligand derivative, a concept from non-smooth optimization. We provide insights and first theoretical guidelines, in particular regarding initialization and learning rates.
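To make the non-smoothness concrete, here is a minimal sketch, not taken from the paper. It assumes a dilation (max-plus) layer of the form y[j] = max_i (x[i] + w[i, j]), which is a common definition but is not spelled out in this abstract. All function names are hypothetical. The sketch illustrates two points: back-propagation through a max passes a gradient to only one weight per output, and at a tie the directional derivative of the max is the largest direction component among the tied entries, so it is not linear in the direction.

```python
import numpy as np

def dilation_layer(x, w):
    """Assumed max-plus (dilation) layer: y[j] = max_i (x[i] + w[i, j])."""
    s = x[:, None] + w                      # shape (n_in, n_out)
    return s.max(axis=0), s.argmax(axis=0)  # outputs and winning input indices

def dilation_weight_grad(x, w, grad_y):
    """Gradient w.r.t. w as back-propagation would compute it:
    only the argmax entry of each output column receives grad_y[j]."""
    _, idx = dilation_layer(x, w)
    g = np.zeros_like(w)
    g[idx, np.arange(w.shape[1])] = grad_y
    return g

def max_directional_derivative(s, d, tol=1e-12):
    """Directional derivative of s -> max(s) at s in direction d:
    the maximum of d over the indices that attain the max."""
    active = np.isclose(s, s.max(), rtol=0.0, atol=tol)
    return d[active].max()

# Sparse gradient: 3 inputs, 2 outputs, all weights initialized to zero.
x = np.array([0.0, 1.0, 3.0])
w = np.zeros((3, 2))
y, idx = dilation_layer(x, w)
g = dilation_weight_grad(x, w, np.ones(2))  # nonzero only in the row of the largest input

# Tie between the first two entries: the derivative is not linear in d.
s = np.array([1.0, 1.0, 0.0])
d = np.array([1.0, -1.0, 5.0])
forward = max_directional_derivative(s, d)
backward = max_directional_derivative(s, -d)  # would equal -forward for a smooth function
```

In this toy case only two of the six weights receive a nonzero gradient, and `forward + backward` is nonzero at the tie. Both effects are the kind of behavior that makes the choice of initialization and learning rate delicate for stacked morphological layers.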