LG ML | Sep 7, 2018

Beyond Gradient Descent for Regularized Segmentation Losses

Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov

arXiv:1809.02322v28.711 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses optimization challenges in training neural networks for segmentation and suggests shifting attention from loss and architecture design to the choice of optimizer. The advance is incremental because it builds on known regularization models.

The paper studies weakly-supervised CNN segmentation with a regularized loss on which gradient descent (GD) performs poorly. An alternative optimizer (ADM) achieves state-of-the-art results on that loss, while GD obtains its best result only for a "smoother" tuning of the loss function.

The simplicity of gradient descent (GD) made it the default method for training ever-deeper and complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In the context of weakly-supervised CNN segmentation, we demonstrate a well-motivated loss function where an alternative optimizer (ADM) achieves the state-of-the-art while GD performs poorly. Interestingly, GD obtains its best result for a "smoother" tuning of the loss function. The results are consistent across different network architectures. Our loss is motivated by well-understood MRF/CRF regularization models in "shallow" segmentation and their known global solvers. Our work suggests that network design/training should pay more attention to optimization methods.
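The abstract does not spell out the ADM scheme, so the following is only a toy sketch of one common reading of such a splitting: alternate between (1) a global solver that produces a regularized "proposal" labeling from the current model's predictions and (2) ordinary gradient descent that fits the model to that proposal. Everything here is an assumption made for illustration and is not taken from the paper: the 1D chain of pixels, the Potts penalty solved exactly by dynamic programming, the two-parameter logistic "network", and the hypothetical names `chain_potts_solver` and `adm_train`.

```python
import math

def chain_potts_solver(unary, lam):
    """Exactly minimize sum_i unary[i][y_i] + lam * sum_i [y_i != y_(i+1)]
    over labelings of a 1D chain, using Viterbi-style dynamic programming."""
    n, k = len(unary), len(unary[0])
    cost = list(unary[0])          # best cost of a prefix ending in each label
    back = []                      # back[i-1][b] = best label at i-1 given label b at i
    for i in range(1, n):
        new_cost, ptr = [], []
        for b in range(k):
            cands = [cost[a] + (lam if a != b else 0.0) for a in range(k)]
            a_best = min(range(k), key=cands.__getitem__)
            ptr.append(a_best)
            new_cost.append(cands[a_best] + unary[i][b])
        back.append(ptr)
        cost = new_cost
    # Backtrack from the cheapest final label.
    y = [min(range(k), key=cost.__getitem__)]
    for ptr in reversed(back):
        y.append(ptr[y[-1]])
    return y[::-1]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))   # clamp to keep exp() and log() finite
    return 1.0 / (1.0 + math.exp(-z))

def adm_train(x, seeds, lam=1.0, outer=5, inner=200, lr=0.5):
    """Toy alternating scheme for a binary segmentation of scalar features x.
    seeds maps a pixel index to its known label (the weak supervision)."""
    w, b = 0.0, 0.0                # stand-in for network parameters
    y = []
    for _ in range(outer):
        # Proposal step: unary costs are negative log-probabilities from the
        # current model; seeds are enforced with a large penalty.
        unary = []
        for i, xi in enumerate(x):
            p1 = sigmoid(w * xi + b)
            u = [-math.log(1.0 - p1), -math.log(p1)]
            if i in seeds:
                u[1 - seeds[i]] += 1e6
            unary.append(u)
        y = chain_potts_solver(unary, lam)
        # Model step: plain gradient descent on cross-entropy to the proposal.
        for _ in range(inner):
            gw = gb = 0.0
            for xi, yi in zip(x, y):
                g = sigmoid(w * xi + b) - yi
                gw += g * xi
                gb += g
            w -= lr * gw / len(x)
            b -= lr * gb / len(x)
    return w, b, y

if __name__ == "__main__":
    x = [0.1, 0.0, 0.2, 0.1, 0.9, 1.0, 0.8, 0.9]
    w, b, y = adm_train(x, seeds={0: 0, 7: 1})
    print("proposal labels:", y)
```

The point of the sketch is the division of labor: the discrete regularizer is never differentiated, since it is handled by an exact solver, and gradient descent only sees a standard cross-entropy target. On real images the chain solver would be replaced by a 2D MRF/CRF solver, and the quality of the result depends on the regularization weight and the initial proposals.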

