ML LG Mar 16, 2017

End-to-End Learning for Structured Prediction Energy Networks

David Belanger, Bishan Yang, Andrew McCallum

arXiv:1703.05667v2 21.1136 citations

Originality: Incremental advance

AI Analysis

This work addresses accurate and efficient structured prediction for tasks such as image denoising and semantic role labeling. It is an incremental advance that builds on the existing SPEN framework.

The paper tackles the problem of training Structured Prediction Energy Networks (SPENs) end-to-end, enabling the use of more sophisticated non-convex energy functions, and demonstrates improved accuracy over baseline methods on image denoising and semantic role labeling tasks.

Structured Prediction Energy Networks (SPENs) are a simple, yet expressive family of structured prediction models (Belanger and McCallum, 2016). An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. This paper presents end-to-end learning for SPENs, where the energy function is discriminatively trained by back-propagating through gradient-based prediction. In our experience, the approach is substantially more accurate than the structured SVM method of Belanger and McCallum (2016), as it allows us to use more sophisticated non-convex energies. We provide a collection of techniques for improving the speed, accuracy, and memory requirements of end-to-end SPENs, and demonstrate the power of our method on 7-Scenes image denoising and CoNLL-2005 semantic role labeling tasks. In both, inexact minimization of non-convex SPEN energies is superior to baseline methods that use simplistic energy functions that can be minimized exactly.
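The idea of back-propagating through gradient-based prediction can be illustrated with a minimal sketch. It is not the authors' implementation: the one-dimensional energy E(y; x) = 0.5*(y - a*x)^2 + b*cos(y), the parameter names `a` and `b`, the function names, and the step sizes are all assumptions chosen for illustration, and the backward pass is derived by hand for this toy energy. A real SPEN would use a deep network as the energy and an automatic differentiation library.

```python
import math

def energy_grad(y, x, a, b):
    """dE/dy for the toy energy E(y; x) = 0.5*(y - a*x)**2 + b*cos(y)."""
    return (y - a * x) - b * math.sin(y)

def predict(x, a, b, steps=20, lr=0.2, y0=0.0):
    """Prediction by gradient descent on the energy; keeps every iterate."""
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] - lr * energy_grad(ys[-1], x, a, b))
    return ys

def loss_and_grads(x, target, a, b, steps=20, lr=0.2):
    """Squared loss on the final iterate and its gradient w.r.t. (a, b)."""
    ys = predict(x, a, b, steps, lr)
    loss = 0.5 * (ys[-1] - target) ** 2
    dy = ys[-1] - target              # dL/dy_T
    da = db = 0.0
    for y in reversed(ys[:-1]):       # walk the unrolled steps backwards
        da += dy * lr * x             # d y_{t+1} / da = lr * x
        db += dy * lr * math.sin(y)   # d y_{t+1} / db = lr * sin(y_t)
        dy *= 1.0 - lr * (1.0 - b * math.cos(y))  # d y_{t+1} / d y_t
    return loss, da, db

def train(data, a=0.0, b=0.0, epochs=200, step=0.05):
    """Plain SGD on the energy parameters, using the unrolled gradients."""
    for _ in range(epochs):
        for x, target in data:
            _, da, db = loss_and_grads(x, target, a, b)
            a -= step * da
            b -= step * db
    return a, b

if __name__ == "__main__":
    data = [(1.5, 2.0)]               # one hypothetical (input, target) pair
    a, b = train(data)
    print("learned a, b:", a, b)
    print("prediction:", predict(1.5, a, b)[-1])
```

Because the loss is taken on the output of the truncated optimizer rather than on an exact minimizer, the energy parameters are tuned for the inexact inference procedure that is actually used at test time, which is the property the abstract highlights.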
