Deep Structured Prediction with Nonlinear Output Transformations
This work addresses computational and optimization challenges in deep structured prediction for tasks such as semantic segmentation, though the contribution appears incremental: it builds on and generalizes prior methods.
The paper tackles two limitations of deep structured models: neighborhood structures that are restricted to be local, and the inability to apply non-linear transformations to the output. It does so by developing a model that generalizes existing approaches, such as structured prediction energy networks, while keeping existing inference techniques applicable.
Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information that generally helps to reduce the data needs of deep nets. However, current deep structured models are restricted by an often very local neighborhood structure, which cannot be enlarged for reasons of computational complexity, and by the fact that the output configuration, or a representation thereof, cannot be transformed further. Very recent approaches that address these issues incorporate graphical model inference inside deep nets so as to permit subsequent non-linear output space transformations. However, optimization of those formulations is challenging and not well understood. Here, we develop a novel model which generalizes existing approaches, such as structured prediction energy networks, and discuss a formulation which maintains applicability of existing inference techniques.
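To make the two restrictions named in the abstract concrete, here is a minimal sketch and not the paper's model. It assumes a toy chain of variables with hand-picked unary and pairwise score tables. The function names (`score`, `brute_force_map`, `transformed_score`), the `tanh` transformation, and the weight vector `w` are all hypothetical choices made for illustration. The first score is a plain sum of local terms. The second passes the same local terms through a non-linearity before combining them, which is the kind of output transformation a purely additive model cannot express.

```python
import itertools

import numpy as np


def local_terms(y, unary, pairwise):
    """Collect the unary and neighbor-pair scores selected by labeling y."""
    n = len(y)
    terms = [unary[i, y[i]] for i in range(n)]
    terms += [pairwise[y[i], y[i + 1]] for i in range(n - 1)]
    return np.array(terms, dtype=float)


def score(y, unary, pairwise):
    """Classical structured score: a plain sum of the local terms."""
    return float(local_terms(y, unary, pairwise).sum())


def transformed_score(y, unary, pairwise, w):
    """Hypothetical non-linear variant: tanh of each local term, then a
    weighted sum. This is an illustrative assumption, not the paper's form."""
    return float(w @ np.tanh(local_terms(y, unary, pairwise)))


def brute_force_map(unary, pairwise):
    """Exhaustive search over all k**n labelings (feasible only for tiny n)."""
    n, k = unary.shape
    return max(
        itertools.product(range(k), repeat=n),
        key=lambda y: score(y, unary, pairwise),
    )


# Toy problem: 3 variables, 2 labels; the pairwise table rewards equal neighbors.
unary = np.array([[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pairwise = np.array([[1.0, 0.0], [0.0, 1.0]])

best = brute_force_map(unary, pairwise)
w = np.ones(2 * unary.shape[0] - 1)  # one weight per local term
```

Exhaustive search visits k**n labelings, which is why practical models keep neighborhoods small and rely on specialized inference. Once the local terms are combined non-linearly, the score no longer decomposes over the chain, which hints at why the abstract describes optimization of such formulations as challenging.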