ML LG Apr 12, 2016

A Convex Surrogate Operator for General Non-Modular Loss Functions

arXiv:1604.03373v1, 8 citations
Originality: Incremental advance
AI Analysis

The paper provides a tractable solution for machine learning tasks involving general non-modular loss functions. The advance is incremental: it extends existing convex surrogates, which handle submodular and supermodular losses separately, to losses that are neither.

The paper tackles the problem of optimizing non-modular loss functions in empirical risk minimization by introducing a novel convex surrogate based on a submodular-supermodular decomposition, and reports improved performance and scalability on a real-world face track dataset with tens of thousands of frames.

Empirical risk minimization frequently employs convex surrogates to underlying discrete loss functions in order to achieve computational tractability during optimization. However, classical convex surrogates can only tightly bound modular loss functions, submodular functions or supermodular functions separately while maintaining polynomial time computation. In this work, a novel generic convex surrogate for general non-modular loss functions is introduced, which provides for the first time a tractable solution for loss functions that are neither supermodular nor submodular. This convex surrogate is based on a submodular-supermodular decomposition for which existence and uniqueness are proven in this paper. It takes the sum of two convex surrogates that separately bound the supermodular component and the submodular component using slack-rescaling and the Lovász hinge, respectively. It is further proven that this surrogate is convex, piecewise linear, an extension of the loss function, and admits polynomial time subgradient computation. Empirical results are reported on a non-submodular loss based on the Sørensen-Dice difference function, and a real-world face track dataset with tens of thousands of frames, demonstrating the improved performance, efficiency, and scalability of the novel convex surrogate.
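
As a small illustration of why a Sørensen-Dice based loss falls outside the classical cases, the following sketch brute-forces the submodularity and supermodularity inequalities on a toy problem. It assumes the loss is viewed as a set function of the mispredicted indices. The helper names (`dice_loss`, `is_submodular`, `is_supermodular`) and the four-element ground set are hypothetical choices for this example and are not taken from the paper or its code.

```python
from fractions import Fraction
from itertools import combinations


def dice_loss(mispredicted, positives):
    """Sørensen-Dice loss 1 - 2|y ∩ ỹ| / (|y| + |ỹ|), written as a
    function of the set of mispredicted indices (an assumption of this sketch)."""
    false_neg = len(mispredicted & positives)  # positives predicted negative
    false_pos = len(mispredicted - positives)  # negatives predicted positive
    true_pos = len(positives) - false_neg
    denom = len(positives) + true_pos + false_pos  # |y| + |ỹ|
    if denom == 0:
        return Fraction(0)  # convention for the empty/empty case
    return 1 - Fraction(2 * true_pos, denom)


def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_submodular(f, ground):
    """Check f(A) + f(B) >= f(A ∪ B) + f(A ∩ B) for all pairs (exponential cost)."""
    sets = subsets(ground)
    return all(f(a) + f(b) >= f(a | b) + f(a & b) for a in sets for b in sets)


def is_supermodular(f, ground):
    """Check f(A) + f(B) <= f(A ∪ B) + f(A ∩ B) for all pairs (exponential cost)."""
    sets = subsets(ground)
    return all(f(a) + f(b) <= f(a | b) + f(a & b) for a in sets for b in sets)


# Toy instance: indices 0 and 1 are ground-truth positives, 2 and 3 negatives.
ground = frozenset(range(4))
positives = frozenset({0, 1})


def loss(mispredicted):
    return dice_loss(mispredicted, positives)


print("submodular:  ", is_submodular(loss, ground))    # False
print("supermodular:", is_supermodular(loss, ground))  # False
```

On this toy instance both checks fail. Mispredicting the two positives costs 1/3 each but 1 together, which violates submodularity, while mispredicting the two negatives costs 1/5 each but only 1/3 together, which violates supermodularity. The brute-force check is exponential in the size of the ground set and is meant only for small sanity checks, not as part of any training procedure.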
