LGDec 15, 2015

A Light Touch for Heavily Constrained SGD

arXiv:1512.04960v225 citations
Originality Incremental advance
AI Analysis

This addresses scalability issues for practitioners dealing with large constraint sets in optimization, though it appears incremental as it builds on existing SGD methods.

The paper tackles the problem of efficiently optimizing heavily-constrained objectives in machine learning, such as learning monotonic or submodular functions, by proposing an extension of SGD that applies constraints probabilistically, resulting in a theoretical trade-off between per-iteration work and iteration count.

Minimizing empirical risk subject to a set of constraints can be a useful strategy for learning restricted classes of functions, such as monotonic functions, submodular functions, classifiers that guarantee a certain class label for some subset of examples, etc. However, these restrictions may result in a very large number of constraints. Projected stochastic gradient descent (SGD) is often the default choice for large-scale optimization in machine learning, but requires a projection after each update. For heavily-constrained objectives, we propose an efficient extension of SGD that stays close to the feasible region while only applying constraints probabilistically at each iteration. Theoretical analysis shows a compelling trade-off between per-iteration work and the number of iterations needed on problems with a large number of constraints.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes