DS LG OCApr 19, 2019

Submodular Maximization Beyond Non-negativity: Guarantees, Fast Algorithms, and Applications

Christopher Harshaw, Moran Feldman, Justin Ward, Amin Karbasi

arXiv:1904.09354v119.5120 citationsHas Code

Originality Highly original

AI Analysis

This work addresses a fundamental limitation in submodular optimization for researchers and practitioners in machine learning and combinatorial optimization, offering new theoretical guarantees and efficient algorithms for broader function classes.

The paper tackles the problem of optimizing submodular functions beyond the non-negativity assumption by expressing them as a difference of a weakly submodular function and a modular function, achieving approximation guarantees with algorithms that run in time independent of the cardinality constraint. The results include algorithms with expected performance bounds of (1 - e^{-γ} - ε) g(OPT) - c(OPT) and runtimes of O(n/ε log^2 1/ε) for constrained and O(n/ε log 1/ε) for unconstrained settings, complemented by hardness results and empirical validation on datasets like Boston Housing and Email EU.

It is generally believed that submodular functions -- and the more general class of $γ$-weakly submodular functions -- may only be optimized under the non-negativity assumption $f(S) \geq 0$. In this paper, we show that once the function is expressed as the difference $f = g - c$, where $g$ is monotone, non-negative, and $γ$-weakly submodular and $c$ is non-negative modular, then strong approximation guarantees may be obtained. We present an algorithm for maximizing $g - c$ under a $k$-cardinality constraint which produces a random feasible set $S$ such that $\mathbb{E} \left[ g(S) - c(S) \right] \geq (1 - e^{-γ} - ε) g(OPT) - c(OPT)$, whose running time is $O (\frac{n}ε \log^2 \frac{1}ε)$, i.e., independent of $k$. We extend these results to the unconstrained setting by describing an algorithm with the same approximation guarantees and faster $O(\frac{n}ε \log\frac{1}ε)$ runtime. The main techniques underlying our algorithms are two-fold: the use of a surrogate objective which varies the relative importance between $g$ and $c$ throughout the algorithm, and a geometric sweep over possible $γ$ values. Our algorithmic guarantees are complemented by a hardness result showing that no polynomial-time algorithm which accesses $g$ through a value oracle can do better. We empirically demonstrate the success of our algorithms by applying them to experimental design on the Boston Housing dataset and directed vertex cover on the Email EU dataset.

View on arXiv PDF Code

Similar