Best of Both Worlds: Practical and Theoretically Optimal Submodular Maximization in Parallel
This work addresses a fundamental optimization problem in machine learning and AI, with applications in feature selection, summarization, and recommendation systems. It provides a solution that is both practical and theoretically optimal, improving incrementally upon existing methods.
The paper tackles the problem of maximizing a monotone submodular function under a cardinality constraint. It develops an algorithm that achieves state-of-the-art empirical performance and theoretical properties: query complexity of $O(n)$, adaptivity of $O(\log(n))$, and an approximation ratio of nearly $1-1/e$. The algorithm outperforms the previous best algorithm, FAST, in runtime, adaptive rounds, queries, and objective values across six submodular objective functions.
For the problem of maximizing a monotone, submodular function with respect to a cardinality constraint $k$ on a ground set of size $n$, we provide an algorithm that achieves the state-of-the-art in both its empirical performance and its theoretical properties, in terms of adaptive complexity, query complexity, and approximation ratio; that is, it obtains, with high probability, query complexity of $O(n)$ in expectation, adaptivity of $O(\log(n))$, and approximation ratio of nearly $1-1/e$. The main algorithm is assembled from two components which may be of independent interest. The first component of our algorithm, LINEARSEQ, is useful as a preprocessing algorithm to improve the query complexity of many algorithms. Moreover, a variant of LINEARSEQ is shown to have adaptive complexity of $O( \log (n / k) )$ which is smaller than that of any previous algorithm in the literature. The second component is a parallelizable thresholding procedure THRESHOLDSEQ for adding elements with gain above a constant threshold. Finally, we demonstrate that our main algorithm empirically outperforms, in terms of runtime, adaptive rounds, total queries, and objective values, the previous state-of-the-art algorithm FAST in a comprehensive evaluation with six submodular objective functions.
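To make the problem setting concrete, the following is a minimal sequential sketch in Python. It is not LINEARSEQ or THRESHOLDSEQ, and it is not parallel. It shows only the classical greedy baseline, which attains the $1-1/e$ ratio using $O(nk)$ queries over $k$ adaptive rounds, and a simple one-pass rule that adds any element whose marginal gain meets a fixed threshold. The coverage objective, the toy data, and all names (`make_coverage`, `greedy`, `threshold_pass`, `tau`) are assumptions introduced for illustration.

```python
from typing import Callable, Dict, List, Sequence, Set

SetFunction = Callable[[Sequence[int]], int]


def make_coverage(sets: Dict[int, Set[int]]) -> SetFunction:
    """Return f(S) = size of the union of the sets indexed by S.

    Coverage functions are monotone and submodular (illustrative objective).
    """
    def f(S: Sequence[int]) -> int:
        covered: Set[int] = set()
        for i in S:
            covered |= sets[i]
        return len(covered)
    return f


def greedy(f: SetFunction, ground: Sequence[int], k: int) -> List[int]:
    """Classical sequential greedy: k rounds, each scanning the ground set."""
    S: List[int] = []
    for _ in range(k):
        base = f(S)
        best, best_gain = None, 0
        for x in ground:
            if x in S:
                continue
            gain = f(S + [x]) - base  # marginal gain of x given S
            if gain > best_gain:
                best, best_gain = x, gain
        if best is None:  # no element adds positive value
            break
        S.append(best)
    return S


def threshold_pass(f: SetFunction, ground: Sequence[int], k: int,
                   tau: float) -> List[int]:
    """One sequential pass: add x whenever its marginal gain is at least tau."""
    S: List[int] = []
    for x in ground:
        if len(S) >= k:
            break
        if f(S + [x]) - f(S) >= tau:
            S.append(x)
    return S


# Hypothetical toy instance: four candidate sets over the universe {1, ..., 7}.
toy_sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1, 7}}
f = make_coverage(toy_sets)
greedy_solution = greedy(f, [0, 1, 2, 3], k=2)
threshold_solution = threshold_pass(f, [0, 1, 2, 3], k=2, tau=3)
print(greedy_solution, f(greedy_solution))
print(threshold_solution, f(threshold_solution))
```

Each call to `f` counts as one query, and each outer iteration of `greedy` depends on the result of the previous one, so this baseline needs $k$ adaptive rounds. The query and adaptivity bounds stated in the abstract concern reducing exactly these two costs.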