Mini-batch Submodular Maximization
This addresses efficient optimization for large-scale datasets in machine learning, though it is incremental as it builds on existing submodular maximization methods.
The paper tackles the problem of maximizing non-negative monotone decomposable submodular functions under constraints by introducing the first mini-batch algorithm, showing that uniform sampling outperforms weighted sampling in practice and providing theoretical justification via smoothed analysis.
We present the first mini-batch algorithm for maximizing a non-negative monotone decomposable submodular function, $F=\sum_{i=1}^N f^i$, under a set of constraints. We consider two sampling approaches: uniform and weighted. We first show that mini-batch with weighted sampling improves over the state of the art sparsifier based approach both in theory and in practice. Surprisingly, our experimental results show that uniform sampling is superior to weighted sampling. However, it is impossible to explain this using worst-case analysis. Our main contribution is using smoothed analysis to provide a theoretical foundation for our experimental results. We show that, under very mild assumptions, uniform sampling is superior for both the mini-batch and the sparsifier approaches. We empirically verify that these assumptions hold for our datasets. Uniform sampling is simple to implement and has complexity independent of $N$, making it the perfect candidate to tackle massive real-world datasets.