LGDSFeb 20, 2018

Do Less, Get More: Streaming Submodular Maximization with Subsampling

arXiv:1802.07098v194 citations
Originality Highly original
AI Analysis

This work addresses scalability issues in machine learning for large-scale problems like video summarization, offering a novel combination of streaming and subsampling to reduce computational costs.

The paper tackles the problem of submodular maximization in streaming settings by developing a one-pass algorithm that subsamples data, achieving tight approximation guarantees with minimal memory and function evaluations, such as a 4p approximation for monotone functions and up to 50-fold speed improvements in video summarization.

In this paper, we develop the first one-pass streaming algorithm for submodular maximization that does not evaluate the entire stream even once. By carefully subsampling each element of data stream, our algorithm enjoys the tightest approximation guarantees in various settings while having the smallest memory footprint and requiring the lowest number of function evaluations. More specifically, for a monotone submodular function and a $p$-matchoid constraint, our randomized algorithm achieves a $4p$ approximation ratio (in expectation) with $O(k)$ memory and $O(km/p)$ queries per element ($k$ is the size of the largest feasible solution and $m$ is the number of matroids used to define the constraint). For the non-monotone case, our approximation ratio increases only slightly to $4p+2-o(1)$. To the best or our knowledge, our algorithm is the first that combines the benefits of streaming and subsampling in a novel way in order to truly scale submodular maximization to massive machine learning problems. To showcase its practicality, we empirically evaluated the performance of our algorithm on a video summarization application and observed that it outperforms the state-of-the-art algorithm by up to fifty fold, while maintaining practically the same utility.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes