Treedy: A Heuristic for Counting and Sampling Subsets
This addresses computational bottlenecks in applications like Bayesian structure discovery, offering incremental improvements for specific domains.
The paper tackles the problem of efficiently approximating weighted sums and sampling subsets of a query set, introducing Treedy, a tree-based greedy heuristic that guarantees relative error and total variation distance within a tolerance d. Experimental results show dramatic savings in running time compared to exact computation and outperformance over a previous sorting-based heuristic.
Consider a collection of weighted subsets of a ground set N. Given a query subset Q of N, how fast can one (1) find the weighted sum over all subsets of Q, and (2) sample a subset of Q proportionally to the weights? We present a tree-based greedy heuristic, Treedy, that for a given positive tolerance d answers such counting and sampling queries to within a guaranteed relative error d and total variation distance d, respectively. Experimental results on artificial instances and in application to Bayesian structure discovery in Bayesian networks show that approximations yield dramatic savings in running time compared to exact computation, and that Treedy typically outperforms a previously proposed sorting-based heuristic.