DS · LG · PR · Jul 6, 2016

On Sampling and Greedy MAP Inference of Constrained Determinantal Point Processes

arXiv:1607.01551v1, 17 citations
Originality: Incremental advance
AI Analysis

This work addresses subset selection challenges in machine learning and data analysis by enhancing diversity and flexibility in constrained DPPs, though it is incremental in extending existing methods to specific constraints.

The paper tackles the problem of sampling from Determinantal Point Processes (DPPs) under partition constraints, presenting the first polynomial-time exact sampling algorithm for any constant number of partitions, while also showing a complexity barrier for general matroid constraints. It also improves approximation guarantees for MAP inference in k-DPPs using a greedy initialization and local search, with experiments indicating significant gains for larger k values.
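To make the sampling target concrete, here is a minimal sketch of the distribution a partition-constrained DPP defines, assuming the constraint takes the form "pick exactly `counts[i]` items from `parts[i]`" and that a subset S has probability proportional to det(L_S). It enumerates every feasible subset, so it is exponential in the subset size and only usable on toy inputs; it is not the polynomial-time algorithm from the paper. The function names and the `parts`/`counts` arguments are hypothetical, chosen for illustration.

```python
import itertools
import numpy as np

def partition_dpp_distribution(L, parts, counts):
    """Enumerate feasible subsets and their probabilities, P(S) proportional to det(L_S).

    Assumption: a feasible subset takes exactly counts[i] items from parts[i].
    Brute force, for illustration only.
    """
    per_part = [itertools.combinations(p, c) for p, c in zip(parts, counts)]
    # Concatenate one choice per part into a single sorted index tuple.
    subsets = [tuple(sorted(sum(combo, ()))) for combo in itertools.product(*per_part)]
    weights = np.array([np.linalg.det(L[np.ix_(S, S)]) for S in subsets])
    weights = np.clip(weights, 0.0, None)  # guard against tiny negative round-off
    return subsets, weights / weights.sum()

def sample_partition_dpp(L, parts, counts, rng=None):
    """Draw one feasible subset from the enumerated distribution."""
    rng = np.random.default_rng() if rng is None else rng
    subsets, probs = partition_dpp_distribution(L, parts, counts)
    return subsets[rng.choice(len(subsets), p=probs)]
```

With an identity kernel every feasible subset has determinant 1, so the distribution is uniform over feasible subsets; correlated kernels shift mass toward subsets whose items are less similar.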

Subset selection problems ask for a small, diverse yet representative subset of the given data. When pairwise similarities are captured by a kernel, the determinants of submatrices provide a measure of diversity or independence of items within a subset. Matroid theory gives another notion of independence, thus giving rise to optimization and sampling questions about Determinantal Point Processes (DPPs) under matroid constraints. Partition constraints, as a special case, arise naturally when incorporating additional labeling or clustering information, besides the kernel, in DPPs. Finding the maximum determinant submatrix under matroid constraints on its row/column indices has been previously studied. However, the corresponding question of sampling from DPPs under matroid constraints has been unresolved, beyond the simple cardinality-constrained k-DPPs. We give the first polynomial-time algorithm to sample exactly from DPPs under partition constraints, for any constant number of partitions. We complement this by a complexity-theoretic barrier that rules out such a result under general matroid constraints. Our experiments indicate that partition-constrained DPPs offer more flexibility and more diversity than k-DPPs and their naive extensions, while being reasonably efficient in running time. We also show that a simple greedy initialization followed by local search gives improved approximation guarantees for the problem of MAP inference from k-DPPs on well-conditioned kernels. Our experiments show that this improvement is significant for larger values of k, supporting our theoretical result.
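The greedy-then-local-search idea for k-DPP MAP inference can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact procedure: the greedy step adds the item that most increases log det(L_S), and the local search applies single-item swaps until none improves the objective by more than `tol`. The function names and the `tol` parameter are hypothetical.

```python
import numpy as np

def logdet(L, S):
    """log det of the principal submatrix L_S; -inf if it is not positive."""
    sign, val = np.linalg.slogdet(L[np.ix_(S, S)])
    return val if sign > 0 else -np.inf

def greedy_init(L, k):
    """Build a size-k subset by repeatedly adding the best single item."""
    S = []
    for _ in range(k):
        rest = [i for i in range(L.shape[0]) if i not in S]
        S.append(max(rest, key=lambda i: logdet(L, S + [i])))
    return S

def local_search(L, S, tol=1e-9):
    """Swap one selected item for one unselected item while log det improves."""
    S = list(S)
    best = logdet(L, S)
    improved = True
    while improved:
        improved = False
        for pos in range(len(S)):
            for j in range(L.shape[0]):
                if j in S:
                    continue
                cand = S[:pos] + [j] + S[pos + 1:]
                val = logdet(L, cand)
                if val > best + tol:
                    S, best, improved = cand, val, True
    return sorted(S)
```

A typical call would be `local_search(L, greedy_init(L, k))` for a positive semidefinite kernel `L`. Each accepted swap strictly increases the objective, so the loop terminates; the result is a local optimum under single swaps, not necessarily the global MAP subset.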

