LG MLJun 2, 2021

Partial Wasserstein Covering

Keisuke Kawano, Satoshi Koide, Keisuke Otaki

arXiv:2106.00886v35.54 citations

Originality Incremental advance

AI Analysis

This work addresses dataset coverage gaps for applications like autonomous driving, though it is incremental as it builds on existing optimal transport methods.

The paper tackles the problem of identifying missing patterns in a dataset compared to another by formulating it as partial Wasserstein covering, an NP-hard discrete optimization problem, and achieves a 0.63 approximation guarantee using a greedy algorithm, with experimental validation on real driving scenes datasets.

We consider a general task called partial Wasserstein covering with the goal of providing information on what patterns are not being taken into account in a dataset (e.g., dataset used during development) compared with another dataset(e.g., dataset obtained from actual applications). We model this task as a discrete optimization problem with partial Wasserstein divergence as an objective function. Although this problem is NP-hard, we prove that it satisfies the submodular property, allowing us to use a greedy algorithm with a 0.63 approximation. However, the greedy algorithm is still inefficient because it requires solving linear programming for each objective function evaluation. To overcome this inefficiency, we propose quasi-greedy algorithms that consist of a series of acceleration techniques, such as sensitivity analysis based on strong duality and the so-called C-transform in the optimal transport field. Experimentally, we demonstrate that we can efficiently fill in the gaps between the two datasets and find missing scene in real driving scenes datasets.

View on arXiv PDF

Similar