LG, ML
Feb 14, 2012

Active Semi-Supervised Learning using Submodular Functions

arXiv:1202.3726v1
42 citations

Originality: Incremental advance

AI Analysis

This work addresses active semi-supervised learning, offering a flexible framework with theoretical guarantees. It is an incremental advance because it generalizes an existing error bound rather than introducing a new one.

The paper considers active semi-supervised learning in an offline transductive setting. It generalizes a previously proposed error bound for active learning on graphs by replacing graph cut with an arbitrary symmetric submodular function, shows that minimizing the bound exactly is NP-complete, and provides a method that minimizes it approximately. Experiments on real data support the theoretical results.

We consider active, semi-supervised learning in an offline transductive setting. We show that a previously proposed error bound for active learning on undirected weighted graphs can be generalized by replacing graph cut with an arbitrary symmetric submodular function. Arbitrary non-symmetric submodular functions can be used via symmetrization. Different choices of submodular functions give different versions of the error bound that are appropriate for different kinds of problems. Moreover, the bound is deterministic and holds for adversarially chosen labels. We show exactly minimizing this error bound is NP-complete. However, we also introduce for any submodular function an associated active semi-supervised learning method that approximately minimizes the corresponding error bound. We show that the error bound is tight in the sense that there is no other bound of the same form which is better. Our theoretical results are supported by experiments on real data.
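To make the terms in the abstract concrete, the sketch below builds a small graph cut function, which is symmetric and submodular, and then symmetrizes a non-symmetric coverage function. It is an illustration only, not the authors' code: the toy graph, the `covers` sets, the helper names, and the particular symmetrization g(S) = f(S) + f(V \ S) - f(V) are all assumptions chosen for the example, and the paper's exact construction may differ. Symmetry and submodularity are checked by brute force over all subsets, which is feasible only for tiny ground sets.

```python
from itertools import combinations


def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = sorted(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def make_cut(weights):
    """Graph cut: total weight of edges with exactly one endpoint in S."""
    def cut(S):
        return sum(w for (u, v), w in weights.items() if (u in S) != (v in S))
    return cut


def symmetrize(f, ground):
    """Assumed symmetrization: g(S) = f(S) + f(V \\ S) - f(V)."""
    ground = frozenset(ground)
    f_ground = f(ground)

    def g(S):
        S = frozenset(S)
        return f(S) + f(ground - S) - f_ground
    return g


def is_symmetric(f, ground, tol=1e-9):
    """Check f(S) == f(V \\ S) for every subset S."""
    ground = frozenset(ground)
    return all(abs(f(S) - f(ground - S)) <= tol for S in powerset(ground))


def is_submodular(f, ground, tol=1e-9):
    """Check f(A) + f(B) >= f(A | B) + f(A & B) for all pairs of subsets."""
    subsets = list(powerset(ground))
    return all(
        f(A) + f(B) + tol >= f(A | B) + f(A & B)
        for A in subsets
        for B in subsets
    )


# Hypothetical toy data: a 4-node weighted graph and a set-cover instance.
ground = frozenset({0, 1, 2, 3})
weights = {(0, 1): 1.0, (1, 2): 2.0, (2, 3): 1.0, (0, 3): 0.5}
covers = {0: {"a", "b"}, 1: {"b", "c"}, 2: {"c"}, 3: {"d"}}

cut = make_cut(weights)


def coverage(S):
    """Number of distinct items covered by the nodes in S (not symmetric)."""
    return len(set().union(*(covers[i] for i in S)))


sym_coverage = symmetrize(coverage, ground)

print(is_symmetric(cut, ground), is_submodular(cut, ground))
print(is_symmetric(coverage, ground), is_submodular(coverage, ground))
print(is_symmetric(sym_coverage, ground), is_submodular(sym_coverage, ground))
```

In this toy instance the coverage function is submodular but not symmetric, while its symmetrized version passes both checks, so it could stand in for graph cut in a bound of the kind the abstract describes.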
