ML, LG | Feb 13, 2021

Learning from Similarity-Confidence Data

arXiv:2102.06879v1 | 28 citations
Originality: Incremental advance
AI Analysis

This work addresses a novel weakly supervised learning problem aimed at reducing labeling costs. It appears incremental because it builds on existing risk estimation frameworks.

The paper tackles the problem of learning binary classifiers from unlabeled data pairs annotated with similarity confidence. It proposes an unbiased risk estimator whose estimation error bound achieves the optimal convergence rate, and adds a risk correction scheme to alleviate overfitting.

Weakly supervised learning has drawn considerable attention recently as a way to reduce the time and labor cost of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where we aim to learn an effective binary classifier from only unlabeled data pairs, each equipped with a confidence value that indicates their degree of similarity (two examples are similar if they belong to the same class). To solve this problem, we propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that its estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further apply a risk correction scheme to the proposed risk estimator. Experimental results demonstrate the effectiveness of the proposed methods.
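
To make the idea of a risk estimator computed from Sconf data more concrete, here is a minimal sketch of one way such an estimator can be written. It is an illustration under stated assumptions, not necessarily the exact formulation in the paper. It assumes that the two examples in a pair are drawn independently, that the class prior `prior_pos` is known and differs from 0.5, and that the confidence `s` is the probability that both examples share a class. Under those assumptions, averaging `s` over the partner example gives `prior_neg + p(+|x) * (prior_pos - prior_neg)`, so the usual classification risk can be reweighted using `s` alone. The names `sconf_risk` and `logistic_loss`, the choice of logistic loss, and the `correction` options (clamping each partial risk with `max(0, .)` or `abs`) are hypothetical choices made for this sketch.

```python
import math


def logistic_loss(margin):
    """Numerically stable log(1 + exp(-margin))."""
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def sconf_risk(scores_a, scores_b, conf, prior_pos, correction=None):
    """Empirical risk from similarity-confidence pairs (illustrative sketch).

    scores_a, scores_b: classifier outputs g(x), g(x') for each pair.
    conf: similarity confidence s(x, x') in [0, 1] for each pair.
    prior_pos: assumed-known positive class prior; must not equal 0.5.
    correction: None, "relu", or "abs" (assumed clamping of partial risks).
    """
    prior_neg = 1.0 - prior_pos
    gap = prior_pos - prior_neg
    if abs(gap) < 1e-12:
        raise ValueError("the reweighting is undefined when prior_pos == 0.5")

    n = len(conf)
    pos_part = 0.0  # estimates the risk contributed by positive examples
    neg_part = 0.0  # estimates the risk contributed by negative examples
    for g_a, g_b, s in zip(scores_a, scores_b, conf):
        # Symmetrize over the two examples of the pair.
        pos_part += (s - prior_neg) * (logistic_loss(g_a) + logistic_loss(g_b)) / 2.0
        neg_part += (prior_pos - s) * (logistic_loss(-g_a) + logistic_loss(-g_b)) / 2.0
    pos_part /= n * gap
    neg_part /= n * gap

    # Each partial risk is non-negative in expectation, but its empirical value
    # can go negative with flexible models; clamping is one simple remedy.
    if correction == "relu":
        pos_part, neg_part = max(pos_part, 0.0), max(neg_part, 0.0)
    elif correction == "abs":
        pos_part, neg_part = abs(pos_part), abs(neg_part)
    elif correction is not None:
        raise ValueError("correction must be None, 'relu', or 'abs'")
    return pos_part + neg_part
```

A quick sanity check on this sketch: a classifier that outputs 0 for every input has logistic loss log 2 on every example, and the reweighted estimate returns exactly log 2 regardless of the confidence values, which matches the true risk of that classifier.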

