LG Feb 21, 2018

Active Learning with Partial Feedback

Peiyun Hu, Zachary C. Lipton, Anima Anandkumar, Deva Ramanan

arXiv:1802.07427v416.572 citations, Has Code

Originality: Highly original

AI Analysis

This paper addresses annotation cost in multiclass classification, offering practitioners a more efficient labeling approach than i.i.d. sampling or standard active learning.

The paper tackles active learning in a realistic setting where annotators answer binary yes/no questions instead of supplying class labels directly. It proposes Active Learning with Partial Feedback (ALPF), in which the learner actively chooses both the example and the binary question to ask. Experiments on Tiny ImageNet show a 26% relative improvement in top-1 accuracy over baselines when given 30% of the naive annotation budget, and a 42% lower cost for full annotation.

While many active learning papers assume that the learner can simply ask for a label and receive it, real annotation often presents a mismatch between the form of a label (say, one among many classes) and the form of an annotation (typically yes/no binary feedback). To annotate example corpora for multiclass classification, we might need to ask multiple yes/no questions, exploiting a label hierarchy if one is available. To address this more realistic setting, we propose active learning with partial feedback (ALPF), where the learner must actively choose both which example to label and which binary question to ask. At each step, the learner selects an example and asks whether it belongs to a chosen (possibly composite) class. Each answer eliminates some classes, leaving the learner with a partial label. The learner may then either ask more questions about the same example (until an exact label is uncovered) or move on immediately, leaving the first example partially labeled. Active learning with partial labels requires (i) a sampling strategy to choose (example, class) pairs, and (ii) learning from partial labels between rounds. Experiments on Tiny ImageNet demonstrate that our most effective method improves top-1 classification accuracy by 26% (relative) compared to i.i.d. baselines and standard active learners, given 30% of the annotation budget that would be required (naively) to annotate the dataset. Moreover, ALPF-learners fully annotate Tiny ImageNet at 42% lower cost. Surprisingly, we observe that accounting for per-example annotation costs can alter the conventional wisdom that active learners should solicit labels for hard examples.
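
The question-and-elimination loop described above can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the sampling criterion or the training loss, so the "fewest expected remaining classes" rule and the negative-log-mass loss below are assumptions chosen for illustration, and all function names, the toy class set, and the toy probabilities are hypothetical.

```python
import math

def update_partial_label(candidates, question, is_yes):
    """Keep only the classes consistent with a yes/no answer."""
    return candidates & question if is_yes else candidates - question

def expected_remaining(probs, candidates, question):
    """Expected number of candidate classes left after asking `question`,
    using the model's probabilities renormalized over `candidates`.
    (Assumed selection criterion, not taken from the abstract.)"""
    total = sum(probs[c] for c in candidates)
    inside = candidates & question
    p_yes = sum(probs[c] for c in inside) / total
    return p_yes * len(inside) + (1 - p_yes) * len(candidates - question)

def choose_question(probs, candidates, questions):
    """Among questions that can eliminate at least one class,
    pick the one with the smallest expected remaining count."""
    useful = [q for q in questions if 0 < len(candidates & q) < len(candidates)]
    return min(useful, key=lambda q: expected_remaining(probs, candidates, q))

def partial_label_loss(probs, candidates):
    """Negative log of the probability mass on the remaining candidates.
    (Assumed way to learn from a partial label.)"""
    return -math.log(sum(probs[c] for c in candidates))

def annotate(true_label, probs, classes, questions):
    """Ask binary questions about one example until its exact label is known.
    The annotator is simulated: it answers yes iff the true label is in the
    queried (possibly composite) class."""
    candidates = frozenset(classes)
    n_questions = 0
    while len(candidates) > 1:
        q = choose_question(probs, candidates, questions)
        candidates = update_partial_label(candidates, q, true_label in q)
        n_questions += 1
    return candidates, n_questions

# Toy setup: four classes, two composite classes from a label hierarchy.
classes = ["cat", "dog", "car", "truck"]
questions = [frozenset({"cat", "dog"}), frozenset({"car", "truck"})]
questions += [frozenset({c}) for c in classes]
probs = {"cat": 0.6, "dog": 0.2, "car": 0.1, "truck": 0.1}

final, n_questions = annotate("dog", probs, classes, questions)
```

In ALPF proper the learner may stop after any answer and move on to another example, keeping the partial label for training; the loop above always continues to an exact label only to keep the sketch short.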
