CV · Jul 2, 2018

Active Testing: An Efficient and Robust Framework for Estimating Accuracy

Phuc Nguyen, Deva Ramanan, Charless Fowlkes

arXiv:1807.00493v17.819 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of scaling up evaluation for computer vision tasks such as multi-label classification and instance segmentation, offering a more efficient alternative to current evaluation protocols.

The paper tackles the problem of efficiently evaluating models on large-scale datasets with noisy labels by introducing an active testing framework that queries users to vet annotations, achieving significant savings in human annotation effort and improved robustness over existing protocols.

Much recent work on visual recognition aims to scale up learning to massive, noisily-annotated datasets. We address the problem of scaling up the evaluation of such models to large-scale datasets with noisy labels. Current protocols for doing so require a human user to either vet (re-annotate) a small fraction of the test set and ignore the rest, or else correct errors in annotation as they are found through manual inspection of results. In this work, we re-formulate the problem as one of active testing, and examine strategies for efficiently querying a user so as to obtain an accurate performance estimate with minimal vetting. We demonstrate the effectiveness of our proposed active testing framework on estimating two performance metrics, Precision@K and mean Average Precision, for two popular computer vision tasks, multi-label classification and instance segmentation. We further show that our approach is able to save significant human annotation effort and is more robust than alternative evaluation protocols.
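
To make the setting concrete, the following is a minimal Python sketch of estimating Precision@K from noisy labels with a limited vetting budget. It is an illustration only, not the paper's estimator or query strategy: the function names, the `oracle` callback (standing in for a human annotator), and the naive rule of vetting the highest-ranked items first are all assumptions introduced here.

```python
from typing import Callable, Sequence


def top_k_indices(scores: Sequence[float], k: int) -> list:
    """Indices of the k highest-scoring items, best first."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of the k highest-scoring items whose label is positive (1)."""
    return sum(labels[i] for i in top_k_indices(scores, k)) / k


def vetted_precision_at_k(
    scores: Sequence[float],
    noisy_labels: Sequence[int],
    oracle: Callable[[int], int],  # hypothetical stand-in for a human vetting item i
    k: int,
    budget: int,
) -> float:
    """Precision@K where up to `budget` of the top-k labels are vetted.

    Assumption: items are vetted in rank order (a naive placeholder strategy).
    Unvetted items keep their noisy label.
    """
    top = top_k_indices(scores, k)
    labels = list(noisy_labels)
    for i in top[:budget]:
        labels[i] = oracle(i)  # replace the noisy label with the vetted one
    return sum(labels[i] for i in top) / k
```

With a budget of zero the estimate reduces to Precision@K on the noisy labels, and with a budget of at least K it equals Precision@K on fully vetted labels. The question the abstract raises is which items to vet in between so that the estimate is accurate with as few queries as possible.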
