LG · CV · Sep 30, 2024

Fine-tuning Vision Classifiers On A Budget

Amazon
arXiv:2410.00085v1 · h-index: 5
Originality: Incremental advance
AI Analysis

This work provides an incremental improvement for machine learning practitioners and researchers who need to fine-tune vision models efficiently when ground truth labels are scarce and multiple, potentially inaccurate, human labels are available.

This paper addresses the problem of fine-tuning vision classifiers with limited budgets by leveraging multiple labels from labelers of varying accuracy. By estimating true labels using a naive-Bayes model and prior labeler accuracy, the method, Ground Truth Extension (GTX), allows for labeling more data on a fixed budget without sacrificing label or fine-tuning quality, as demonstrated on an industrial image dataset.

Fine-tuning modern computer vision models requires accurately labeled data for which the ground truth may not exist, but a set of multiple labels can be obtained from labelers of variable accuracy. We tie the notion of label quality to confidence in labeler accuracy and show that, when prior estimates of labeler accuracy are available, using a simple naive-Bayes model to estimate the true labels allows us to label more data on a fixed budget without compromising label or fine-tuning quality. We present experiments on a dataset of industrial images that demonstrate that our method, called Ground Truth Extension (GTX), enables fine-tuning ML models using fewer human labels.
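
To make the naive-Bayes idea concrete, here is a minimal sketch of how labels from labelers with known prior accuracy could be combined into a posterior over the true class. It is an illustration under stated assumptions, not the paper's exact GTX formulation. It assumes labelers are conditionally independent given the true class, that a labeler with accuracy `a` spreads its errors uniformly over the other classes, and that the class prior is uniform unless one is supplied. The function names and the `confidence` threshold are hypothetical.

```python
import math


def posterior_over_classes(labels, accuracies, num_classes, prior=None):
    """Naive-Bayes posterior over the true class for one image.

    labels:      class index reported by each labeler
    accuracies:  prior accuracy estimate for each labeler, strictly in (0, 1)
    num_classes: number of classes (at least 2)
    prior:       optional class prior with strictly positive entries

    Assumption (not from the paper): a wrong labeler picks uniformly
    among the remaining num_classes - 1 classes.
    """
    if prior is None:
        prior = [1.0 / num_classes] * num_classes

    # Accumulate log-probabilities to avoid underflow with many labelers.
    log_post = [math.log(p) for p in prior]
    for label, acc in zip(labels, accuracies):
        wrong = (1.0 - acc) / (num_classes - 1)
        for c in range(num_classes):
            log_post[c] += math.log(acc if c == label else wrong)

    # Normalize with the max subtracted for numerical stability.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def needs_another_label(posterior, confidence=0.95):
    """Hypothetical stopping rule: request more labels until the top
    class reaches the desired confidence."""
    return max(posterior) < confidence


# Three labelers with prior accuracies 0.9, 0.8, 0.6 vote 0, 0, 1.
post = posterior_over_classes([0, 0, 1], [0.9, 0.8, 0.6], num_classes=2)
```

In this example the two more accurate labelers outweigh the dissenting one, so the posterior favors class 0 strongly enough that a 0.95 threshold would not request a further label. A budget saving of the kind the abstract describes would come from stopping early on such images and spending the remaining labels elsewhere.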

