How many samples to label for an application given a foundation model? Chest X-ray classification study
This work addresses the high cost of annotation in medical imaging. It helps practitioners minimize labeling effort by estimating how many labeled samples are needed to reach a target performance. The contribution is incremental, since it builds on existing foundation models.
The study asks how many labeled samples chest X-ray classification requires when starting from a foundation model. It finds that XrayCLIP and XraySigLIP achieve strong performance with significantly fewer labeled examples than a ResNet-50 baseline, and that learning-curve slopes estimated from just 50 labeled cases accurately forecast the final performance plateau.
Chest X-ray classification is vital yet resource-intensive, typically demanding extensive annotated data for accurate diagnosis. Foundation models mitigate this reliance, but how many labeled samples are required remains unclear. We systematically evaluate the use of power-law fits to predict the training size necessary for specific ROC-AUC thresholds. Testing multiple pathologies and foundation models, we find XrayCLIP and XraySigLIP achieve strong performance with significantly fewer labeled examples than a ResNet-50 baseline. Importantly, learning curve slopes from just 50 labeled cases accurately forecast final performance plateaus. Our results enable practitioners to minimize annotation costs by labeling only the essential samples for targeted performance.
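As an illustration of how a power-law fit can turn a few small-sample measurements into a labeling budget, here is a minimal sketch. It assumes the gap to a fixed ceiling follows a pure power law, `ceiling - AUC(n) = a * n**(-b)`, fitted by least squares in log-log space. The abstract does not give the exact functional form or fitting procedure, so this form, the function names (`fit_power_law`, `predict_auc`, `samples_for_target`), the `ceiling` parameter, and the synthetic data points are all assumptions made for the example, not the authors' method or results.

```python
import math


def fit_power_law(sizes, aucs, ceiling=1.0):
    """Fit ceiling - AUC(n) = a * n**(-b) by linear least squares in log-log space.

    Assumed model, not taken from the paper. Returns (a, b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(ceiling - auc) for auc in aucs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # learning-curve slope in log-log space
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_auc(n, a, b, ceiling=1.0):
    """Extrapolate the fitted curve to a training-set size n."""
    return ceiling - a * n ** (-b)


def samples_for_target(target_auc, a, b, ceiling=1.0):
    """Invert the fitted curve: smallest n whose predicted AUC reaches target_auc."""
    gap = ceiling - target_auc
    if gap <= 0 or b <= 0:
        raise ValueError("target is unreachable under the fitted curve")
    return math.ceil((a / gap) ** (1.0 / b))


# Synthetic learning curve (made-up numbers) measured at up to 50 labeled cases.
sizes = [10, 20, 30, 40, 50]
aucs = [1.0 - 0.5 * n ** -0.4 for n in sizes]

a, b = fit_power_law(sizes, aucs)
needed = samples_for_target(0.90, a, b)
print(f"a={a:.3f}, b={b:.3f}, labeled samples for ROC-AUC 0.90: {needed}")
```

With real measurements the points will be noisy, so repeating the fit over several random subsets and reporting a range for the required sample size is likely more informative than a single number. A free ceiling parameter (fitted rather than fixed at 1.0) would need a nonlinear optimizer, which this sketch deliberately avoids.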