CL Jan 17, 2023

Which Model Shall I Choose? Cost/Quality Trade-offs for Text Classification Tasks

Shi Zong, Josh Seltzer, Jiahua Pan, Kathy Cheng, Jimmy Lin

arXiv:2301.07006v11.74 citations h-index: 87

Originality Synthesis-oriented

AI Analysis

This work addresses a practical deployment challenge for industry practitioners in text classification, though it is incremental as it applies existing cost analysis methods to this specific domain.

The paper tackles the problem of selecting optimal models for text classification deployment by quantitatively analyzing the trade-offs between classification accuracy and various costs (annotation, training, inference). It evaluates multiple models, including large language models, to provide guidance for scenarios like high-volume inference.

Industry practitioners always face the problem of choosing the appropriate model for deployment under different considerations, such as to maximize a metric that is crucial for production, or to reduce the total cost given financial concerns. In this work, we focus on the text classification task and present a quantitative analysis for this challenge. Using classification accuracy as the main metric, we evaluate the classifiers' performances for a variety of models, including large language models, along with their associated costs, including the annotation cost, training (fine-tuning) cost, and inference cost. We then discuss the model choices for situations like having a large number of samples needed for inference. We hope our work will help people better understand the cost/quality trade-offs for the text classification task.
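
The cost components named in the abstract (annotation, training, and inference) can be combined into a simple total-cost comparison. The following is a minimal sketch, not the paper's method: the `ModelCost` class, the `break_even` helper, and every dollar figure are hypothetical and chosen only for illustration. It assumes annotation and training are one-time costs and that inference cost grows linearly with the number of samples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelCost:
    """Hypothetical cost profile for one candidate classifier (all values in dollars)."""
    name: str
    annotation: float     # one-time cost of labeling training data
    training: float       # one-time fine-tuning cost
    per_inference: float  # cost of classifying a single sample

    def fixed(self) -> float:
        """One-time costs paid before any sample is classified."""
        return self.annotation + self.training

    def total(self, n_samples: int) -> float:
        """Total cost of deploying this model on n_samples inputs."""
        return self.fixed() + self.per_inference * n_samples


def break_even(a: ModelCost, b: ModelCost) -> Optional[float]:
    """Number of samples at which a and b cost the same.

    Returns None when the per-sample costs are equal (the cost lines never
    cross) or when the crossing point would be at a negative sample count.
    """
    slope_gap = b.per_inference - a.per_inference
    if slope_gap == 0:
        return None
    n = (a.fixed() - b.fixed()) / slope_gap
    return n if n >= 0 else None


# Illustrative numbers only; they are NOT taken from the paper.
fine_tuned = ModelCost("fine-tuned small model", annotation=500.0,
                       training=100.0, per_inference=0.0001)
prompted_llm = ModelCost("prompted large language model", annotation=0.0,
                         training=0.0, per_inference=0.002)

crossover = break_even(fine_tuned, prompted_llm)
```

Under these made-up figures, the model with high one-time costs becomes cheaper only once the number of inference samples exceeds `crossover`, which mirrors the high-volume inference scenario the abstract mentions. Accuracy is left out of the sketch; in practice it would be weighed against the cost totals.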
