LG · Nov 15, 2024

On the Cost of Model-Serving Frameworks: An Experimental Evaluation

arXiv:2411.10337v1 · 3 citations · h-index: 23 · 2024 IEEE International Conference on Cloud Engineering (IC2E)
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of selecting efficient model-serving frameworks for practitioners deploying ML models in production, but it is incremental as it compares existing tools without introducing new methods.

The paper evaluates the performance of five model-serving frameworks across four scenarios, finding that TensorFlow Serving outperforms the others for deep learning models and that DL-specific frameworks have significantly lower latencies than general-purpose ones.

In machine learning (ML), inference is the process of applying pre-trained models to new, unseen data in order to make predictions. During inference, end users interact with ML services to obtain insights, recommendations, or actions based on the input data. For this reason, serving strategies are crucial for deploying and managing models effectively in production environments. These strategies ensure that models are available, scalable, reliable, and performant for real-world applications such as time series forecasting, image classification, and natural language processing. In this paper, we evaluate the performance of five widely used model-serving frameworks (TensorFlow Serving, TorchServe, MLServer, MLflow, and BentoML) under four different scenarios (malware detection, cryptocurrency price forecasting, image classification, and sentiment analysis). We demonstrate that TensorFlow Serving outperforms all the other frameworks in serving deep learning (DL) models. Moreover, we show that the DL-specific frameworks (TensorFlow Serving and TorchServe) display significantly lower latencies than the three general-purpose ML frameworks (BentoML, MLflow, and MLServer).

