CL LG

Mar 19, 2021

Cost-effective Deployment of BERT Models in Serverless Environment

Katarína Benešová, Andrej Švec, Marek Šuppa

arXiv:2103.10673v231.7729 citations

Originality: Synthesis-oriented

AI Analysis

This work offers a cost-effective way to deploy BERT models in serverless settings at small-to-medium scale. Its contribution is incremental: it applies existing techniques to a specific deployment challenge.

The study addresses the deployment of BERT models in serverless environments. Because the freely available pre-trained models are too large to deploy this way, the authors apply knowledge distillation and fine-tune the models on proprietary datasets for two tasks, sentiment analysis and semantic textual similarity. The resulting models reach latency acceptable for production use and are cost-effective for small-to-medium deployments.

In this study we demonstrate the viability of deploying BERT-style models to serverless environments in a production setting. Since the freely available pre-trained models are too large to be deployed in this way, we utilize knowledge distillation and fine-tune the models on proprietary datasets for two real-world tasks: sentiment analysis and semantic textual similarity. As a result, we obtain models that are tuned for a specific domain and deployable in serverless environments. The subsequent performance analysis shows that this solution results in latency levels acceptable for production use and that it is also a cost-effective approach for small-to-medium size deployments of BERT models, all without any infrastructure overhead.
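For readers unfamiliar with knowledge distillation, the following is a minimal sketch of a commonly used soft-target distillation loss, written in plain Python. It is not taken from the paper: the function names, the temperature of 2.0, and the weighting `alpha` are illustrative assumptions, and the authors' actual training objective may differ.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical helper: blend a soft-target term with a hard-label term.

    The soft term is the KL divergence from the teacher's softened
    distribution to the student's; the hard term is ordinary cross-entropy
    against the true label. Both temperature and alpha are assumed values.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    hard = -math.log(softmax(student_logits)[label])
    # The temperature**2 factor keeps the soft term's gradient scale
    # comparable to the hard term's as the temperature changes.
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


# A student that matches the teacher incurs a lower loss than one that
# contradicts it on the same example.
agree = distillation_loss([0.5, 2.0], [0.5, 2.0], label=1)
disagree = distillation_loss([2.0, 0.5], [0.5, 2.0], label=1)
print(agree < disagree)
```

In practice the small student model is trained to minimize a loss of this kind over the task data, which is how a model small enough for a serverless function can inherit much of a larger model's behavior.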
