CheapET-3: Cost-Efficient Use of Remote DNN Models
This work addresses cost efficiency for applications that rely on third-party DNN services, though the contribution is incremental because it builds on existing hybrid model approaches.
The paper tackles the high monetary cost of using large remote DNN models for predictions by proposing a client-side architecture that combines a small local DNN with a remote model, reducing prediction cost by up to 50% without accuracy loss.
On complex problems, state-of-the-art prediction accuracy of Deep Neural Networks (DNNs) can be achieved using very large-scale models consisting of billions of parameters. Such models can only be run on dedicated servers, typically provided by a third-party service, which leads to a substantial monetary cost for every prediction. We propose a new software architecture for client-side applications, where a small local DNN is used alongside a remote large-scale model, aiming to make easy predictions locally at negligible monetary cost while still leveraging the benefits of a large model for challenging inputs. In a proof of concept, we reduce prediction cost by up to 50% without negatively impacting system accuracy.
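The local-first routing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how an input is judged "easy", so this sketch assumes a simple maximum-probability threshold on the local model's output. The names `HybridPredictor`, `threshold`, and `cost_per_remote_call` are hypothetical, and both models are passed in as plain callables so that no particular remote API is implied.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class HybridPredictor:
    """Hypothetical local-first router; all names here are illustrative."""

    # Small local DNN: maps an input to a sequence of class probabilities.
    local_model: Callable[[object], Sequence[float]]
    # Large remote model: maps an input to a class index (billed per call).
    remote_model: Callable[[object], int]
    # ASSUMPTION: an input counts as "easy" when the local model's top
    # probability reaches this threshold. The source does not specify this rule.
    threshold: float = 0.9
    # ASSUMPTION: flat price per remote prediction, in arbitrary currency units.
    cost_per_remote_call: float = 0.001
    remote_calls: int = 0
    total_calls: int = 0

    def predict(self, x: object) -> int:
        self.total_calls += 1
        probs = self.local_model(x)
        if max(probs) >= self.threshold:
            # Confident local prediction: no remote cost is incurred.
            return max(range(len(probs)), key=lambda i: probs[i])
        # Low local confidence: fall back to the remote model and pay for it.
        self.remote_calls += 1
        return self.remote_model(x)

    def total_cost(self) -> float:
        """Money spent on remote predictions so far."""
        return self.remote_calls * self.cost_per_remote_call

    def savings(self) -> float:
        """Fraction of remote calls avoided, relative to always calling remote."""
        if self.total_calls == 0:
            return 0.0
        return 1.0 - self.remote_calls / self.total_calls
```

Under these assumptions, cost savings equal the fraction of inputs the local model handles on its own, so answering half of all inputs locally would correspond to a 50% reduction. Whether accuracy is preserved depends on how well the confidence criterion separates easy inputs from hard ones, which this sketch does not address.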