Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks
This work addresses the need for more comprehensive benchmarking in deep learning, though it is incremental as it builds on existing Pareto frontier concepts.
The authors tackle the problem of objectively evaluating deep learning models by proposing a multi-dimensional Pareto frontier method that incorporates variables such as training cost, inference latency, and accuracy, along with a random version to handle uncertainty. The method is applied to ImageNet models to rank them by efficiency across different hardware.
Performance optimization of deep learning models is conducted manually, through automatic architecture search, or by a combination of both. The resulting performance also depends strongly on the target hardware and on how successfully the models were trained. We propose to use a multi-dimensional Pareto frontier to redefine the efficiency measure of candidate deep learning models, where several variables, such as training cost, inference latency, and accuracy, jointly determine whether a model is dominant. Furthermore, a random version of the multi-dimensional Pareto frontier is introduced to mitigate the uncertainty in the accuracy, latency, and throughput of deep learning models across different experimental setups. These two complementary methods can be combined to perform objective benchmarking of deep learning models. Our proposed method is applied to a wide range of deep image classification models trained on ImageNet data. It combines competing variables of a stochastic nature into a single relative efficiency measure. This allows ranking deep learning models that run efficiently on different hardware, and objectively combining inference efficiency with training efficiency.
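To make the idea of a dominant model concrete, the following is a minimal sketch of a three-dimensional Pareto frontier, together with one possible randomized variant. It is not the authors' implementation: the model names, the measurement values, the helper functions (`to_costs`, `dominates`, `pareto_frontier`, `frontier_frequency`), and the choice of multiplicative Gaussian noise are all assumptions made for illustration. The abstract does not specify how the random version is constructed.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical measurements, for illustration only (not results from the paper):
# (training cost in GPU-hours, inference latency in ms, top-1 accuracy).
Measurement = Tuple[float, float, float]
MODELS: Dict[str, Measurement] = {
    "model_a": (100.0, 5.0, 0.76),
    "model_b": (200.0, 8.0, 0.80),
    "model_c": (250.0, 9.0, 0.79),  # worse than model_b on every axis
    "model_d": (50.0, 3.0, 0.70),
}

def to_costs(m: Measurement) -> Measurement:
    """Put every variable in 'lower is better' form by negating accuracy."""
    train_cost, latency, accuracy = m
    return (train_cost, latency, -accuracy)

def dominates(a: Measurement, b: Measurement) -> bool:
    """a dominates b if it is no worse in every dimension and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_frontier(models: Dict[str, Measurement]) -> List[str]:
    """Return the sorted names of models that no other model dominates."""
    costs = {name: to_costs(m) for name, m in models.items()}
    return sorted(
        name
        for name, c in costs.items()
        if not any(dominates(other, c) for o, other in costs.items() if o != name)
    )

def frontier_frequency(
    models: Dict[str, Measurement],
    rel_noise: float = 0.05,  # assumed relative measurement noise
    trials: int = 1000,
    seed: int = 0,
) -> Dict[str, float]:
    """Assumed randomized variant: perturb every measurement with multiplicative
    Gaussian noise and record how often each model lands on the frontier."""
    rng = random.Random(seed)
    counts = {name: 0 for name in models}
    for _ in range(trials):
        noisy = {
            name: tuple(v * (1.0 + rng.gauss(0.0, rel_noise)) for v in m)
            for name, m in models.items()
        }
        for name in pareto_frontier(noisy):
            counts[name] += 1
    return {name: counts[name] / trials for name in models}

if __name__ == "__main__":
    print(pareto_frontier(MODELS))
    print(frontier_frequency(MODELS))
```

In this sketch, the deterministic frontier excludes `model_c` because `model_b` is at least as good on all three variables. The frequency returned by `frontier_frequency` is one way to turn noisy measurements into a single relative score per model; other noise models or aggregation rules would be equally plausible readings of the abstract.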