Fast Inference of Tree Ensembles on ARM Devices
This work addresses the need for fast inference of tree ensembles on ARM-based IoT devices, although the contribution is incremental, since it adapts existing methods to a new hardware context.
The paper tackles the problem of running tree ensemble models efficiently on ARM devices, which are common in IoT. It adapts the QuickScorer algorithm to ARM's NEON instruction set and extends it to classification models such as Random Forests. The result is a speed-up of up to 9.4 times over a reference implementation, and quantized models run faster with minimal impact on accuracy.
With the ongoing integration of Machine Learning models into everyday life, e.g. in the form of the Internet of Things (IoT), the evaluation of learned models becomes an increasingly important issue. Tree ensembles are one of the best black-box classifiers available and routinely outperform more complex classifiers. While the fast application of tree ensembles has already been studied in the literature for Intel CPUs, it has not yet been studied in the context of ARM CPUs, which are more dominant for IoT applications. In this paper, we first convert the popular QuickScorer algorithm and its siblings from Intel's AVX to ARM's NEON instruction set. Second, we extend our implementation from ranking models to classification models such as Random Forests. Third, we investigate the effects of using fixed-point quantization in Random Forests. Our study shows that a careful implementation of tree traversal on ARM CPUs leads to a speed-up of up to a factor of 9.4 compared to a reference implementation. Moreover, quantized models seem to outperform models using floating-point values in terms of speed in almost all cases, with a negligible impact on the predictive performance of the model. Finally, our study highlights architectural differences between ARM and Intel CPUs and between different ARM devices, which imply that the best implementation depends on both the specific forest and the specific device used for deployment.
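To make the two central ideas of the abstract more concrete, the following is a minimal scalar sketch of QuickScorer-style bitvector traversal on a single toy tree, followed by the same traversal on fixed-point thresholds. It is not the paper's NEON implementation: the toy tree, the 8 fractional bits, and all names (`quickscorer_leaf`, `naive_leaf`, `to_fixed`, `quantize_nodes`) are assumptions made for illustration, and a real implementation would process many trees at once with SIMD instructions.

```python
# Illustrative sketch only: scalar Python, not the paper's NEON implementation.
# The toy tree and all names below are hypothetical.
#
# Toy tree with 4 leaves, numbered 0..3 from left to right:
#
#              x[0] <= 0.5
#             /           \
#      x[1] <= 0.3     x[1] <= 0.7
#       /      \        /      \
#     L0       L1     L2       L3
#
# Leaf i is bit i of the bitvector. Each node's mask has 0s for the leaves of
# its LEFT subtree: those leaves become unreachable when the test is false.

N_LEAVES = 4

# feature index -> list of (threshold, mask), sorted by ascending threshold
NODES_BY_FEATURE = {
    0: [(0.5, 0b1100)],                 # root: a false test removes L0 and L1
    1: [(0.3, 0b1110), (0.7, 0b1011)],  # left child removes L0, right child removes L2
}


def quickscorer_leaf(nodes_by_feature, n_leaves, x):
    """Return the exit leaf index using feature-wise bitvector masking."""
    v = (1 << n_leaves) - 1  # all leaves start as candidates
    for feature, nodes in nodes_by_feature.items():
        for threshold, mask in nodes:
            if x[feature] <= threshold:
                break  # thresholds are sorted, so all later tests are true too
            v &= mask  # test is false: drop the leaves of the left subtree
    return (v & -v).bit_length() - 1  # index of the lowest set bit


def naive_leaf(x):
    """Reference root-to-leaf traversal of the same toy tree."""
    if x[0] <= 0.5:
        return 0 if x[1] <= 0.3 else 1
    return 2 if x[1] <= 0.7 else 3


def to_fixed(value, frac_bits=8):
    """Quantize a float to a fixed-point integer with frac_bits fractional bits."""
    return round(value * (1 << frac_bits))


def quantize_nodes(nodes_by_feature, frac_bits=8):
    """Replace every float threshold by its fixed-point counterpart."""
    return {
        feature: [(to_fixed(t, frac_bits), mask) for t, mask in nodes]
        for feature, nodes in nodes_by_feature.items()
    }


QUANTIZED_NODES = quantize_nodes(NODES_BY_FEATURE)
SAMPLES = [[a, b] for a in (0.1, 0.4, 0.6, 0.9) for b in (0.1, 0.2, 0.5, 0.9)]

for x in SAMPLES:
    leaf_float = quickscorer_leaf(NODES_BY_FEATURE, N_LEAVES, x)
    leaf_fixed = quickscorer_leaf(QUANTIZED_NODES, N_LEAVES, [to_fixed(v) for v in x])
    print(x, naive_leaf(x), leaf_float, leaf_fixed)
```

On these samples, which were chosen away from the thresholds, the three columns agree. Inputs that lie very close to a threshold can land in a different leaf after rounding, which is why quantization is described as having a small rather than zero effect on predictions.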