Approximation Algorithms for Cascading Prediction Models
This work addresses efficiency concerns for users deploying machine learning models in resource-constrained environments, but the contribution is incremental: it builds on existing pre-trained models and established cascading techniques.
The paper tackles the problem of reducing the average-case computational cost of cascaded prediction models while keeping accuracy similar, achieving up to a 2x reduction in floating point multiplications and up to a 6x reduction in average-case memory I/O for ImageNet classification.
We present an approximation algorithm that takes a pool of pre-trained models as input and produces from it a cascaded model with similar accuracy but lower average-case cost. Applied to state-of-the-art ImageNet classification models, this yields up to a 2x reduction in floating point multiplications, and up to a 6x reduction in average-case memory I/O. The auto-generated cascades exhibit intuitive properties, such as using lower-resolution input for easier images and requiring higher prediction confidence when using a computationally cheaper model.
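To make the idea of a cascade concrete, the following is a minimal sketch of cascade inference only. It is not the paper's approximation algorithm for constructing the cascade. The stage format, the toy models, the costs, and the confidence thresholds are all hypothetical values chosen for illustration. Each stage is tried in order of increasing cost, and evaluation stops at the first stage whose top predicted probability meets that stage's threshold; the final stage always answers.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stage format: (predict_fn, cost, confidence_threshold).
# predict_fn maps an input to a list of class probabilities.
Stage = Tuple[Callable[[str], Sequence[float]], float, float]


def cascade_predict(stages: List[Stage], x: str) -> Tuple[int, float]:
    """Return (predicted_label, total_cost) for input x.

    Stages run in order; the first stage whose top probability reaches
    its threshold answers. The last stage answers unconditionally.
    """
    total_cost = 0.0
    for i, (predict, cost, threshold) in enumerate(stages):
        probs = list(predict(x))
        total_cost += cost
        label = max(range(len(probs)), key=probs.__getitem__)
        is_last = i == len(stages) - 1
        if is_last or probs[label] >= threshold:
            return label, total_cost
    raise ValueError("cascade must contain at least one stage")


# Toy stand-ins for pre-trained models (assumptions, not real models).
def cheap_model(x: str) -> List[float]:
    # Confident on "easy" inputs, unsure on everything else.
    return [0.9, 0.1] if x == "easy" else [0.55, 0.45]


def expensive_model(x: str) -> List[float]:
    return [0.95, 0.05] if x == "easy" else [0.2, 0.8]


# The cheap stage demands high confidence (0.8) before it may answer.
stages: List[Stage] = [
    (cheap_model, 1.0, 0.8),
    (expensive_model, 10.0, 0.0),
]

easy_result = cascade_predict(stages, "easy")
hard_result = cascade_predict(stages, "hard")
average_cost = (easy_result[1] + hard_result[1]) / 2
```

In this toy setup the easy input is answered by the cheap stage alone, while the hard input falls through to the expensive stage and pays for both. Averaged over the two inputs, the cost is lower than always running the expensive model, which is the kind of average-case saving the abstract describes.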