Feature-Budgeted Random Forest
This addresses the challenge of cost-efficient machine learning deployment for applications where acquiring features during prediction is expensive, representing an incremental improvement over existing prediction-time algorithms.
The paper tackles the problem of reducing prediction-time costs by limiting feature acquisition, proposing a novel random forest algorithm that minimizes prediction error under a user-specified average feature budget. Empirically, it demonstrates superior accuracy-cost curves on benchmark datasets compared to state-of-the-art methods.
We seek decision rules for prediction-time cost reduction, where complete data is available for training, but during prediction-time, each feature can only be acquired for an additional cost. We propose a novel random forest algorithm to minimize prediction error for a user-specified {\it average} feature acquisition budget. While random forests yield strong generalization performance, they do not explicitly account for feature costs and furthermore require low correlation among trees, which amplifies costs. Our random forest grows trees with low acquisition cost and high strength based on greedy minimax cost-weighted-impurity splits. Theoretically, we establish near-optimal acquisition cost guarantees for our algorithm. Empirically, on a number of benchmark datasets we demonstrate superior accuracy-cost curves against state-of-the-art prediction-time algorithms.