The Landmark Selection Method for Multiple Output Prediction
This paper addresses the challenge of high-dimensional output prediction in machine learning, offering an approach that improves accuracy in tasks such as multi-label classification.
The paper tackles the problem of conditional modeling with high-dimensional outputs by selecting a small subset of output dimensions and modeling them in two stages, achieving better performance than one-vs-all and other methods in multi-label classification and multivariate regression experiments.
Conditional modeling x \to y is a central problem in machine learning. A substantial research effort is devoted to such modeling when x is high dimensional. We consider, instead, the case of a high dimensional y, where x is either low dimensional or high dimensional. Our approach is based on selecting a small subset y_L of the dimensions of y, and then modeling (i) x \to y_L and (ii) y_L \to y. Composing these two models, we obtain a conditional model x \to y that possesses convenient statistical properties. Multi-label classification and multivariate regression experiments on several datasets show that this model outperforms the one vs. all approach as well as several sophisticated multiple output prediction methods.
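The two-stage composition described above can be illustrated with a minimal sketch for the multivariate regression case. It is not the paper's implementation. The abstract does not say how the landmark subset y_L is chosen or which model families are used, so the variance-based selection rule, the ridge-regularized linear models, the synthetic data, and the names fit_linear, fit_landmark_model, and predict are all assumptions made for illustration.

```python
import numpy as np

def fit_linear(A, B, ridge=1e-6):
    """Ridge least squares: W minimizing ||A W - B||^2 + ridge * ||W||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + ridge * np.eye(d), A.T @ B)

def fit_landmark_model(X, Y, k):
    # ASSUMPTION: pick the k highest-variance output columns as landmarks.
    # The source does not specify its selection rule; this is a placeholder.
    landmarks = np.argsort(Y.var(axis=0))[-k:]
    W1 = fit_linear(X, Y[:, landmarks])   # stage (i):  x   -> y_L
    W2 = fit_linear(Y[:, landmarks], Y)   # stage (ii): y_L -> y
    return landmarks, W1, W2

def predict(X, W1, W2):
    # Compose the two stages to obtain x -> y.
    return (X @ W1) @ W2

# Synthetic data (hypothetical): outputs share a k-dimensional structure.
rng = np.random.default_rng(0)
n, d, p, k = 200, 5, 30, 3
X = rng.normal(size=(n, d))
Z = X @ rng.normal(size=(d, k))
Y = Z @ rng.normal(size=(k, p)) + 0.01 * rng.normal(size=(n, p))

landmarks, W1, W2 = fit_landmark_model(X, Y, k)
Y_hat = predict(X, W1, W2)
```

In this sketch only k output columns are predicted directly from x, and the remaining columns are reconstructed from those k. That is the structural idea in the abstract. Any comparison against one vs. all would need the paper's actual selection procedure and models.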