Random Forest for Dissimilarity-based Multi-view Learning
This work addresses multi-view learning problems where data have heterogeneous descriptions, offering an incremental improvement in view combination strategies.
The paper tackles multi-view classification by using Random Forest proximity to build dissimilarity representations and introduces Dynamic View Selection to combine only the most relevant views per instance, achieving significant performance improvements over baseline methods on real-world datasets.
Many classification problems are naturally multi-view in the sense that their data are described through multiple heterogeneous descriptions. For such tasks, dissimilarity strategies are an effective way to make the different descriptions comparable and easy to merge, by (i) building an intermediate dissimilarity representation for each view and (ii) fusing these representations by averaging the dissimilarities over the views. In this work, we show that the Random Forest proximity measure can be used to build the dissimilarity representations, since this measure reflects not only similarities between features but also class membership. We then propose a Dynamic View Selection method to better combine the view-specific dissimilarity representations. This makes it possible to decide on each instance to predict using only the most relevant views for that instance. Experiments conducted on several real-world multi-view datasets show that Dynamic View Selection offers a significant improvement in performance over the simple average combination and two state-of-the-art static view combinations.
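
The two-step dissimilarity strategy, (i) one dissimilarity representation per view and (ii) averaging over views, can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the proximity of two instances is the fraction of trees in which they reach the same leaf, the dissimilarity is one minus that proximity, and each view's forest has already been trained and reduced to a table of leaf indices. The function names and the toy leaf tables are hypothetical, and the sketch covers only the simple average baseline, not Dynamic View Selection.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def rf_dissimilarity(leaves: Sequence[Sequence[int]]) -> Matrix:
    """Dissimilarity matrix for one view.

    leaves[i][t] is the index of the leaf reached by instance i in tree t
    (assumed to come from a forest trained on this view).
    Assumption: d(i, j) = 1 - (fraction of trees where i and j share a leaf).
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(a == b for a, b in zip(leaves[i], leaves[j]))
            d[i][j] = d[j][i] = 1.0 - shared / n_trees
    return d


def average_views(matrices: Sequence[Matrix]) -> Matrix:
    """Fuse view-specific dissimilarity matrices by element-wise averaging."""
    n_views = len(matrices)
    n = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / n_views for j in range(n)]
        for i in range(n)
    ]


# Hypothetical leaf tables: 3 instances, 4 trees per view.
view_a_leaves = [[0, 2, 1, 3], [0, 2, 1, 0], [4, 5, 6, 7]]
view_b_leaves = [[1, 1, 1, 1], [2, 1, 2, 1], [2, 3, 2, 4]]

d_a = rf_dissimilarity(view_a_leaves)
d_b = rf_dissimilarity(view_b_leaves)
d_avg = average_views([d_a, d_b])
```

Because every view is mapped to the same instance-by-instance dissimilarity space, the matrices have identical shapes regardless of how many features each view has, which is what makes the averaging step well defined.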