Open Set Recognition for Random Forest
This work addresses incomplete training data in real-world classification tasks by enabling random forest classifiers to handle unknown classes. The contribution is incremental, since it builds on existing distance-based open-set recognition methods.
The paper tackles open-set recognition for random forest classifiers, which typically fail to identify samples from unknown classes. It proposes an approach that combines distance metric learning with distance-based open-set recognition, and it reports outperforming state-of-the-art distance-based open-set recognition methods on synthetic and real-world datasets.
In many real-world classification or recognition tasks, it is often difficult to collect training examples that cover all possible classes, for example because of incomplete knowledge during training or ever-changing regimes. Samples from unknown or novel classes may therefore be encountered in testing or deployment. In such scenarios, a classifier should be able to i) classify samples from known classes and, at the same time, ii) identify samples from unknown classes. This is known as open-set recognition. Although random forest has been an extremely successful framework as a general-purpose classification (and regression) method, in practice it usually operates under the closed-set assumption and cannot identify samples from new classes when used out of the box. In this work, we propose a novel approach that enables open-set recognition for random forest classifiers by incorporating distance metric learning and distance-based open-set recognition. The proposed method is validated on both synthetic and real-world datasets. The experimental results indicate that the proposed approach outperforms state-of-the-art distance-based open-set recognition methods.
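The two requirements of open-set recognition can be illustrated with a minimal sketch of a generic distance-based rejection rule. This is not the method proposed in the paper: it assumes plain Euclidean distance to class centroids as a stand-in for a learned distance metric, and the names `class_centroids`, `predict_open_set`, `UNKNOWN`, and the `threshold` value are hypothetical choices made only for illustration.

```python
import math

UNKNOWN = "unknown"  # hypothetical label for rejected samples


def class_centroids(X, y):
    """Return {label: mean vector} for the known (training) classes."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_open_set(x, centroids, threshold):
    """Return the nearest known class, or UNKNOWN if even that class is too far.

    Assumption: Euclidean distance stands in for a learned distance metric.
    """
    label, dist = min(
        ((lbl, math.dist(x, c)) for lbl, c in centroids.items()),
        key=lambda pair: pair[1],
    )
    return label if dist <= threshold else UNKNOWN


# Toy data with two known classes, "a" and "b".
X_train = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]]
y_train = ["a", "a", "b", "b"]
centroids = class_centroids(X_train, y_train)

print(predict_open_set([0.1, 0.1], centroids, threshold=1.0))     # close to class "a"
print(predict_open_set([5.0, 4.9], centroids, threshold=1.0))     # close to class "b"
print(predict_open_set([10.0, -10.0], centroids, threshold=1.0))  # far from both classes
```

In this sketch, a sample is assigned to a known class only when it lies within `threshold` of that class's centroid; otherwise it is flagged as unknown. The quality of such a rule depends heavily on the distance used, which is why a distance-based approach benefits from a metric suited to the data.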