Predictive Ensemble Learning with Application to Scene Text Detection
This paper addresses ensemble learning for complex tasks such as object detection, where averaging and voting do not apply directly, and offers a new approach for computer vision researchers and practitioners.
The paper tackles the difficulty of applying ensemble learning to tasks such as object detection, where outputs vary in quantity and are not directly comparable. It proposes Predictive Ensemble Learning (PEL), which predicts the best-performing model for each test example and thereby turns ensemble learning into a classification task. In scene text detection, PEL significantly improves performance compared to individual state-of-the-art models or fusion by non-maximum suppression.
Deep learning based approaches have achieved significant progress in tasks such as classification, detection, and segmentation. Ensemble learning is widely known to further improve performance by combining multiple complementary models. Applying ensemble learning to classification tasks is easy, for example by averaging, voting, or other methods. However, for other tasks (such as object detection) where the outputs vary in quantity and cannot be simply compared, ensembling multiple models becomes difficult. In this paper, we propose a new method called Predictive Ensemble Learning (PEL), based on the powerful predictive ability of deep neural networks, to directly predict the best-performing model among a pool of base models for each test example, thus transforming ensemble learning into a traditional classification task. Taking scene text detection as the application, where no suitable ensemble learning strategy exists, PEL significantly improves performance compared to either individual state-of-the-art models or the fusion of multiple models by non-maximum suppression. Experimental results show the possibility and potential of PEL in predicting different models' performance based only on a query example, which can be extended to ensemble learning in many other complex tasks.
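The selection idea described above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the abstract specifies a deep neural network as the predictor, whereas this sketch swaps in a nearest-centroid classifier over hand-made feature vectors to stay self-contained. All names (`best_model_labels`, `fit_centroids`, `predict_model`, `pel_predict`), the toy features, and the per-example scores are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence


def best_model_labels(scores: List[List[float]]) -> List[int]:
    """For each training example, return the index of the base model
    with the highest per-example score (ties go to the lower index)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def fit_centroids(features: List[Sequence[float]],
                  labels: List[int]) -> Dict[int, List[float]]:
    """Stand-in classifier (assumption): the mean feature vector per label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict_model(centroids: Dict[int, List[float]],
                  x: Sequence[float]) -> int:
    """Predict which base model should handle the example with features x."""
    def sq_dist(c: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def pel_predict(centroids: Dict[int, List[float]],
                base_models: List[Callable],
                x: Sequence[float],
                example):
    """Run only the base model selected for this example."""
    return base_models[predict_model(centroids, x)](example)


# Toy data (hypothetical): 4 training examples, 2 base models.
train_features = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
train_scores = [[0.9, 0.4], [0.7, 0.5], [0.3, 0.8], [0.2, 0.6]]

labels = best_model_labels(train_scores)
centroids = fit_centroids(train_features, labels)
chosen = predict_model(centroids, [0.0, 1.0])
```

Because the selector outputs a single model index per example, the variable-length detection outputs of the base models never have to be merged or compared against each other at test time; only the chosen model's output is used.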