Unsupervised Evaluation and Weighted Aggregation of Ranked Predictions
This work addresses the challenge of applying ensemble methods in real-world scenarios where labeled data is unavailable, offering an unsupervised way to evaluate and combine base classifiers.
The paper tackles the problem of ensemble learning in binary classification without labeled data by developing SUMMA, a theoretical framework that estimates base classifier performances and determines an optimal aggregation strategy, achieving results comparable to supervised methods.
Learning algorithms that aggregate predictions from an ensemble of diverse base classifiers consistently outperform individual methods. Many of these strategies have been developed in a supervised setting, where the accuracy of each base classifier can be empirically measured and this information is incorporated into the training process. However, the reliance on labeled data precludes the application of ensemble methods to many real-world problems where labeled data has not been curated. To this end, we developed a new theoretical framework for binary classification, the Strategy for Unsupervised Multiple Method Aggregation (SUMMA), to estimate the performances of base classifiers and an optimal strategy for ensemble learning from unlabeled data.
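To make the idea of unsupervised weighting concrete, the following is a minimal sketch and not necessarily the authors' procedure. It assumes that base classifiers err independently given the true class. Under that assumption, the off-diagonal entries of the covariance matrix of their rank predictions are approximately rank one, and the leading eigenvector reflects how well each classifier separates the two classes. The function names, the simulated Gaussian scores, and the diagonal-refitting loop are all illustrative choices. Labels are used only to simulate data and to check the result afterwards.

```python
import numpy as np

def to_ranks(scores):
    """Column-wise ranks 1..N; a higher score gets a higher rank (no ties assumed)."""
    return scores.argsort(axis=0).argsort(axis=0) + 1

def estimate_weights(ranks, n_iter=100):
    """Estimate aggregation weights from unlabeled rank predictions.

    Fits a rank-one matrix to the off-diagonal covariance entries by repeatedly
    replacing the diagonal with the current rank-one estimate (a simple heuristic).
    """
    q = np.cov(ranks, rowvar=False)
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(q)          # eigenvalues in ascending order
        v = vecs[:, -1] * np.sqrt(max(vals[-1], 0.0))
        np.fill_diagonal(q, v ** 2)
    if v.sum() < 0:                             # assumption: most classifiers beat chance
        v = -v
    return v / np.abs(v).sum()

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank-sum statistic."""
    r = scores.argsort().argsort() + 1
    n_pos = labels.sum()
    n_neg = labels.size - n_pos
    return (r[labels == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def demo(seed=0, n=5000):
    """Simulate four conditionally independent classifiers of increasing quality."""
    rng = np.random.default_rng(seed)
    labels = (rng.random(n) < 0.5).astype(int)
    strengths = np.array([0.2, 0.7, 1.2, 2.0])  # hypothetical class separations
    scores = labels[:, None] * strengths + rng.standard_normal((n, strengths.size))
    ranks = to_ranks(scores)
    weights = estimate_weights(ranks)           # labels are not used here
    ensemble = ranks @ weights                  # weighted sum of ranks
    individual = np.array([auc(scores[:, j], labels) for j in range(strengths.size)])
    return weights, individual, auc(ensemble, labels)
```

Calling `demo()` returns the estimated weights, the individual AUCs, and the AUC of the weighted ensemble, so the unsupervised weights can be compared against label-based performance. The sign correction assumes that most base classifiers perform better than chance. Without that assumption, the eigenvector's direction cannot be resolved from unlabeled data alone.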