LG · Jun 25, 2025
Divide, Specialize, and Route: A New Approach to Efficient Ensemble Learning
Jakub Piwko, Jędrzej Ruciński, Dawid Płudowski et al.
Ensemble learning has proven effective in boosting predictive performance, but traditional methods such as bagging, boosting, and dynamic ensemble selection (DES) suffer from high computational cost and limited adaptability to heterogeneous data distributions. To address these limitations, we propose Hellsemble, a novel and interpretable ensemble framework for binary classification that leverages dataset complexity during both training and inference. Hellsemble incrementally partitions the dataset into circles of difficulty by iteratively passing misclassified instances from simpler models to subsequent ones, forming a committee of specialised base learners. Each model is trained on increasingly challenging subsets, while a separate router model learns to assign new instances to the most suitable base model based on inferred difficulty. Hellsemble achieves strong classification accuracy while maintaining computational efficiency and interpretability. Experimental results on OpenML-CC18 and Tabzilla benchmarks demonstrate that Hellsemble often outperforms classical ensemble methods. Our findings suggest that embracing instance-level difficulty offers a promising direction for constructing efficient and robust ensemble systems.
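The training and routing loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `fit_sketch`, `predict_sketch`, `Stump`, and `NearestNeighbour` are hypothetical, the toy learners stand in for real base models, and the stopping rule, the handling of instances that no model classifies correctly, and the choice of router are all assumptions made for illustration.

```python
# Illustrative sketch only; names, stopping rule, and router choice are assumptions.


class Stump:
    """Toy binary learner: thresholds feature 0 at its training mean."""

    def fit(self, X, y):
        self.threshold_ = sum(row[0] for row in X) / len(X)
        above = [int(row[0] > self.threshold_) for row in X]
        hits = sum(int(p == t) for p, t in zip(above, y))
        # Flip the rule if "above threshold -> 1" is right less than half the time.
        self.flip_ = hits * 2 < len(y)
        return self

    def predict(self, X):
        above = [int(row[0] > self.threshold_) for row in X]
        return [1 - p if self.flip_ else p for p in above]


class NearestNeighbour:
    """Toy router: returns the label of the closest training point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            self.y_[min(range(len(self.X_)), key=lambda i: dist(x, self.X_[i]))]
            for x in X
        ]


def fit_sketch(make_model, make_router, X, y, max_models=5):
    """Train models on successively harder subsets, then train a router."""
    models = []
    circle = [0] * len(X)  # index of the model each instance ends up with
    remaining = list(range(len(X)))
    while remaining and len(models) < max_models:
        X_sub = [X[i] for i in remaining]
        y_sub = [y[i] for i in remaining]
        model = make_model().fit(X_sub, y_sub)
        preds = model.predict(X_sub)
        k = len(models)
        models.append(model)
        missed = []
        for i, p in zip(remaining, preds):
            circle[i] = k  # provisional; overwritten if the instance is passed on
            if p != y[i]:
                missed.append(i)
        if len(missed) == len(remaining):
            break  # assumed stopping rule: the new model made no progress
        remaining = missed  # misclassified instances go to the next model
    router = make_router().fit(X, circle)
    return models, router, circle


def predict_sketch(models, router, X):
    """Route each instance to one base model and return that model's prediction."""
    routes = router.predict(X)
    return [models[k].predict([x])[0] for k, x in zip(routes, X)]


# Toy data: one feature, with two points the first stump gets wrong.
X = [[0], [1], [2], [3], [4], [5], [6], [7]]
y = [0, 0, 0, 1, 1, 1, 1, 0]
models, router, circle = fit_sketch(Stump, NearestNeighbour, X, y)
predictions = predict_sketch(models, router, X)
```

In this sketch only one base model is evaluated per instance at inference time, which is one plausible reading of the efficiency claim; instances that remain misclassified after the last model simply stay assigned to it.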
LG · Jul 16, 2025
Are encoders able to learn landmarkers for warm-starting of Hyperparameter Optimization?
Antoni Zajko, Katarzyna Woźnica
Effectively representing heterogeneous tabular datasets for meta-learning purposes is still an open problem. Previous approaches rely on representations that are intended to be universal. This paper proposes two novel methods for tabular representation learning tailored to a specific meta-task: warm-starting Bayesian Hyperparameter Optimization. Both follow a requirement that we formulate ourselves, namely that representations must capture the properties of landmarkers. The first approach involves deep metric learning, while the second is based on landmarker reconstruction. We evaluate the proposed encoders in two ways: by the gain on the target meta-task and by the degree to which the proposed requirement is fulfilled. Experiments demonstrate that while the proposed encoders can effectively learn representations aligned with landmarkers, these representations may not translate directly into significant performance gains in the meta-task of HPO warm-starting.
LG · Mar 7, 2024
Rethinking of Encoder-based Warm-start Methods in Hyperparameter Optimization
Dawid Płudowski, Antoni Zajko, Anna Kozak et al.
Effectively representing heterogeneous tabular datasets for meta-learning purposes remains an open problem. Previous approaches rely on predefined meta-features, for example, statistical measures or landmarkers. The emergence of dataset encoders opens new possibilities for the extraction of meta-features because they do not involve any handmade design. Moreover, they have been shown to generate dataset representations with desired spatial properties. In this research, we evaluate an encoder-based approach to one of the most established meta-tasks: warm-starting Bayesian Hyperparameter Optimization. To broaden our analysis, we introduce a new approach for representation learning on tabular data based on [Tomoharu Iwata and Atsutoshi Kumagai. Meta-learning from Tasks with Heterogeneous Attribute Spaces. In Advances in Neural Information Processing Systems, 2020]. Validation on over 100 datasets from UCI and an independent metaMIMIC set of datasets highlights the nuanced challenges in representation learning. We show that general representations may not suffice for some meta-tasks when the requirements of those tasks are not explicitly considered during extraction.