Sara Gjorgjieva

29.3 NE May 27

On the Structural (Dis)Agreement of Landscape Representations in Black-Box Optimization

Sara Gjorgjieva, Eva Tuba, Barbara Koroušić Seljak et al.

Landscape feature representations play a central role in automated algorithm selection and meta-learning for black-box optimization, yet little is known about how different representations agree (or disagree) in the structures they impose on problem spaces. This paper presents a systematic unsupervised evaluation of four state-of-the-art representations (ELA, DeepELA, TransOptAS, and DoE2Vec) using a diverse set of affine combinations of BBOB functions (MA-BBOB). By applying extensive clustering analyses, coverage-based stability measures, and cross-representation similarity assessments, we show that each representation organizes the same problems in markedly different ways: ELA and TransOptAS form compact geometric structures, DeepELA provides a balanced intermediate view, and DoE2Vec achieves strong semantic alignment but with substantial fragmentation. Our results reveal that no single representation dominates; rather, they capture complementary aspects of the underlying landscapes. These findings highlight the importance of multi-view analyses for understanding representation behavior and offer guidance on selecting or combining representations in downstream meta-learning and algorithm selection tasks. In addition, across two different algorithm families (Differential Evolution and Particle Swarm Optimization), we show that landscape representations face an inherent trade-off in how well they align structural landscape descriptions with observed performance, indicating that no single representation can fully capture algorithm performance.

2.6 LG May 27

Learning to Assess the Reliability of Number-of-Runs Estimation in Stochastic Optimization

Sara Gjorgjieva, Eva Tuba, Tome Eftimov

In large-scale benchmarking of stochastic optimization algorithms, the key challenge is no longer whether repeated runs are needed for reliability, but how to determine when sufficient evidence has been collected without incurring unnecessary computational cost. We study a learning-based extension of a recent empirical online heuristic that adaptively estimates the required number of runs using outlier handling and skewness-based symmetry checks. Using annotated outcomes from 132,000 Nevergrad runs on COCO (24 problems in 20 dimensions, 10 instances each, 11 optimizers), we train classifiers on 23 statistical, energy-free, and shape and stability features to predict whether a run-number estimate is reliable, prioritizing detection of incorrect estimates via minority-class recall. We evaluate reliability prediction using a within-configuration learning setup, where models are trained and tested on data sharing the same optimizer. The results show that run-number reliability can be learned in a within-configuration scenario, enabling detection of unreliable estimates with high minority-class recall, although performance remains limited by the restricted data diversity within fixed configurations.
