Active Learning of SVDD Hyperparameter Values
This work addresses a critical bottleneck for practitioners of outlier detection: it offers a more reliable and automated alternative to existing heuristic methods for hyperparameter selection. The contribution is incremental, however, since it builds on established kernel alignment and active learning concepts.
The paper tackles the problem of selecting hyperparameter values for Support Vector Data Description (SVDD) in outlier detection. Current solutions are heuristic, and the conditions under which they work well are unclear. The proposed LAMA method provides principled, evidence-based estimates for both SVDD hyperparameters. In the reported experiments, it outperforms state-of-the-art competitors and in several cases achieves results close to the empirical upper bound.
Support Vector Data Description is a popular method for outlier detection. However, its usefulness largely depends on selecting good hyperparameter values, a difficult problem that has received significant attention in the literature. Existing methods to estimate hyperparameter values are purely heuristic, and the conditions under which they work well are unclear. In this article, we propose LAMA (Local Active Min-Max Alignment), the first principled approach to estimate SVDD hyperparameter values by active learning. The core idea is based on kernel alignment, which we adapt to active learning with small sample sizes. In contrast to many existing approaches, LAMA provides estimates for both SVDD hyperparameters. These estimates are evidence-based, i.e., they rely on actual class labels, and come with a quality score. This eliminates the need for manual validation, an issue with current heuristics. LAMA outperforms state-of-the-art competitors in extensive experiments on real-world data. In several cases, LAMA even yields results close to the empirical upper bound.
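To make the notion of kernel alignment concrete, the following minimal sketch computes the standard global kernel-target alignment, A(K, yy^T) = <K, yy^T>_F / (||K||_F ||yy^T||_F), for a Gaussian kernel and picks the bandwidth with the highest alignment from a candidate list. It is not the LAMA procedure itself, which adapts alignment locally and to small actively sampled label sets. The toy data, the candidate grid, and all function names are assumptions made for illustration only.

```python
import math

def rbf_kernel_matrix(points, gamma):
    """Gaussian kernel matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in points] for p in points]

def alignment(kernel, labels):
    """Kernel-target alignment between K and the ideal matrix yy^T."""
    n = len(labels)
    # Frobenius inner product <K, yy^T>_F
    inner = sum(kernel[i][j] * labels[i] * labels[j]
                for i in range(n) for j in range(n))
    k_norm = math.sqrt(sum(v * v for row in kernel for v in row))
    y_norm = float(n)  # ||yy^T||_F equals n when labels are in {+1, -1}
    return inner / (k_norm * y_norm)

def best_gamma(points, labels, candidates):
    """Return the candidate bandwidth whose kernel aligns best with the labels."""
    return max(candidates,
               key=lambda g: alignment(rbf_kernel_matrix(points, g), labels))

# Hypothetical toy data: four inliers (+1) near the origin, two outliers (-1).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0), (-3.0, 2.0)]
labels = [1, 1, 1, 1, -1, -1]
candidates = [0.01, 0.1, 1.0, 10.0, 100.0]  # assumed search grid

chosen = best_gamma(points, labels, candidates)
print("selected gamma:", chosen)
```

The alignment score lies in [-1, 1] and needs only the labeled points, which is why an alignment-based criterion is a natural fit when labels are obtained a few at a time through active learning.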