FoMo-0D: A Foundation Model for Zero-shot Tabular Outlier Detection
FoMo-0D addresses a practical bottleneck in outlier detection: choosing an algorithm and its hyperparameters without labels. It is a novel approach rather than an incremental improvement.
The paper tackles model selection in unsupervised outlier detection by introducing FoMo-0D, a pre-trained foundation model for zero-shot tabular outlier detection. On 57 datasets, it outperforms the majority of 26 baselines, shows no statistically significant difference from the second-best method, and achieves at least a 7x speed-up in inference time over previous methods.
Outlier detection (OD) has a vast literature, as it finds numerous real-world applications. Because OD is an unsupervised task, model selection without label supervision is a key bottleneck. Despite a long list of available OD algorithms with tunable hyperparameters, the lack of systematic approaches for unsupervised algorithm and hyperparameter selection limits their effective use in practice. In this paper, we present FoMo-0D, a pre-trained Foundation Model for zero/0-shot OD on tabular data, which bypasses the hurdle of model selection altogether. Having been pre-trained on synthetic data, FoMo-0D can directly predict the (outlier/inlier) label of test samples without parameter fine-tuning: given a new task, it requires no labeled data, no additional training, and no hyperparameter tuning. Extensive experiments on 57 real-world datasets against 26 baselines show that FoMo-0D is highly competitive, outperforming the majority of the baselines with no statistically significant difference from the second-best method. Further, FoMo-0D is efficient at inference, requiring only 7.7 ms per sample on average, with at least a 7x speed-up compared to previous methods. To facilitate future research, our implementations for data synthesis and pre-training as well as model checkpoints are openly available at https://github.com/A-Chicharito-S/FoMo-0D.
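To make the model-selection bottleneck concrete, the following is a minimal sketch and not part of FoMo-0D. It assumes a simple k-nearest-neighbour distance detector as a stand-in for a classical OD algorithm, and a made-up one-feature toy table. The helper names `knn_scores` and `top_outlier` are hypothetical. Changing the single hyperparameter `k` changes which row is ranked as the top outlier, and without labels there is no principled way to say which setting is right.

```python
import math


def knn_scores(rows, k):
    """Score each row by the distance to its k-th nearest neighbour.

    A stand-in for a classical detector (an assumption for illustration),
    not the method proposed in the paper.
    """
    scores = []
    for i, row in enumerate(rows):
        dists = sorted(
            math.dist(row, other) for j, other in enumerate(rows) if j != i
        )
        scores.append(dists[k - 1])
    return scores


def top_outlier(rows, k):
    """Return the index of the row with the highest outlier score."""
    scores = knn_scores(rows, k)
    return max(range(len(rows)), key=scores.__getitem__)


# Toy table (made up): a dense cluster, one isolated row, and a distant pair.
rows = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,), (2.0,), (10.0,), (10.1,)]

top_k1 = top_outlier(rows, k=1)  # the isolated row wins: the pair shields itself
top_k3 = top_outlier(rows, k=3)  # a row from the distant pair wins instead

print(rows[top_k1], rows[top_k3])  # prints (2.0,) (10.1,)
```

With `k=1` the two distant rows are each other's nearest neighbour and look normal, so the isolated row is flagged. With `k=3` the distant pair is flagged instead. A zero-shot model of the kind the abstract describes is meant to remove this choice altogether.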