LG · AI · Aug 27, 2024

A Comprehensive Benchmark of Machine and Deep Learning Across Diverse Tabular Datasets

arXiv:2408.14817v1 · 20 citations · h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge ML practitioners face in selecting appropriate models for tabular data. Its contribution is incremental: it extends existing benchmarks with more extensive comparisons.

The study examines when deep learning models outperform traditional methods on tabular datasets by benchmarking 20 models across 111 datasets. Building on those results, the authors train a model that predicts these scenarios with 86.1% accuracy (AUC 0.78).

The analysis of tabular datasets is highly prevalent both in scientific research and real-world applications of Machine Learning (ML). Unlike many other ML tasks, Deep Learning (DL) models often do not outperform traditional methods in this area. Previous comparative benchmarks have shown that DL performance is frequently equivalent or even inferior to models such as Gradient Boosting Machines (GBMs). In this study, we introduce a comprehensive benchmark aimed at better characterizing the types of datasets where DL models excel. Although several important benchmarks for tabular datasets already exist, our contribution lies in the variety and depth of our comparison: we evaluate 111 datasets with 20 different models, including both regression and classification tasks. These datasets vary in scale and include both those with and without categorical variables. Importantly, our benchmark contains a sufficient number of datasets where DL models perform best, allowing for a thorough analysis of the conditions under which DL models excel. Building on the results of this benchmark, we train a model that predicts scenarios where DL models outperform alternative methods with 86.1% accuracy (AUC 0.78). We present insights derived from this characterization and compare these findings to previous benchmarks.
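
The abstract reports two figures for the model that predicts where DL wins: accuracy and AUC. As a rough illustration of how these two metrics relate, the sketch below computes both for a toy binary predictor of "DL performs best on this dataset". The labels, scores, threshold, and function names are all hypothetical and chosen for illustration only; none of them come from the paper, and this is not the authors' evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of datasets labeled correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(1 for p, t in zip(preds, y_true) if p == t)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of AUC.
    """
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 means "a DL model was best on this dataset".
# Positives are deliberately rare, as an assumption for illustration.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.15, 0.3, 0.05, 0.4, 0.35, 0.45, 0.3, 0.6]

acc = accuracy(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(f"accuracy={acc:.3f}, auc={auc:.3f}")
```

On this toy data the accuracy is 0.9 while the AUC is 0.78125. Accuracy depends on a single decision threshold and on class balance, whereas AUC measures how well the scores rank positives above negatives, so the two numbers need not move together.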

