FeatGeNN: Improving Model Performance for Tabular Data with Correlation-based Feature Extraction
This work addresses the challenge of computationally intensive, overfitting-prone AutoFE methods for tabular data, though the contribution appears incremental, since it adapts pooling techniques to a specific domain.
The paper tackles automated feature engineering for tabular data by proposing FeatGeNN, a convolutional method that uses correlation-based pooling to extract features, and reports that it outperforms existing AutoFE approaches on benchmark datasets.
Automated Feature Engineering (AutoFE) has become an important task in machine learning projects, as it can improve model performance and yield more informative features for statistical analysis. However, most current AutoFE approaches either rely on manually created features or generate a large number of features, which is computationally intensive and can lead to overfitting. To address these challenges, we propose FeatGeNN, a novel convolutional method that extracts and creates new features using correlation as a pooling function. Unlike traditional pooling functions such as max-pooling, correlation-based pooling considers the linear relationship between the features in the data matrix, making it more suitable for tabular data. We evaluate our method on various benchmark datasets and demonstrate that FeatGeNN outperforms existing AutoFE approaches in terms of model performance. Our results suggest that correlation-based pooling can be a promising alternative to max-pooling for AutoFE in tabular data applications.
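To make the contrast between max-pooling and correlation-based pooling concrete, the following is a minimal sketch and not the FeatGeNN implementation, whose details the abstract does not give. It assumes that a pooling window covers a block of rows and two adjacent feature columns, and that the pooled value is the Pearson correlation between those two columns. The names `max_pool_patch`, `corr_pool_patch`, and `pool_matrix` are hypothetical.

```python
import numpy as np


def max_pool_patch(patch):
    """Standard max-pooling: keep the largest value in the window."""
    return float(np.max(patch))


def corr_pool_patch(patch):
    """Assumed correlation pooling: Pearson correlation between the two
    columns of the window. Returns 0.0 if either column is constant."""
    a = patch[:, 0] - patch[:, 0].mean()
    b = patch[:, 1] - patch[:, 1].mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denom == 0.0:
        return 0.0
    return float((a * b).sum() / denom)


def pool_matrix(X, height, pool_fn):
    """Apply pool_fn to non-overlapping windows of `height` rows and
    2 columns. Trailing rows or columns that do not fill a window are
    dropped (an illustrative choice, not taken from the paper)."""
    X = np.asarray(X, dtype=float)
    n_rows, n_cols = X.shape
    out_rows, out_cols = n_rows // height, n_cols // 2
    out = np.empty((out_rows, out_cols))
    for i in range(out_rows):
        for j in range(out_cols):
            patch = X[i * height:(i + 1) * height, 2 * j:2 * j + 2]
            out[i, j] = pool_fn(patch)
    return out


# Toy data matrix: 4 samples, 4 features.
# Columns 0 and 1 rise together; columns 2 and 3 move in opposite directions.
X = np.array([
    [1.0, 2.0, 1.0, 4.0],
    [2.0, 4.0, 2.0, 3.0],
    [3.0, 6.0, 3.0, 2.0],
    [4.0, 8.0, 4.0, 1.0],
])

max_out = pool_matrix(X, height=4, pool_fn=max_pool_patch)
corr_out = pool_matrix(X, height=4, pool_fn=corr_pool_patch)
```

On this toy matrix, max-pooling only reports the largest magnitude in each window, while the assumed correlation pooling separates the positively related pair from the negatively related one, which is the kind of linear relationship the abstract refers to.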