LG | Aug 19, 2025

How Usable is Automated Feature Engineering for Tabular Data?

arXiv:2508.13932v1 | 1 citation | h-index: 8
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem that AutoFE tools are inefficient and inaccessible for practitioners, highlighting an incremental gap in usability rather than in performance.

The paper investigated the usability of 53 automated feature engineering (AutoFE) methods for tabular data, finding that they are generally hard to use, lack documentation, have no active communities, and do not allow setting time and memory constraints.

Tabular data, consisting of rows and columns, is omnipresent across various machine learning applications. Each column represents a feature, and features can be combined or transformed to create new, more informative features. Such feature engineering is essential to achieve peak performance in machine learning. Since manual feature engineering is expensive and time-consuming, a substantial effort has been put into automating it. Yet, existing automated feature engineering (AutoFE) methods have never been investigated regarding their usability for practitioners. Thus, we investigated 53 AutoFE methods. We found that these methods are, in general, hard to use, lack documentation, and have no active communities. Furthermore, no method allows users to set time and memory constraints, which we see as a necessity for usable automation. Our survey highlights the need for future work on usable, well-engineered AutoFE methods.
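To make the two ideas in the abstract concrete (combining columns into new features, and letting the user bound the resources spent doing so), here is a minimal sketch in plain Python. It is not taken from the paper or from any surveyed AutoFE method. The function name `engineer_features` and its parameters `time_limit_s` and `max_new_features` are hypothetical, and the cap on the number of generated features is only a crude stand-in for a real memory constraint.

```python
import itertools
import time


def engineer_features(columns, time_limit_s=1.0, max_new_features=100):
    """Hypothetical helper: add pairwise products of numeric columns
    until a time budget or a feature-count budget is exhausted.

    `columns` maps a column name to a list of numbers (one per row).
    """
    start = time.monotonic()
    new_columns = {}
    # Iterate over column pairs in a deterministic (sorted) order.
    for a, b in itertools.combinations(sorted(columns), 2):
        # Stop as soon as the time budget is used up.
        if time.monotonic() - start >= time_limit_s:
            break
        # Stop once the cap on new features is reached
        # (an assumed, rough proxy for a memory limit).
        if len(new_columns) >= max_new_features:
            break
        new_columns[f"{a}*{b}"] = [x * y for x, y in zip(columns[a], columns[b])]
    # Return the original columns plus the engineered ones.
    return {**columns, **new_columns}


table = {"height": [2.0, 3.0], "width": [4.0, 5.0], "depth": [1.0, 2.0]}
result = engineer_features(table, time_limit_s=1.0, max_new_features=2)
```

With `max_new_features=2`, only the first two of the three possible pairwise products are added to `table`. A practical tool would also need to measure actual memory use and to interrupt long-running individual transformations, which this sketch does not attempt.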

