LG · Jun 6, 2024

Learned Feature Importance Scores for Automated Feature Engineering

arXiv:2406.04153v1
AI Analysis

This paper addresses the burden of manual feature engineering for machine learning practitioners, offering an automated approach intended to improve efficiency and model performance. It appears incremental, however, because it builds on existing automated feature engineering concepts.

The paper tackles automated feature engineering by proposing AutoMAN, a framework that learns feature importance masks to explore transform spaces without explicitly generating transformed features, achieving state-of-the-art performance with significantly lower latency compared to alternatives.

Feature engineering has demonstrated substantial utility for many machine learning workflows, such as in the small data regime or when distribution shifts are severe. Automating this capability can therefore relieve much manual effort and improve model performance. To this end, we propose AutoMAN, or Automated Mask-based Feature Engineering, an automated feature engineering framework that achieves high accuracy and low latency and can be extended to heterogeneous and time-varying data. AutoMAN is based on effectively exploring the space of candidate transforms without explicitly materializing the transformed features. This is achieved by learning feature importance masks, which can be extended to support other modalities such as time series. AutoMAN learns feature transform importance end-to-end, incorporating a dataset's task target directly into feature engineering, resulting in state-of-the-art performance with significantly lower latency compared to alternatives.
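
The idea of learning importance scores over candidate transforms against the task target can be illustrated with a toy sketch. This is not AutoMAN's implementation: the transform set, the function name `learn_transform_mask`, the softmax parameterization, and the hyperparameters are all assumptions made for illustration. Unlike the framework described in the abstract, this toy also computes every transformed feature explicitly, purely to keep the example short.

```python
import math

# Hypothetical candidate transforms, chosen only for illustration.
TRANSFORMS = {
    "identity": lambda x: x,
    "square": lambda x: x * x,
    "cube": lambda x: x * x * x,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def learn_transform_mask(xs, ys, steps=500, lr=0.5):
    """Learn a softmax mask over TRANSFORMS by gradient descent on MSE.

    The prediction is the mask-weighted sum of the transformed inputs, so
    the task target directly drives which transform gains importance.
    """
    names = list(TRANSFORMS)
    # Toy simplification: transformed features are computed explicitly here.
    feats = [[TRANSFORMS[name](x) for name in names] for x in xs]
    logits = [0.0] * len(names)
    for _ in range(steps):
        mask = softmax(logits)
        # Gradient of the mean squared error with respect to each mask entry.
        grad_mask = [0.0] * len(names)
        for row, y in zip(feats, ys):
            err = sum(m * v for m, v in zip(mask, row)) - y
            for k, v in enumerate(row):
                grad_mask[k] += 2.0 * err * v / len(xs)
        # Chain rule through the softmax to get the gradient on the logits.
        avg = sum(m * g for m, g in zip(mask, grad_mask))
        logits = [
            l - lr * m * (g - avg)
            for l, m, g in zip(logits, mask, grad_mask)
        ]
    return dict(zip(names, softmax(logits)))


# Synthetic task: the target is the square of the input.
xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]
mask = learn_transform_mask(xs, ys)
print(max(mask, key=mask.get), mask)
```

On this synthetic task the learned mask should place its largest weight on `square`, since that transform alone reproduces the target. The learned weights can then be read as importance scores for the candidate transforms.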

