SoftPipe: A Soft-Guided Reinforcement Learning Framework for Automated Data Preparation
This work addresses data preparation, a foundational challenge in machine learning, and offers a framework that could improve both efficiency and pipeline quality for practitioners. The contribution appears incremental, however, since it builds on existing RL approaches.
The paper tackles automated data preparation, targeting the rigid hard constraints that existing reinforcement learning methods use to prune the search space. It introduces SoftPipe, which replaces those constraints with soft guidance and formulates action selection as a Bayesian inference problem to improve exploration. On 18 datasets, the authors report up to a 13.9% improvement in pipeline quality and 2.8x faster convergence compared to existing methods.
Data preparation is a foundational yet notoriously challenging component of the machine learning lifecycle, characterized by a vast combinatorial search space. While reinforcement learning (RL) offers a promising direction, state-of-the-art methods suffer from a critical limitation: to manage the search space, they rely on rigid ``hard constraints'' that prematurely prune the search space and often preclude optimal solutions. To address this, we introduce SoftPipe, a novel RL framework that replaces these constraints with a flexible ``soft guidance'' paradigm. SoftPipe formulates action selection as a Bayesian inference problem. A high-level strategic prior, generated by a Large Language Model (LLM), probabilistically guides exploration. This prior is combined with empirical estimators from two sources through a collaborative process: a fine-grained quality score from a supervised Learning-to-Rank (LTR) model and a long-term value estimate from the agent's Q-function. Through extensive experiments on 18 diverse datasets, we demonstrate that SoftPipe achieves up to a 13.9\% improvement in pipeline quality and 2.8$\times$ faster convergence compared to existing methods.
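To make the soft-guidance idea concrete, here is a minimal sketch of how a prior over operators might be combined with the two evidence sources. The abstract does not state the exact combination rule, so everything specific here is an assumption made for illustration: the log-linear form (posterior proportional to prior times exp of a weighted sum of scores), the function name `soft_guided_policy`, the weights `alpha` and `beta`, the `temperature` parameter, the operator names, and all numeric values.

```python
import math

def soft_guided_policy(prior, ltr_scores, q_values,
                       alpha=1.0, beta=1.0, temperature=1.0):
    """Return a posterior over actions from a prior and two evidence scores.

    Assumed form (not taken from the paper):
        posterior(a) ∝ prior(a) * exp((alpha * ltr(a) + beta * q(a)) / temperature)
    """
    eps = 1e-12  # floor so a zero prior does not produce log(0)
    logits = []
    for p, s, q in zip(prior, ltr_scores, q_values):
        log_prior = math.log(max(p, eps))
        log_evidence = (alpha * s + beta * q) / temperature
        logits.append(log_prior + log_evidence)
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical operators and scores, chosen only to illustrate the mechanics.
operators = ["impute_mean", "standard_scale", "one_hot_encode"]
prior = [0.6, 0.3, 0.1]       # stand-in for the LLM-generated strategic prior
ltr_scores = [0.2, 0.5, 0.9]  # stand-in for the LTR quality scores
q_values = [0.1, 0.4, 1.5]    # stand-in for the agent's Q-value estimates

posterior = soft_guided_policy(prior, ltr_scores, q_values)
best = operators[max(range(len(operators)), key=lambda i: posterior[i])]
```

In this toy setting the operator with the lowest prior ends up with the highest posterior because the empirical scores favor it. A hard constraint that removed the operator outright would make it unreachable, whereas a soft prior only makes it less likely until the evidence says otherwise.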