LG AI ML | Sep 13, 2025

FACTORS: Factorial Approximation for Complementary Two-factor Optimization with Risk-aware Scoring

arXiv:2509.10825v1 | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses reliable configuration selection for machine learning practitioners under budget constraints. It is an incremental advance: it integrates existing methods, such as design of experiments and Shapley decomposition, into a single framework.

The paper tackles the optimization of performance and stability in machine learning configurations under budget constraints. It proposes FACTORS, a framework that combines design of experiments with Shapley decomposition to estimate main effects and two-factor interactions, then integrates them into a risk-adjusted objective function. According to the authors, the approach improves rank preservation and optimal-configuration identification, reduces decision-making risk, and provides interpretable justification with stable performance gains across diverse datasets and design conditions.

We propose FACTORS, a framework that combines design of experiments with Shapley decomposition to address performance and stability issues that are sensitive to combinations of training factors. Our approach consistently estimates main effects and two-factor interactions, then integrates them into a risk-adjusted objective function that jointly accounts for uncertainty and cost, enabling reliable selection of configurations under a fixed budget. Effect estimation is implemented through two complementary paths: a plug-in path based on conditional means, and a least-squares path that reconstructs Shapley contributions from samples. These paths are designed to work complementarily even when design density and bias levels differ. By incorporating standardization of estimates, bias correction, and uncertainty quantification, our procedure ensures comparability across heterogeneous factor spaces and designs, while a lightweight search routine yields configurations within practical time even for large factor spaces. On the theoretical side, we provide error decompositions, sample complexity analysis, and upper bounds on optimality gaps. On the interpretive side, we summarize main effects and interactions in map form, highlighting adjustment priorities and safe improvement pathways. Across diverse datasets and design conditions, our approach improves rank preservation and optimal configuration identification, reduces decision-making risks, and offers a tuning foundation that delivers interpretable justification alongside stable performance gains even under budget constraints.
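To make the plug-in path and the risk-adjusted objective more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small full-factorial design with two hypothetical factors (learning rate and batch size), made-up scores and costs, a standard-error penalty as the uncertainty term, and the common convention that a two-factor interaction is split equally between the two factors when forming Shapley-style contributions. All function names, weights, and data are illustrative assumptions.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def plug_in_effects(runs):
    """Estimate main effects and the two-factor interaction from conditional means.

    runs: iterable of (level_a, level_b, score); every (a, b) cell must be observed.
    """
    cells = defaultdict(list)
    for a, b, y in runs:
        cells[(a, b)].append(y)
    cell_mean = {k: mean(v) for k, v in cells.items()}
    levels_a = sorted({a for a, _ in cell_mean})
    levels_b = sorted({b for _, b in cell_mean})
    grand = mean(cell_mean.values())
    # Main effect: conditional mean over the other factor, minus the grand mean.
    main_a = {a: mean(cell_mean[(a, b)] for b in levels_b) - grand for a in levels_a}
    main_b = {b: mean(cell_mean[(a, b)] for a in levels_a) - grand for b in levels_b}
    # Interaction: what remains of the cell mean after removing additive terms.
    inter = {
        (a, b): cell_mean[(a, b)] - grand - main_a[a] - main_b[b]
        for a in levels_a
        for b in levels_b
    }
    return grand, main_a, main_b, inter, dict(cells)


def risk_adjusted_score(scores, cost, risk_weight=1.0, cost_weight=0.1):
    """Mean score minus an uncertainty penalty and a cost penalty (assumed form)."""
    se = stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else 0.0
    return mean(scores) - risk_weight * se - cost_weight * cost


# Hypothetical data: (learning_rate, batch_size, validation score), two replicates per cell.
runs = [
    ("low", "small", 0.80), ("low", "small", 0.82),
    ("low", "large", 0.78), ("low", "large", 0.80),
    ("high", "small", 0.70), ("high", "small", 0.90),
    ("high", "large", 0.85), ("high", "large", 0.87),
]
costs = {"small": 1.0, "large": 1.2}  # hypothetical relative cost per batch size

grand, main_a, main_b, inter, cells = plug_in_effects(runs)

# Pick the configuration with the highest risk-adjusted score.
best = max(cells, key=lambda k: risk_adjusted_score(cells[k], costs[k[1]]))

# Shapley-style contributions for the chosen cell: main effect plus half the interaction.
a, b = best
phi_a = main_a[a] + 0.5 * inter[(a, b)]
phi_b = main_b[b] + 0.5 * inter[(a, b)]
```

In this toy design the ("high", "small") cell has a competitive mean but a large spread across replicates, so the uncertainty penalty pushes it below the steadier ("high", "large") cell. The two contributions `phi_a` and `phi_b` sum to the gap between the chosen cell's mean and the grand mean, which is the efficiency property expected of a Shapley decomposition.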

