Selecting Diverse Models for Scientific Insight
This paper addresses the problem of uncovering diverse explanatory models in scientific data analysis, though the contribution appears incremental because it builds on existing penalized regression methods.
The paper tackles the problem that conventional model selection ignores model uncertainty. It proposes multi-model penalized regression (MMPR) to identify multiple explanatory patterns and demonstrates the method by predicting stacking fault energy from steel alloy composition.
Model selection often aims to choose a single model, assuming that the form of the model is correct. However, there may be multiple possible underlying explanatory patterns in a set of predictors that could explain a response. Model selection without regard for model uncertainty can fail to bring these patterns to light. We explore multi-model penalized regression (MMPR) to acknowledge model uncertainty in the context of penalized regression. We examine how different penalty settings can promote either shrinkage or sparsity of coefficients in separate models. The method is tuned to explicitly limit model similarity. A choice of penalty form that enforces variable selection is applied to predict stacking fault energy (SFE) from steel alloy composition. The aim is to identify multiple models with different subsets of covariates that explain a single type of response.
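To make the idea of limiting model similarity concrete, the following is a minimal sketch and not the MMPR formulation itself, which the abstract does not spell out. It assumes a two-model objective made of a lasso loss for each model plus a hypothetical overlap term gamma * sum_j |b1_j| * |b2_j|, solved by alternating coordinate descent. The function names, the penalty form, the parameters lam and gamma, and the toy data are all illustrative assumptions.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, weights, beta, sweeps=50):
    """Coordinate descent for (1/2n)||y - X b||^2 + sum_j weights[j] * |b_j|.

    Assumes no column of X is identically zero.
    """
    n, p = len(X), len(X[0])
    beta = list(beta)
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0    # correlation of predictor j with the partial residual
            scale = 0.0  # squared norm of predictor j
            for i in range(n):
                fitted_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fitted_without_j)
                scale += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, weights[j]) / (scale / n)
    return beta


def fit_diverse_pair(X, y, lam, gamma, rounds=10):
    """Alternately refit two sparse models under an assumed overlap penalty.

    With the other model held fixed, the overlap term turns each fit into a
    lasso whose per-variable penalty is lam + gamma * |other coefficient|.
    """
    p = len(X[0])
    models = [[0.0] * p, [0.0] * p]
    for _ in range(rounds):
        for k in (0, 1):
            other = models[1 - k]
            weights = [lam + gamma * abs(b) for b in other]
            models[k] = weighted_lasso(X, y, weights, models[k])
    return models


def support(beta, tol=1e-8):
    """Indices of coefficients that are nonzero up to a small tolerance."""
    return [j for j, b in enumerate(beta) if abs(b) > tol]


# Toy data (illustrative only): predictors 0 and 1 carry the same signal,
# predictor 2 is unrelated to the response.
x0 = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
x2 = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
X = [[a, a, c] for a, c in zip(x0, x2)]
y = [2.0 * a for a in x0]

models = fit_diverse_pair(X, y, lam=0.1, gamma=2.0)
supports = [support(b) for b in models]
print(supports)
```

On this toy data the two fitted models select different predictors, each explaining the response equally well. Setting gamma to 0 removes the overlap term, and both fits then reduce to the same lasso solution. This illustrates why an explicit similarity penalty is needed to surface alternative explanations.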