On the Use of Minimum Penalties in Statistical Learning
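This work addresses a limitation of multivariate regression methods in statistical learning: approaches that estimate relationships between outcome variables do not readily extend to other models, such as those with binomial responses. The proposed method does extend to such models, although the contribution is incremental because it builds on existing state-of-the-art techniques.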
The authors address the problem of jointly estimating regression coefficients and the relationships between outcome variables in multivariate models. They propose the MinPen framework, which combines a novel penalty based on the minimum function with an iterative algorithm, and they establish high-dimensional convergence rates and model selection consistency.
Modern multivariate machine learning and statistical methodologies estimate parameters of interest while leveraging prior knowledge of the association between outcome variables. The methods that do allow these relationships to be estimated typically do so through an error covariance matrix in multivariate regression, an approach that does not extend to other types of models. In this article we propose the MinPen framework to simultaneously estimate the regression coefficients of the multivariate regression model and the relationships between outcome variables under mild assumptions. The MinPen framework uses a novel penalty based on the minimum function to exploit detected relationships between responses. Obtaining the estimates requires solving a non-convex optimization problem, for which we propose an iterative algorithm that generalizes current state-of-the-art methods. We provide theoretical results, including high-dimensional convergence rates, model selection consistency, and a framework for post-selection inference. We extend the MinPen framework to other exponential family loss functions, with a specific focus on multiple binomial responses. Tuning parameter selection is also addressed. Finally, simulations and two data examples illustrate the finite-sample properties of the framework.
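The abstract does not state the penalty explicitly, so the following is only an illustrative sketch of how a penalty built on the minimum function could exploit relationships between responses. It assumes, hypothetically, that each response has its own coefficient vector and that a pair of responses is penalized by the smaller of the squared distance between their coefficient vectors and the squared norm of their sum. Under that assumption, positively related responses (similar coefficients) and negatively related responses (coefficients of opposite sign) both incur a small penalty, and the sign of the relationship need not be known in advance. The function name `min_penalty` and the exact functional form are assumptions for illustration, not the authors' definition.

```python
from itertools import combinations


def min_penalty(columns):
    """Hypothetical minimum-function penalty (an assumed form, for illustration).

    `columns` is a list of coefficient vectors, one per response.
    For every pair (b_j, b_k), add min(||b_j - b_k||^2, ||b_j + b_k||^2).
    """
    total = 0.0
    for b_j, b_k in combinations(columns, 2):
        # Squared distance: small when the two responses share similar coefficients.
        diff = sum((a - b) ** 2 for a, b in zip(b_j, b_k))
        # Squared norm of the sum: small when the coefficients are near opposites.
        summ = sum((a + b) ** 2 for a, b in zip(b_j, b_k))
        # The minimum keeps whichever relationship fits the pair better.
        total += min(diff, summ)
    return total


# Three responses: the second is the exact negative of the first and third.
related = [[1.0, 2.0], [-1.0, -2.0], [1.0, 2.0]]
# Two responses with no shared structure.
unrelated = [[1.0, 0.0], [0.0, 1.0]]

print(min_penalty(related))
print(min_penalty(unrelated))
```

Because the minimum of two convex functions is generally not convex, a penalty of this kind leads to a non-convex objective, which is consistent with the abstract's reliance on an iterative algorithm rather than a single convex solve.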