A convex formulation for high-dimensional sparse sliced inverse regression
This is an incremental improvement aimed at statisticians and data scientists working with high-dimensional data: it extends an existing method, sliced inverse regression, with sparsity for better interpretability.
The paper addresses the poor interpretability and high variability of sliced inverse regression when there are many covariates. It proposes a convex formulation for sparse sliced inverse regression in high dimensions that estimates the subspace and performs variable selection simultaneously, and its numerical studies show that the method can identify the correct covariates.
Sliced inverse regression is a popular tool for sufficient dimension reduction, which replaces covariates with a minimal set of their linear combinations without loss of information on the conditional distribution of the response given the covariates. The estimated linear combinations include all covariates, making results difficult to interpret and perhaps unnecessarily variable, particularly when the number of covariates is large. In this paper, we propose a convex formulation for fitting sparse sliced inverse regression in high dimensions. Our proposal estimates the subspace of the linear combinations of the covariates directly and performs variable selection simultaneously. We solve the resulting convex optimization problem via the linearized alternating direction method of multipliers algorithm, and establish an upper bound on the subspace distance between the estimated and the true subspaces. Through numerical studies, we show that our proposal is able to identify the correct covariates in the high-dimensional setting.
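
For readers unfamiliar with the baseline, the following is a minimal sketch of classical (non-sparse) sliced inverse regression, not the convex sparse estimator proposed in the paper. It assumes more observations than covariates so that the sample covariance is invertible, and the function name `sir_directions` and its parameters are hypothetical choices made for illustration. The returned directions generally have nonzero weight on every covariate, which is the interpretability issue the abstract describes.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1):
    """Classical sliced inverse regression (illustrative sketch, needs n > p).

    Returns a (p, n_directions) array whose columns are unit-norm
    estimates of the dimension-reduction directions.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)          # center the covariates
    cov = Xc.T @ Xc / n              # sample covariance of X

    # Partition the observations into slices of roughly equal size,
    # ordered by the value of the response.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the within-slice means: an estimate of
    # the covariance of E[X | y].
    M = np.zeros((p, p))
    for idx in slices:
        m = Xc[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Solve the generalized eigenproblem M v = lambda * cov v by
    # whitening with the inverse square root of cov.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    w, V = np.linalg.eigh(inv_sqrt @ M @ inv_sqrt)
    top = V[:, np.argsort(w)[::-1][:n_directions]]

    # Map back to the original covariate scale and normalize columns.
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)
```

Because the whitening step inverts the sample covariance, this sketch is not usable when the number of covariates exceeds the sample size, which is the high-dimensional setting that the proposed formulation targets.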