Proximal Iteration for Nonlinear Adaptive Lasso
This work addresses the estimation bias that accompanies $\ell_1$-penalized variable selection in complex models and offers a more flexible approach to sparsity, though the contribution is incremental: it builds on the existing Adaptive Lasso and proximal gradient methods.
The paper tackles bias in Lasso estimation by learning the penalty coefficients jointly with the model parameters, which reduces bias and allows arbitrary sparsity structures to be encouraged through a prior on the penalty coefficients. On synthetic and real datasets, the method is competitive in speed and accuracy with implementations tailored to specific sparsity structures, and it is applied to nonlinear models of COVID-19 vaccination behavior and international refugee movement.
Augmenting a smooth cost function with an $\ell_1$ penalty allows analysts to conduct estimation and variable selection simultaneously in sophisticated models, and the resulting problem can be solved efficiently using proximal gradient methods. However, one drawback of the $\ell_1$ penalty is bias: nonzero parameters are underestimated in magnitude, motivating techniques such as the Adaptive Lasso, which endow each parameter with its own penalty coefficient. It is not clear, however, how these parameter-specific penalties should be set in complex models. In this article, we study the approach of treating the penalty coefficients as additional decision variables to be learned in a \textit{Maximum a Posteriori} manner, developing a proximal gradient approach to the joint optimization of these coefficients together with the parameters of any differentiable cost function. Beyond reducing bias in estimates, this procedure can also encourage arbitrary sparsity structure via a prior on the penalty coefficients. We compare our method to implementations of specific sparsity structures for non-Gaussian regression on synthetic and real datasets, finding our more general method to be competitive in terms of both speed and accuracy. We then consider nonlinear models for two case studies, COVID-19 vaccination behavior and international refugee movement, highlighting the applicability of this approach to complex problems and intricate sparsity structures.
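To make the building blocks concrete, the following is a minimal sketch of a proximal gradient iteration for an $\ell_1$ penalty with a separate coefficient per parameter, as in the Adaptive Lasso. It is not the method proposed in the article: the penalty coefficients `lam` are held fixed rather than learned jointly, the cost is assumed to be least squares, and the function names, step size rule, and synthetic data are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * |.|, applied elementwise (t may be a vector)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def weighted_lasso_prox_grad(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - X b||^2 + sum_j lam[j] * |b[j]| by proximal gradient.

    Assumption: lam is a fixed vector of per-parameter penalty coefficients;
    the article instead learns these coefficients jointly with b.
    """
    n, p = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n                         # gradient of the smooth part
        beta = soft_threshold(beta - step * grad, step * lam)   # proximal step
    return beta

# Illustrative synthetic data (hypothetical, not from the article).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

# Small penalties on the truly nonzero parameters, large penalties elsewhere.
lam = np.array([0.01, 1.0, 0.01, 1.0, 1.0])
beta_hat = weighted_lasso_prox_grad(X, y, lam)
print(beta_hat)
```

Because the soft-thresholding amount for parameter $j$ is proportional to `lam[j]`, lightly penalized parameters are shrunk very little, which is the mechanism by which parameter-specific penalties reduce bias. The open question raised in the abstract is how to choose `lam` in complex models, and this sketch sidesteps it by fixing the values by hand.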