Converting ADMM to a Proximal Gradient for Efficient Sparse Estimation
This work addresses computational bottlenecks for researchers and practitioners in machine learning and statistics dealing with sparse estimation, though it is incremental as it builds on existing methods like ADMM and proximal gradient.
The paper tackles the problem of inefficiency in solving sparse estimation problems like fused lasso and convex clustering by proposing a method to convert ADMM solutions to proximal gradient methods, resulting in significant efficiency improvements as demonstrated in numerical experiments.
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem. It takes time to include matrix division in the former case, while an efficient method such as FISTA (fast iterative shrinkage-thresholding algorithm) has been developed in the latter case. This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that assumption that the derivative of the objective function is Lipschitz continuous. Then, we apply it to sparse estimation problems, such as sparse convex clustering and trend filtering, and we show by numerical experiments that we can obtain a significant improvement in terms of efficiency.