ML LG ST
May 22, 2019

Distributionally Robust Formulation and Model Selection for the Graphical Lasso

Pedro Cisneros-Velarde, Sang-Yun Oh, Alexander Petersen

arXiv:1905.08975v24.94 citations

Originality: Incremental advance

AI Analysis

This work addresses model selection for the graphical lasso in multivariate data analysis. It offers a more efficient and interpretable tuning method, but the advance is incremental because it builds on existing distributionally robust optimization frameworks.

The authors address the choice of regularization parameter for the graphical lasso estimator by developing a distributionally robust optimization framework with a tailored Wasserstein ambiguity set. The result is a simple algorithm that avoids computationally expensive repeated evaluations of the graphical lasso, together with a closed-form expression for the ambiguity-set radius that quantifies robustness.

Building on a recent framework for distributionally robust optimization, we consider estimation of the inverse covariance matrix for multivariate data. We provide a novel notion of a Wasserstein ambiguity set specifically tailored to this estimation problem, leading to a tractable class of regularized estimators. Special cases include penalized likelihood estimators for Gaussian data, specifically the graphical lasso estimator. As a consequence of this formulation, the radius of the Wasserstein ambiguity set is directly related to the regularization parameter in the estimation problem. Using this relationship, the level of robustness of the estimation procedure can be shown to correspond to the level of confidence with which the ambiguity set contains a distribution with the population covariance. Furthermore, a unique feature of our formulation is that the radius can be expressed in closed-form as a function of the ordinary sample covariance matrix. Taking advantage of this finding, we develop a simple algorithm to determine a regularization parameter for graphical lasso, using only the bootstrapped sample covariance matrices, meaning that computationally expensive repeated evaluation of the graphical lasso algorithm is not necessary. Alternatively, the distributionally robust formulation can also quantify the robustness of the corresponding estimator if one uses an off-the-shelf method such as cross-validation. Finally, we numerically study the obtained regularization criterion and analyze the robustness of other automated tuning procedures used in practice.
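The bootstrap-based selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that the closed-form radius is the largest absolute entrywise deviation between a bootstrapped sample covariance and the original sample covariance, and that the regularization parameter is taken as the (1 - alpha) quantile of those deviations; the abstract does not state the formula, so both points are assumptions. The function name `select_lambda` and its parameters are hypothetical.

```python
import numpy as np


def select_lambda(X, alpha=0.1, n_boot=200, seed=0):
    """Pick a graphical lasso penalty from bootstrapped sample covariances.

    Assumption (not stated in the abstract): the radius for one bootstrap
    draw is max |S_boot - S| over all entries, and the penalty is the
    (1 - alpha) quantile of these radii.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)

    # Ordinary sample covariance of the full data set.
    S = np.cov(X, rowvar=False, bias=True)

    deviations = np.empty(n_boot)
    for b in range(n_boot):
        # Resample rows with replacement and recompute the covariance.
        idx = rng.integers(0, n, size=n)
        S_boot = np.cov(X[idx], rowvar=False, bias=True)
        # Largest absolute entrywise deviation from the original estimate.
        deviations[b] = np.max(np.abs(S_boot - S))

    # A smaller alpha means higher confidence and a larger penalty.
    return float(np.quantile(deviations, 1.0 - alpha))
```

Only sample covariances are computed inside the loop, so the graphical lasso itself is never fitted during selection. The returned value could then be passed once as the penalty to a graphical lasso solver, for example the `alpha` argument of scikit-learn's `GraphicalLasso`, keeping in mind that penalty scaling conventions may differ between formulations.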
