Multi-Attribute Graph Estimation with Sparse-Group Non-Convex Penalties
This work addresses graph estimation for high-dimensional multi-attribute data, offering a theoretical and practical framework that builds incrementally on existing single-attribute methods.
The paper tackles the problem of inferring conditional independence graphs from multi-attribute data using a penalized log-likelihood with sparse-group non-convex penalties. On synthetic data, the sparse-group log-sum penalty significantly outperforms the lasso and SCAD penalties, yielding better F1-scores and Hamming distances.
We consider the problem of inferring the conditional independence graph (CIG) of high-dimensional Gaussian vectors from multi-attribute data. Most existing methods for graph estimation are based on single-attribute models, where a scalar random variable is associated with each node. In multi-attribute graphical models, each node represents a random vector. In this paper, we provide a unified theoretical analysis of multi-attribute graph learning using a penalized log-likelihood objective function. We consider both convex (sparse-group lasso) and non-convex (sparse-group log-sum and sparse-group smoothly clipped absolute deviation (SCAD)) penalty functions. An alternating direction method of multipliers (ADMM) approach, coupled with a local linear approximation to the non-convex penalties, is presented for optimizing the objective function. For non-convex penalties, we provide a theoretical analysis establishing local consistency in support recovery, local convexity, and precision matrix estimation in high-dimensional settings under two sets of sufficient conditions: with and without some irrepresentability conditions. We illustrate our approaches using numerical examples based on both synthetic and real data. In the synthetic data examples, the sparse-group log-sum penalized objective function significantly outperformed both the lasso penalized and the SCAD penalized objective functions, with $F_1$-score and Hamming distance as performance metrics.
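To make the performance metrics concrete, the following is a minimal Python sketch (not the authors' code) of how the $F_1$-score and Hamming distance might be computed for an estimated multi-attribute graph. It assumes m attributes per node and p nodes, so that the precision matrix is mp x mp, and it declares nodes j and k connected when the Frobenius norm of the corresponding m x m off-diagonal block exceeds a tolerance. The function names `block_edge_support` and `f1_and_hamming`, the tolerance `tol`, and the toy matrices are hypothetical. The Hamming distance is assumed here to count the edges on which the two graphs disagree, with each undirected edge counted once.

```python
import numpy as np


def block_edge_support(omega, m, tol=1e-8):
    """Return a p x p boolean adjacency matrix from an (m*p) x (m*p) precision matrix.

    Nodes j and k are treated as connected when the Frobenius norm of the
    m x m block (j, k) exceeds `tol` (an assumed thresholding rule).
    """
    mp = omega.shape[0]
    if omega.shape != (mp, mp) or mp % m != 0:
        raise ValueError("omega must be square with dimension divisible by m")
    p = mp // m
    # blocks[j, a, k, b] == omega[j*m + a, k*m + b]
    blocks = omega.reshape(p, m, p, m)
    block_norms = np.sqrt((blocks ** 2).sum(axis=(1, 3)))
    adj = block_norms > tol
    np.fill_diagonal(adj, False)  # no self-loops
    return adj


def f1_and_hamming(adj_est, adj_true):
    """Compute the F1-score and Hamming distance between two undirected graphs."""
    iu = np.triu_indices(adj_true.shape[0], k=1)  # count each edge once
    est, true = adj_est[iu], adj_true[iu]
    tp = np.sum(est & true)
    fp = np.sum(est & ~true)
    fn = np.sum(~est & true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom > 0 else 1.0  # both graphs empty: perfect match
    return float(f1), int(fp + fn)


# Toy example: p = 3 nodes, m = 2 attributes per node.
m = 2
omega_true = np.eye(6)
omega_true[0:2, 2:4] = 0.3  # true edge between nodes 0 and 1
omega_true[2:4, 0:2] = 0.3

omega_est = omega_true.copy()
omega_est[2:4, 4:6] = 0.1   # spurious edge between nodes 1 and 2
omega_est[4:6, 2:4] = 0.1

f1, hamming = f1_and_hamming(
    block_edge_support(omega_est, m), block_edge_support(omega_true, m)
)
print(f"F1 = {f1:.3f}, Hamming distance = {hamming}")
```

In this toy case the estimate recovers the one true edge and adds one spurious edge, so the sketch reports one true positive and one false positive. Thresholding on the block Frobenius norm reflects the multi-attribute setting, where an edge is absent only when the entire block of the precision matrix vanishes.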