An Efficient Pseudo-likelihood Method for Sparse Binary Pairwise Markov Network Estimation
This work offers an incremental improvement for machine learning and statistics researchers and practitioners who model high-dimensional discrete data.
The paper addresses efficient learning of sparse binary pairwise Markov networks by formulating the $L_1$ regularized pseudo-likelihood problem as a sparse multiple logistic regression problem. This formulation yields a substantial speedup without loss of accuracy, and better stability than node-wise logistic regression on unbalanced high-dimensional data.
The pseudo-likelihood method is one of the most popular algorithms for learning sparse binary pairwise Markov networks. In this paper, we formulate the $L_1$ regularized pseudo-likelihood problem as a sparse multiple logistic regression problem. In this way, many insights and optimization procedures for sparse logistic regression can be applied to the learning of discrete Markov networks. Specifically, we use the coordinate descent algorithm for generalized linear models with convex penalties, combined with strong screening rules, to solve the pseudo-likelihood problem with $L_1$ regularization. As a result, a substantial speedup is achieved without any loss of accuracy. Furthermore, this method is more stable than the node-wise logistic regression approach on unbalanced high-dimensional data when small regularization parameters are used. Thorough numerical experiments on simulated and real-world data demonstrate the advantages of the proposed method.
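To make the reformulation concrete, the following is a minimal sketch of how the pseudo-likelihood of a binary pairwise Markov network can be stacked into a single logistic regression problem. It is an illustration under stated assumptions, not the paper's implementation: variables are assumed to be coded as 0/1, the edge parameters are assumed symmetric (one shared coefficient per node pair), and the function names `stack_pseudolikelihood`, `logistic_nll`, and `neg_pseudo_loglik` are hypothetical. Each sample contributes one row per node; the row has a node-specific intercept indicator and, for every edge touching that node, the value of the neighbouring variable.

```python
import math
from itertools import combinations


def stack_pseudolikelihood(X):
    """Stack an n x p binary (0/1) data matrix into one logistic regression.

    Returns (rows, y, edges). Each of the n*p rows has p intercept columns
    followed by one column per unordered node pair (assumed symmetric edges).
    """
    p = len(X[0])
    edges = list(combinations(range(p), 2))
    edge_col = {e: p + idx for idx, e in enumerate(edges)}
    rows, y = [], []
    for x in X:
        for j in range(p):
            feat = [0.0] * (p + len(edges))
            feat[j] = 1.0  # intercept indicator for node j
            for k in range(p):
                if k != j:
                    # shared coefficient for edge {j, k}; feature is x_k
                    feat[edge_col[(min(j, k), max(j, k))]] = float(x[k])
            rows.append(feat)
            y.append(x[j])  # response is the value of node j
    return rows, y, edges


def logistic_nll(rows, y, beta):
    """Unpenalized logistic negative log-likelihood of the stacked problem."""
    total = 0.0
    for feat, yi in zip(rows, y):
        eta = sum(f * b for f, b in zip(feat, beta))
        total += math.log1p(math.exp(eta)) - yi * eta
    return total


def neg_pseudo_loglik(X, theta):
    """Negative log pseudo-likelihood computed directly from a symmetric
    p x p matrix theta (diagonal entries act as node biases)."""
    p = len(theta)
    total = 0.0
    for x in X:
        for j in range(p):
            eta = theta[j][j] + sum(theta[j][k] * x[k] for k in range(p) if k != j)
            total += math.log1p(math.exp(eta)) - x[j] * eta
    return total


# Toy data (made up for illustration): 4 samples, 3 binary variables.
X = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
theta = [[0.2, -0.5, 0.0],
         [-0.5, -0.1, 0.8],
         [0.0, 0.8, 0.3]]

rows, y, edges = stack_pseudolikelihood(X)
# Coefficient vector: node biases first, then one entry per edge.
beta = [theta[j][j] for j in range(3)] + [theta[j][k] for j, k in edges]

# The two objectives agree up to floating-point error.
gap = abs(logistic_nll(rows, y, beta) - neg_pseudo_loglik(X, theta))
```

Because the stacked objective coincides with the negative log pseudo-likelihood, adding an $L_1$ penalty on the edge columns gives a sparse logistic regression problem that a standard solver can handle. Leaving the intercept columns unpenalized is a common choice and is assumed here rather than taken from the abstract.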