ML LG COFeb 12, 2024

Scalable Structure Learning for Sparse Context-Specific Systems

Felix Leopoldo Rios, Alex Markham, Liam Solus

arXiv:2402.07762v27.52 citationsh-index: 12Has Code

Originality Incremental advance

AI Analysis

This work addresses scalability in structure learning for context-specific systems, which is important for researchers and practitioners in machine learning and statistics dealing with high-dimensional categorical data, though it appears incremental as it builds on existing Markov chain Monte-Carlo methods with a novel sparsity assumption.

The paper tackles the problem of learning context-specific graphical models for categorical variables, which is computationally challenging due to scalability issues in existing methods. It presents an algorithm that scales to hundreds of variables, achieving good performance on synthetic and real-world data in terms of accuracy and scalability.

Several approaches to graphically representing context-specific relations among jointly distributed categorical variables have been proposed, along with structure learning algorithms. While existing optimization-based methods have limited scalability due to the large number of context-specific models, the constraint-based methods are more prone to error than even constraint-based directed acyclic graph learning algorithms since more relations must be tested. We present an algorithm for learning context-specific models that scales to hundreds of variables. Scalable learning is achieved through a combination of an order-based Markov chain Monte-Carlo search and a novel, context-specific sparsity assumption that is analogous to those typically invoked for directed acyclic graphical models. Unlike previous Markov chain Monte-Carlo search methods, our Markov chain is guaranteed to have the true posterior of the variable orderings as the stationary distribution. To implement the method, we solve a first case of an open problem recently posed by Alon and Balogh. Future work solving increasingly general instances of this problem would allow our methods to learn increasingly dense models. The method is shown to perform well on synthetic data and real world examples, in terms of both accuracy and scalability.

View on arXiv PDF Code

Similar