AI · Jan 15, 2014

Optimal Value of Information in Graphical Models

arXiv:1401.3474v1 · 129 citations

Originality: Highly original

AI Analysis

This work addresses the problem of optimal observation selection for decision-makers in fields like sensor networks and healthcare, offering foundational algorithmic advances rather than incremental improvements.

The paper presents the first efficient optimal algorithms for selecting observations in a class of probabilistic graphical models, such as Hidden Markov Models, to reduce uncertainty in decision-making tasks such as sensor selection and medical testing. It also proves that optimizing value of information is $NP^{PP}$-hard even for polytrees, and that computing commonly used decision-theoretic value of information objective functions is #P-complete even for Naive Bayes models.

Many real-world decision-making tasks require us to choose among several expensive observations. In a sensor network, for example, it is important to select the subset of sensors that is expected to provide the strongest reduction in uncertainty. In medical decision-making tasks, one needs to select which tests to administer before deciding on the most effective treatment. It has been general practice to use heuristic-guided procedures for selecting observations. In this paper, we present the first efficient optimal algorithms for selecting observations for a class of probabilistic graphical models. For example, our algorithms allow one to optimally label hidden variables in Hidden Markov Models (HMMs). We provide results both for selecting the optimal subset of observations and for obtaining an optimal conditional observation plan. Furthermore, we prove a surprising result: in most graphical model tasks, if one designs an efficient algorithm for chain graphs, such as HMMs, this procedure can be generalized to polytree graphical models. In contrast, we prove that optimizing value of information is $NP^{PP}$-hard even for polytrees. It also follows from our results that just computing decision-theoretic value of information objective functions, which are commonly used in practice, is a #P-complete problem even on Naive Bayes models (a simple special case of polytrees). In addition, we consider several extensions, such as using our algorithms for scheduling observation selection for multiple sensors. We demonstrate the effectiveness of our approach on several real-world datasets, including a prototype sensor network deployment for energy conservation in buildings.
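
To make the subset-selection objective in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm. It assumes a toy Naive Bayes model with made-up probabilities, uses expected entropy reduction as one possible measure of "reduction in uncertainty", and enumerates every subset and every outcome. All names (`PRIOR`, `LIKELIHOOD`, `information_gain`, `best_subset`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product
from math import log2

# Hypothetical toy Naive Bayes model: binary class Y and three binary sensors.
PRIOR = [0.5, 0.5]  # P(Y = y)
# LIKELIHOOD[i][y] = P(X_i = 1 | Y = y); values are invented for illustration.
LIKELIHOOD = [
    [0.1, 0.9],  # sensor 0: informative
    [0.4, 0.6],  # sensor 1: weakly informative
    [0.5, 0.5],  # sensor 2: uninformative
]


def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in dist if p > 0)


def joint(y, assignment):
    """P(Y = y, X_A = assignment), where assignment maps sensor index to 0/1."""
    p = PRIOR[y]
    for i, x in assignment.items():
        p_one = LIKELIHOOD[i][y]
        p *= p_one if x == 1 else 1.0 - p_one
    return p


def expected_posterior_entropy(subset):
    """E over outcomes of the chosen sensors of H(Y | outcome)."""
    total = 0.0
    # Enumerates 2^|subset| outcomes, so cost grows exponentially.
    for values in product([0, 1], repeat=len(subset)):
        assignment = dict(zip(subset, values))
        weights = [joint(y, assignment) for y in range(len(PRIOR))]
        evidence = sum(weights)
        if evidence == 0:
            continue
        posterior = [w / evidence for w in weights]
        total += evidence * entropy(posterior)
    return total


def information_gain(subset):
    """Expected reduction in uncertainty about Y from observing `subset`."""
    return entropy(PRIOR) - expected_posterior_entropy(subset)


def best_subset(k):
    """Exhaustively pick the size-k sensor subset with the largest gain."""
    sensors = range(len(LIKELIHOOD))
    return max(combinations(sensors, k), key=information_gain)
```

Calling `best_subset(1)` on this toy model picks the informative sensor 0, and `best_subset(2)` adds the weak sensor 1 rather than the uninformative sensor 2. The exhaustive enumeration over subsets and outcomes is exactly the cost that efficient optimal algorithms for chain-structured models are meant to avoid.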
