Data Obsolescence Detection in the Light of Newly Acquired Valid Observations
This addresses data obsolescence for database management in uncertain environments, such as healthcare, but is incremental as it builds on existing Bayesian network methods.
The paper tackles the problem of detecting obsolete information in databases by introducing a real-time method to identify contradictions using a Bayesian network and a new ε-Contradiction concept, demonstrating effectiveness on an elderly fall-prevention database with systematically very good results.
The information describing the conditions of a system or a person is constantly evolving and may become obsolete and contradict other information. A database, therefore, must be consistently updated upon the acquisition of new valid observations that contradict obsolete ones contained in the database. In this paper, we propose a novel approach for dealing with the information obsolescence problem. Our approach aims to detect, in real-time, contradictions between observations and then identify the obsolete ones, given a representation model. Since we work within an uncertain environment characterized by the lack of information, we choose to use a Bayesian network as our representation model and propose a new approximate concept, $ε$-Contradiction. The new concept is parameterised by a confidence level of having a contradiction in a set of observations. We propose a polynomial-time algorithm for detecting obsolete information. We show that the resulting obsolete information is better represented by an AND-OR tree than a simple set of observations. Finally, we demonstrate the effectiveness of our approach on a real elderly fall-prevention database and showcase how this tree can be used to give reliable recommendations to doctors. Our experiments give systematically and substantially very good results.