Toward Universal Laws of Outlier Propagation
This work provides a foundational framework for outlier detection and attribution in causal networks; it is incremental in that it extends existing laws of randomness conservation.
The paper uses Algorithmic Information Theory to unify outlier detection across different anomalous features and to attribute outliers to their root causes among the causal mechanisms. It shows that randomness deficiency decomposes across mechanisms and that weak outliers cannot cause strong ones.
When a variety of anomalous features motivate flagging different samples as outliers, Algorithmic Information Theory (AIT) offers a principled way to unify them in terms of a sample's randomness deficiency. Subject to the algorithmic Markov condition on a causal Bayesian network, we show that the randomness deficiency of a joint sample decomposes into a sum of randomness deficiencies at each causal mechanism. Consequently, anomalous observations can be attributed to their root causes, i.e., the mechanisms that behaved anomalously. As an extension of Levin's law of randomness conservation, we show that weak outliers cannot cause strong ones. We show how these information-theoretic laws clarify our understanding of outlier detection and attribution in the context of more specialized outlier scores from prior literature.
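
The decomposition and root-cause attribution described above can be illustrated with a small sketch. Randomness deficiency is defined via Kolmogorov complexity and is not computable, so the sketch below substitutes Shannon surprisal, -log2 P, as a computable stand-in; this substitution is an assumption of the example, not the paper's score. The three-variable chain X -> Y -> Z, its probability tables, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy causal chain X -> Y -> Z over binary variables.
# Each table is one causal mechanism: P(X), P(Y | X), P(Z | Y).
P_X = {0: 0.9, 1: 0.1}
P_Y_GIVEN_X = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.2, 1: 0.8}}
P_Z_GIVEN_Y = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.1, 1: 0.9}}


def surprisal(p):
    """Shannon surprisal in bits; a computable proxy, NOT randomness deficiency."""
    return -math.log2(p)


def mechanism_scores(x, y, z):
    """Per-mechanism surprisal of each variable given its causal parent."""
    return {
        "X": surprisal(P_X[x]),
        "Y|X": surprisal(P_Y_GIVEN_X[x][y]),
        "Z|Y": surprisal(P_Z_GIVEN_Y[y][z]),
    }


def joint_surprisal(x, y, z):
    """Surprisal of the joint sample under the Markov factorization."""
    return surprisal(P_X[x] * P_Y_GIVEN_X[x][y] * P_Z_GIVEN_Y[y][z])


def root_cause(x, y, z):
    """Attribute the anomaly to the mechanism with the largest score."""
    scores = mechanism_scores(x, y, z)
    return max(scores, key=scores.get)


# Z = 1 is rare marginally, but here it follows normally from Y = 1;
# the mechanism that behaved anomalously is Y | X.
sample = (0, 1, 1)
scores = mechanism_scores(*sample)
total = joint_surprisal(*sample)
print(scores, total, root_cause(*sample))
```

In this toy sample the joint score equals the sum of the per-mechanism scores, and the largest term points at the mechanism Y | X, even though the downstream value of Z also looks unusual in isolation. The sketch only mirrors the additive structure of the result; it does not capture the complexity term that distinguishes randomness deficiency from surprisal.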