15.7PRApr 11
Empirical Measures and Strong Laws of Large Numbers in Categorical ProbabilityTobias Fritz, Tomáš Gonda, Antonio Lorenzin et al.
The Glivenko--Cantelli theorem is a uniform version of the strong law of large numbers. It states that for every IID sequence of random variables, the empirical measure converges to the underlying distribution (in the sense of uniform convergence of the CDF). In this work, we provide tools to study such limits of empirical measures in categorical probability. We propose two axioms, namely permutation invariance and empirical adequacy, that a morphism of type $X^{\mathbb{N}} \to X$ should satisfy to be interpretable as taking an infinite sequence as input and producing a sample from its empirical measure as output. Since not all sequences have a well-defined empirical measure, such \emph{empirical sampling morphisms} live in quasi-Markov categories, which, unlike Markov categories, allow for partial morphisms. Given an empirical sampling morphism and a few other properties, we prove representability as well as abstract versions of the de Finetti theorem, the Glivenko--Cantelli theorem and the strong law of large numbers. We provide several concrete constructions of empirical sampling morphisms as partially defined Markov kernels on standard Borel spaces. Instantiating our abstract results then recovers the standard Glivenko--Cantelli theorem and the strong law of large numbers for random variables with finite first moment. Our work thus provides a joint proof of these two theorems in conjunction with the de Finetti theorem from first principles.
20.7CTMay 14
Approaching the Continuous from the Discrete: an Infinite Tensor Product ConstructionAntonio Lorenzin, Fabio Zanasi
Increasingly in recent years, probabilistic computation has been investigated through the lenses of categorical algebra, especially via string diagrammatic calculi. Whereas categories of discrete and Gaussian probabilistic processes have been thoroughly studied, with various axiomatisation results, more expressive classes of continuous probability are less understood, because of the intrinsic difficulty of describing infinite behaviour by algebraic means. In this work, we establish a universal construction that adjoins infinite tensor products, allowing continuous probability to be investigated from discrete settings. Our main result applies this construction to $\mathsf{FinStoch}$, the category of finite sets and stochastic matrices, obtaining a category of locally constant Markov kernels, where the objects are finite sets plus the Cantor space $2^{\mathbb{N}}$. Any probability measure on the reals can be reasoned about in this category. Furthermore, we show how to lift axiomatisation results through the infinite tensor product construction. This way we obtain an axiomatic presentation of continuous probability over countable powers of $2=\lbrace 0,1\rbrace$.
AIDec 10, 2025
Bayesian Networks, Markov Networks, Moralisation, Triangulation: a Categorical PerspectiveAntonio Lorenzin, Fabio Zanasi
Moralisation and Triangulation are transformations allowing to switch between different ways of factoring a probability distribution into a graphical model. Moralisation allows to view a Bayesian network (a directed model) as a Markov network (an undirected model), whereas triangulation addresses the opposite direction. We present a categorical framework where these transformations are modelled as functors between a category of Bayesian networks and one of Markov networks. The two kinds of network (the objects of these categories) are themselves represented as functors from a `syntax' domain to a `semantics' codomain. Notably, moralisation and triangulation can be defined inductively on such syntax via functor pre-composition. Moreover, while moralisation is fully syntactic, triangulation relies on semantics. This leads to a discussion of the variable elimination algorithm, reinterpreted here as a functor in its own right, that splits the triangulation procedure in two: one purely syntactic, the other purely semantic. This approach introduces a functorial perspective into the theory of probabilistic graphical models, which highlights the distinctions between syntactic and semantic modifications.
AIMar 14, 2025
An Algebraic Approach to Moralisation and Triangulation of Probabilistic Graphical ModelsAntonio Lorenzin, Fabio Zanasi
Moralisation and Triangulation are transformations allowing to switch between different ways of factoring a probability distribution into a graphical model. Moralisation allows to view a Bayesian network (a directed model) as a Markov network (an undirected model), whereas triangulation works in the opposite direction. We present a categorical framework where these transformations are modelled as functors between a category of Bayesian networks and one of Markov networks. The two kinds of network (the objects of these categories) are themselves represented as functors, from a `syntax' domain to a `semantics' codomain. Notably, moralisation and triangulation are definable inductively on such syntax, and operate as a form of functor pre-composition. This approach introduces a modular, algebraic perspective in the theory of probabilistic graphical models.