Structure-aware divergences for comparing probability distributions
This work addresses the need for efficient and accurate comparisons of probability distributions in fields such as economics and ecology, where relationships between elements are crucial. The contribution is incremental in that it builds on existing divergence frameworks.
The authors address the problem of comparing probability distributions over structured domains, where standard divergences ignore similarities between elements, by introducing structure-aware divergences that incorporate the geometry of the domain. They show that these divergences recover planted patterns in a synthetic clustering task that conventional divergences miss, and that they are orders of magnitude faster than optimal transport distances. Applications are demonstrated in economic geography and ecology.
Many natural and social science systems are described using probability distributions over elements that are related to each other: for instance, occupations with shared skills or species with similar traits. Standard information-theoretic quantities such as entropies and $f$-divergences treat elements interchangeably and are blind to the similarity structure. We introduce a family of divergences that are sensitive to the geometry of the underlying domain. By virtue of being the Bregman divergences of structure-aware entropies, they provide a framework that retains several advantages of Kullback-Leibler divergence and Shannon entropy. Structure-aware divergences recover planted patterns in a synthetic clustering task that conventional divergences miss and are orders of magnitude faster than optimal transport distances. We demonstrate their applicability in economic geography and ecology, where structure plays an important role. Modelling different notions of occupation relatedness yields qualitatively different regionalisations of the geographic distribution of occupations. Our methods also reproduce established insights into functional $\beta$-diversity in ecology obtained with optimal transport methods.
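The Bregman construction behind these divergences can be illustrated with a minimal sketch. It does not implement the structure-aware entropies of this work, whose definitions are not given here. As an assumption for illustration, it takes the ordinary negative Shannon entropy as the convex generator $F$ and checks the standard fact that the Bregman divergence $D_F(p \,\|\, q) = F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle$ then coincides with Kullback-Leibler divergence for normalised distributions. All function names are hypothetical.

```python
import math

def neg_shannon_entropy(p):
    """Convex generator F(p) = sum_i p_i log p_i, with 0 log 0 taken as 0."""
    return sum(x * math.log(x) for x in p if x > 0)

def grad_neg_shannon_entropy(q):
    """Gradient of F at q: dF/dq_i = log q_i + 1 (requires every q_i > 0)."""
    return [math.log(x) + 1.0 for x in q]

def bregman_divergence(F, grad_F, p, q):
    """Generic Bregman divergence D_F(p || q) = F(p) - F(q) - <grad F(q), p - q>."""
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner

def kl_divergence(p, q):
    """Kullback-Leibler divergence, computed directly for comparison."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two illustrative distributions over three elements (made-up numbers).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

d_bregman = bregman_divergence(neg_shannon_entropy, grad_neg_shannon_entropy, p, q)
d_kl = kl_divergence(p, q)
print(abs(d_bregman - d_kl) < 1e-12)
```

Swapping in a different convex generator and its gradient changes the divergence while leaving `bregman_divergence` untouched. This is the sense in which a family of entropies induces a family of divergences.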