DL · Jun 3

A Note on the Kullback-Leibler Divergence in Discretized Empirical Distributions

arXiv:2606.04852
Predicted impact: top 66% in DL · last 90 days
Originality: Synthesis-oriented
AI Analysis

Provides a conceptual clarification for researchers using KL divergence in empirical distribution comparisons, but is incremental in nature.

This note clarifies that the sign of the Kullback-Leibler difference Δ_KL(p,q) does not indicate support inclusion or coverage, but rather reflects asymmetric probability-mass placement, illustrated with a bibliometric example on COVID-19 preprint topics.

When empirical objects are represented as discrete probability distributions, within-distribution summaries such as Shannon entropy and Hill-type diversity indices describe how probability mass is spread inside each object, while Kullback-Leibler (KL) divergence provides pairwise asymmetric information. This note focuses on the KL difference $\Delta_{\mathrm{KL}}(p,q)=D_{\mathrm{KL}}(p\|q)-D_{\mathrm{KL}}(q\|p)$. Although $\Delta_{\mathrm{KL}}$ can add information beyond within-distribution summaries and symmetric overlap, its sign does not, by itself, establish support inclusion, coverage, or breadth. It is better understood as a weighted category-wise log-ratio contrast reflecting asymmetric probability-mass placement. The point becomes clear once the definition is written out. The aim of this note is therefore to present it in a compact, example-based form, together with a descriptive bibliometric illustration based on COVID-19-related preprint-server topic distributions.
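Writing out the definition gives $\Delta_{\mathrm{KL}}(p,q)=\sum_i (p_i+q_i)\log(p_i/q_i)$, which is the category-wise log-ratio contrast the abstract refers to. The following is a minimal Python sketch of that identity, not code from the note. The function names and the three toy distributions are illustrative assumptions, and all probabilities are assumed strictly positive so that both KL terms are finite. It compares two distributions of different concentration against a uniform reference on the same support.

```python
import math


def kl(p, q):
    """D_KL(p || q) in nats. Assumes all entries of p and q are strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def kl_difference(p, q):
    """Delta_KL(p, q) = D_KL(p || q) - D_KL(q || p)."""
    return kl(p, q) - kl(q, p)


def contributions(p, q):
    """Category-wise terms (p_i + q_i) * log(p_i / q_i); they sum to Delta_KL(p, q)."""
    return [(pi + qi) * math.log(pi / qi) for pi, qi in zip(p, q)]


# Hypothetical toy distributions over three categories (not data from the note).
uniform = [1 / 3, 1 / 3, 1 / 3]
peaked = [0.8, 0.1, 0.1]
mild = [0.6, 0.2, 0.2]

for name, p in [("peaked", peaked), ("mild", mild)]:
    delta = kl_difference(p, uniform)
    terms = contributions(p, uniform)
    # The per-category terms add up to the KL difference.
    assert math.isclose(sum(terms), delta, abs_tol=1e-12)
    sign = "negative" if delta < 0 else "positive"
    print(f"{name}: Delta_KL = {delta:.4f} ({sign}), terms = {[round(t, 4) for t in terms]}")
```

All three distributions share the same support, and the uniform one is the broadest by any entropy measure, yet the sign of the difference is negative for the peaked distribution and positive for the mild one. In this toy setting the sign tracks where the large log-ratios fall and how heavily they are weighted, not which distribution covers more categories.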
