LG | ML | Feb 25, 2020

Analysis of Discriminator in RKHS Function Space for Kullback-Leibler Divergence Estimation

arXiv:2002.11187v4 | 3 citations
AI Analysis

This work addresses instability in KL divergence estimation for large-scale machine learning applications, offering a theoretical foundation and a practical remedy. The contribution is incremental, since it builds on existing GAN-based approaches.

The paper tackles instability in scalable sample-based estimators of the Kullback-Leibler (KL) divergence by analyzing a generative adversarial network approach. It attributes the high fluctuations in the estimates to uncontrolled discriminator complexity, proposes a theoretical remedy that constructs the discriminator in a Reproducing Kernel Hilbert Space (RKHS), and supports the method with controlled experiments.

Several scalable sample-based methods to compute the Kullback-Leibler (KL) divergence between two distributions have been proposed and applied in large-scale machine learning models. While they have been found to be unstable, the theoretical root cause of the problem is not clear. In this paper, we study a generative adversarial network-based approach that uses a neural network discriminator to estimate KL divergence. We argue that, in such a case, high fluctuations in the estimates are a consequence of not controlling the complexity of the discriminator function space. We provide a theoretical underpinning and remedy for this problem by first constructing a discriminator in the Reproducing Kernel Hilbert Space (RKHS). This enables us to leverage sample complexity and mean embedding to theoretically relate the error probability bound of the KL estimates to the complexity of the discriminator in RKHS. Based on this theory, we then present a scalable way to control the complexity of the discriminator for a reliable estimation of KL divergence. We support both our proposed theory and method to control the complexity of the RKHS discriminator through controlled experiments.
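To make the idea of discriminator-based KL estimation concrete, here is a minimal sketch and not the authors' implementation. It relies on the standard fact that, with equal sample sizes, the logit of an optimal discriminator between p and q equals log p(x)/q(x), so averaging the logit over samples from p estimates KL(p || q). As an assumption for illustration, the discriminator is a linear model on random Fourier features (an approximation to an RBF-kernel RKHS), and an L2 penalty `lam` on the weights stands in for controlling the RKHS norm. The names `rff`, `fit_logistic`, and `estimate_kl`, and all hyperparameters, are hypothetical.

```python
import numpy as np

def rff(x, W, b):
    # Random Fourier features approximating an RBF kernel (assumed kernel choice).
    return np.sqrt(2.0 / W.shape[1]) * np.cos(x @ W + b)

def fit_logistic(Phi, y, lam, steps=1000):
    # Gradient descent on the mean logistic loss plus (lam / 2) * ||w||^2.
    # Each feature vector has squared norm <= 2, so the gradient is
    # (0.5 + lam)-Lipschitz and this step size guarantees descent.
    lr = 1.0 / (0.5 + lam)
    w = np.zeros(Phi.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Phi @ w)))
        grad = Phi.T @ (p - y) / len(y) + lam * w
        w -= lr * grad
    return w

def estimate_kl(xp, xq, lam, n_features=200, bandwidth=1.0, seed=0):
    # Train a discriminator to separate samples of p (label 1) from samples
    # of q (label 0), then average its logit over the samples from p.
    rng = np.random.default_rng(seed)
    d = xp.shape[1]
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    Phi = rff(np.vstack([xp, xq]), W, b)
    y = np.concatenate([np.ones(len(xp)), np.zeros(len(xq))])
    w = fit_logistic(Phi, y, lam)
    return float(np.mean(rff(xp, W, b) @ w))

# Toy check: p = N(1, 1), q = N(0, 1), for which the true KL(p || q) is 0.5.
rng = np.random.default_rng(42)
xp = rng.normal(1.0, 1.0, size=(1000, 1))
xq = rng.normal(0.0, 1.0, size=(1000, 1))

kl_weak = estimate_kl(xp, xq, lam=1e-3)    # lightly constrained discriminator
kl_strong = estimate_kl(xp, xq, lam=10.0)  # heavily constrained discriminator
print(kl_weak, kl_strong)
```

The two calls illustrate the trade-off the abstract describes: a heavily constrained discriminator gives a stable estimate that is shrunk toward zero, while a weakly constrained one is less biased but more sensitive to the particular samples drawn.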
