A kernel Stein test of goodness of fit for sequential models
This work addresses a specific problem for researchers modeling sequential data, but the contribution is incremental because it builds on existing KSD methods.
The authors tackle goodness-of-fit evaluation for probability densities over observations of varying dimensionality, such as text documents or variable-length sequences, by extending the kernel Stein discrepancy (KSD) to the variable-dimension setting and proposing a novel test. The test is reported to perform well on discrete sequential data benchmarks, though the abstract gives no concrete numbers.
We propose a goodness-of-fit measure for probability densities modeling observations with varying dimensionality, such as text documents of differing lengths or variable-length sequences. The proposed measure is an instance of the kernel Stein discrepancy (KSD), which has been used to construct goodness-of-fit tests for unnormalized densities. The KSD is defined by its Stein operator: current operators used in testing apply to fixed-dimensional spaces. As our main contribution, we extend the KSD to the variable-dimension setting by identifying appropriate Stein operators, and propose a novel KSD goodness-of-fit test. As with the previous variants, the proposed KSD does not require the density to be normalized, allowing the evaluation of a large class of models. Our test is shown to perform well in practice on discrete sequential data benchmarks.
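To make concrete why a KSD needs only an unnormalized density, the following minimal sketch computes the standard fixed-dimensional KSD (a V-statistic estimate of its square) for a one-dimensional continuous model. It assumes the Langevin Stein operator and an RBF base kernel, which is the familiar fixed-dimension construction and not the variable-dimension operator proposed in the paper. The names `stein_kernel`, `ksd_v_statistic`, and `gaussian_score`, the bandwidth `h`, and the sample sizes are all hypothetical choices made for illustration.

```python
import math
import random


def stein_kernel(x, y, score, h=1.0):
    """Langevin Stein kernel u_p(x, y) in 1-D with an RBF base kernel.

    `score` is d/dx log p(x); any normalizing constant of p cancels in it.
    """
    d = x - y
    k = math.exp(-d * d / (2.0 * h * h))          # k(x, y)
    dk_dx = -d / (h * h) * k                      # derivative of k in x
    dk_dy = d / (h * h) * k                       # derivative of k in y
    d2k = (1.0 / (h * h) - d * d / h ** 4) * k    # mixed second derivative
    sx, sy = score(x), score(y)
    return sx * sy * k + sx * dk_dy + sy * dk_dx + d2k


def ksd_v_statistic(samples, score, h=1.0):
    """V-statistic estimate of the squared KSD between the samples and the model."""
    n = len(samples)
    total = sum(stein_kernel(x, y, score, h) for x in samples for y in samples)
    return total / (n * n)


def gaussian_score(x):
    # Score of the unnormalized model p(x) proportional to exp(-x^2 / 2).
    return -x


rng = random.Random(0)
matched = [rng.gauss(0.0, 1.0) for _ in range(200)]   # drawn from the model
shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]   # drawn from a shifted density

ksd_matched = ksd_v_statistic(matched, gaussian_score)
ksd_shifted = ksd_v_statistic(shifted, gaussian_score)
print(f"matched: {ksd_matched:.4f}, shifted: {ksd_shifted:.4f}")
```

The estimate for the shifted sample should be clearly larger than for the matched one. A full test would additionally calibrate a rejection threshold, for example with a bootstrap, which this sketch omits.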