LG, ML | Jun 4, 2025

Kernel conditional tests from learning-theoretic bounds

Pierre-François Massiani, Christian Fiedler, Lukas Haverbeck, Friedrich Solowjow, Sebastian Trimpe

arXiv:2506.03898v2 | 4.1 | h-index: 20

Originality: Incremental advance

AI Analysis

This work addresses the need for robust statistical tests in conditional settings, such as process monitoring and the comparison of dynamical systems. It offers a comprehensive testing framework and advances confidence bounds for vector-valued estimation, though it builds incrementally on existing kernel methods.

The paper tackles hypothesis testing on conditional probability distributions by transforming confidence bounds from kernel ridge regression into tests for conditional expectations. This enables tests of functionals such as conditional moments, as well as conditional two-sample comparisons, with guarantees that hold with high probability. The paper provides theoretical guarantees for infinite-dimensional outputs and non-trace-class kernels, and introduces practical bootstrapping schemes for implementation.
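The step from a confidence bound to a test can be written out in a few lines. The sketch below uses notation that does not appear in the summary and is assumed here for illustration: $\hat f$ is the learned estimate, $B_\delta(x)$ is a confidence width that holds uniformly in $x$ with probability at least $1-\delta$, and $f_0$ is a hypothesized conditional expectation. It shows the generic argument only, not the specific bounds derived in the paper.

```latex
% Assumed: a uniform confidence bound for the estimator \hat f.
\[
  \mathbb{P}\Big( \forall x:\ \big\| \hat f(x) - \mathbb{E}[Y \mid X = x] \big\| \le B_\delta(x) \Big) \ge 1 - \delta .
\]
% One-sample test of H_0(x): \mathbb{E}[Y \mid X = x] = f_0(x).
\[
  \text{reject } H_0(x) \iff \big\| \hat f(x) - f_0(x) \big\| > B_\delta(x) .
\]
% On the event above, no true H_0(x) is rejected, so
\[
  \mathbb{P}\big( \exists x:\ H_0(x) \text{ true and rejected} \big) \le \delta .
\]
% Two-sample version with independent-or-not estimates \hat f_1, \hat f_2:
% triangle inequality plus a union bound over the two events.
\[
  \text{reject at } x \iff \big\| \hat f_1(x) - \hat f_2(x) \big\| > B_{\delta_1}(x) + B_{\delta_2}(x),
  \qquad \text{level} \le \delta_1 + \delta_2 .
\]
```

Because the bound holds for all inputs at once, the test also localizes: it reports the inputs $x$ at which the hypothesis is rejected.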

We propose a framework for hypothesis testing on conditional probability distributions, which we then use to construct statistical tests of functionals of conditional distributions. These tests identify the inputs where the functionals differ with high probability, and include tests of conditional moments or two-sample tests. Our key idea is to transform confidence bounds of a learning method into a test of conditional expectations. We instantiate this principle for kernel ridge regression (KRR) with subgaussian noise. An intermediate data embedding then enables more general tests -- including conditional two-sample tests -- via kernel mean embeddings of distributions. To have guarantees in this setting, we generalize existing pointwise-in-time or time-uniform confidence bounds for KRR to previously-inaccessible yet essential cases such as infinite-dimensional outputs with non-trace-class kernels. These bounds also circumvent the need for independent data, allowing for instance online sampling. To make our tests readily applicable in practice, we introduce bootstrapping schemes leveraging the parametric form of testing thresholds identified in theory to avoid tuning inaccessible parameters. We illustrate the tests on examples, including one in process monitoring and comparison of dynamical systems. Overall, our results establish a comprehensive foundation for conditional testing on functionals, from theoretical guarantees to an algorithmic implementation, and advance the state of the art on confidence bounds for vector-valued least squares estimation.
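As a rough illustration of the key idea in the abstract, the following minimal sketch turns a KRR confidence width into a pointwise conditional two-sample test for scalar outputs. Everything named here is an assumption made for the example: the Gaussian kernel, the helper names `rbf_kernel`, `krr_fit`, and `conditional_two_sample_test`, the regularization `reg`, and in particular the scaling constant `beta`, which is simply fixed by hand. The paper calibrates its thresholds by bootstrapping and covers vector-valued and infinite-dimensional outputs; neither is reproduced here.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.5):
    # Gaussian kernel matrix between 1-D input arrays a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def krr_fit(x, y, reg=1.0, lengthscale=0.5):
    # Kernel ridge regression on scalar outputs.
    # Returns a predictor giving (mean, sigma) at query points, where
    # sigma^2(x) = k(x, x) - k_x^T (K + reg * I)^{-1} k_x is the usual
    # power-function-style width that scales KRR confidence bounds.
    n = len(x)
    A = rbf_kernel(x, x, lengthscale) + reg * np.eye(n)
    alpha = np.linalg.solve(A, y)

    def predict(xq):
        kq = rbf_kernel(xq, x, lengthscale)  # shape (m, n)
        mean = kq @ alpha
        # k(x, x) = 1 for the Gaussian kernel used here.
        var = 1.0 - np.sum(kq * np.linalg.solve(A, kq.T).T, axis=1)
        return mean, np.sqrt(np.maximum(var, 0.0))

    return predict


def conditional_two_sample_test(x1, y1, x2, y2, xq, beta=2.0):
    # Reject equality of the two conditional means at each query point
    # where the estimates differ by more than the summed widths.
    # beta is a hypothetical hand-picked constant (assumption): it stands
    # in for the calibrated threshold scaling and is not derived here.
    m1, s1 = krr_fit(x1, y1)(xq)
    m2, s2 = krr_fit(x2, y2)(xq)
    return np.abs(m1 - m2) > beta * (s1 + s2)
```

Calling `conditional_two_sample_test(x1, y1, x2, y2, xq)` returns a boolean array with one entry per query point, marking the inputs where the two conditional means are declared different. The level of this test depends entirely on how `beta` is chosen, which is the part the bootstrapping schemes in the paper are meant to address.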
