On Robust Hypothesis Testing with respect to the Hellinger Distance
This paper provides theoretical foundations for hypothesis testing that is robust to misspecification, relevant to statistics and machine learning practitioners dealing with distribution shift.
The paper studies robust hypothesis testing when observed samples come from distributions close to, but not exactly, the specified ones. It derives a lower bound on the slack factor needed for any test to remain robust under Hellinger distance and analyzes a test for composite hypotheses defined by Hellinger balls.
We study a variant of the simple hypothesis testing problem where observed samples do not necessarily come from either of the specified distributions, but rather from a close variant of them. In this setting, we require a test that is robust to misspecification and identifies which distribution is closer in Hellinger distance. If the underlying distribution is nearly equidistant from both hypotheses, the problem becomes intractable. Our main result is a lower bound on the slack factor, which quantifies how much closer the underlying distribution must be to one hypothesis relative to the other for any test to remain robust. We also demonstrate the implications of this result for testing with respect to symmetric chi-squared distance. Finally, we study an alternative way to specify robustness, where each hypothesis is a Hellinger ball around a fixed distribution. We provide and analyze a test for this composite hypothesis testing problem.
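To make the setup concrete, the following is a minimal sketch in Python, not the paper's test. It computes the Hellinger distance between two discrete distributions and decides which hypothesis an underlying distribution is closer to, up to a multiplicative slack. The function names, the `slack` parameter, and the decision rule `slack * H(r, p) <= H(r, q)` are assumptions made for illustration; the paper's exact definition of the slack factor may differ. The sketch also reads the underlying distribution `r` directly, whereas an actual test only sees samples drawn from it.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same support.

    Uses the normalization H(p, q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2),
    so the value lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)


def closer_hypothesis(r, p, q, slack=1.0):
    """Hypothetical oracle rule (an assumption, not the paper's test).

    Returns "p" if r is closer to p than to q by at least the factor `slack`,
    "q" in the symmetric case, and None when neither holds, i.e. when r is
    too close to equidistant for the given slack.
    """
    dist_p = hellinger(r, p)
    dist_q = hellinger(r, q)
    if slack * dist_p <= dist_q:
        return "p"
    if slack * dist_q <= dist_p:
        return "q"
    return None


if __name__ == "__main__":
    p = [0.9, 0.1]
    q = [0.1, 0.9]
    # An underlying distribution near p, and one equidistant from p and q.
    print(closer_hypothesis([0.8, 0.2], p, q, slack=2.0))
    print(closer_hypothesis([0.5, 0.5], p, q, slack=2.0))
```

The `None` branch mirrors the intractable regime described above: when the underlying distribution is nearly equidistant from both hypotheses, no answer is demanded. A larger `slack` widens that regime.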