Safe Testing
This paper provides a methodology for safe hypothesis testing that may be acceptable to the Fisherian, Neymanian, and Jeffreys-Bayesian schools, and it addresses the challenge of combining studies under optional continuation.
The paper tackles hypothesis testing under optional continuation, where the decision to run a new study may depend on previous outcomes, by developing a theory of tests based on e-values, which preserve Type-I error guarantees in that setting. It introduces growth-rate optimality (GRO) as an analogue of power, constructs GRO e-variables for composite null and alternative models, and illustrates the theory with examples such as a one-sample safe t-test and the 2 × 2 contingency table.
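For orientation, the standard definitions behind these terms can be written out as follows. This is a minimal sketch in assumed notation (the symbols E, H_0, Q and alpha are chosen here for illustration and may differ from the paper's), and the GRO line is stated only for a single alternative distribution Q; the composite case treated in the paper needs more care.

```latex
% An e-variable for the null H_0 is a nonnegative statistic E with
\mathbb{E}_{P}[E] \le 1 \quad \text{for all } P \in \mathcal{H}_0 .

% Rejecting when E >= 1/alpha is safe by Markov's inequality:
P\bigl(E \ge 1/\alpha\bigr) \le \alpha \quad \text{for all } P \in \mathcal{H}_0 .

% Growth-rate optimality against a single alternative Q:
E^{*} \in \arg\max_{E \,:\, E \text{ is an e-variable}} \; \mathbb{E}_{Q}\bigl[\log E\bigr] .
```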
We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e., they preserve Type-I error guarantees, under such optional continuation. We define growth-rate optimality (GRO) as an analogue of power in an optional continuation context, and we show how to construct GRO e-variables for general testing problems with composite null and alternative, emphasizing models with nuisance parameters. GRO e-values take the form of Bayes factors with special priors. We illustrate the theory using several classic examples including a one-sample safe t-test and the 2 × 2 contingency table. Sharing Fisherian, Neymanian and Jeffreys-Bayesian interpretations, e-values may provide a methodology acceptable to adherents of all three schools.
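The combination rule described above can be illustrated with a small simulation. The sketch below is not the paper's safe t-test or a GRO construction for composite hypotheses: it assumes the simplest possible setting, a point null N(0, 1) against a point alternative N(mu1, 1), where the likelihood ratio is an e-variable. The function names, the value mu1 = 0.5, and the study sizes are all hypothetical choices made for illustration. E-values from successive studies are multiplied, and a new study is run only if the running product has not yet reached 1/alpha, which is one form of optional continuation.

```python
import math
import random


def gaussian_lr_e_value(xs, mu1=0.5):
    """Likelihood ratio of N(mu1, 1) to N(0, 1) for the sample xs.

    Under the null N(0, 1) this ratio has expectation 1, so it is an e-variable.
    """
    log_e = sum(mu1 * x - 0.5 * mu1 ** 2 for x in xs)
    return math.exp(log_e)


def run_studies(rng, true_mean, alpha=0.05, n_per_study=20, max_studies=5, mu1=0.5):
    """Run studies one after another, multiplying their e-values.

    Whether another study is run depends on the evidence so far:
    we stop and reject as soon as the product reaches 1 / alpha.
    Returns True if the null is rejected.
    """
    product = 1.0
    for _ in range(max_studies):
        xs = [rng.gauss(true_mean, 1.0) for _ in range(n_per_study)]
        product *= gaussian_lr_e_value(xs, mu1)
        if product >= 1.0 / alpha:
            return True
    return False


def rejection_rate(true_mean, trials=2000, seed=0):
    """Fraction of simulated study sequences that end in rejection."""
    rng = random.Random(seed)
    rejections = sum(run_studies(rng, true_mean) for _ in range(trials))
    return rejections / trials


if __name__ == "__main__":
    print("rejection rate under the null:       ", rejection_rate(0.0))
    print("rejection rate under the alternative:", rejection_rate(0.5))
```

In this toy setting the rejection rate under the null should stay below alpha even though stopping depends on earlier results, while the rate under the alternative should be high. Performing the same stop-when-significant procedure with p-values from each study would not, in general, keep the Type-I error at alpha.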