Martin Dunsche

2 Papers

11.0 · ST · Mar 23
Differentially private testing for relevant dependencies in high dimensions

Patrick Bastian, Holger Dette, Martin Dunsche

We investigate the problem of detecting dependencies between the components of a high-dimensional vector. Our approach advances the existing literature in two important respects. First, we consider the problem under privacy constraints. Second, instead of testing whether the coordinates are pairwise independent, we test whether certain pairwise associations between the components (such as all pairwise Kendall's $\tau$ coefficients) do not exceed a given threshold in absolute value. Hypotheses of this form are motivated by the observation that, in the high-dimensional regime, it is rare and perhaps impossible for a null hypothesis to hold exactly with all pairwise associations precisely equal to zero. Formulating the null as a composite hypothesis makes the construction of tests non-standard even in the non-private setting. Moreover, under privacy constraints, state-of-the-art procedures rely on permutation approaches, which are invalid under a composite null. We propose a novel bootstrap-based methodology that is especially powerful in sparse settings, develop theoretical guarantees under mild assumptions, and show that the proposed method enjoys good finite-sample properties even in the high-privacy regime. Additionally, we present applications to medical data that showcase the applicability of our methodology.
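
For orientation, the relevant-dependence hypotheses described in this abstract can be written out for the Kendall's $\tau$ example roughly as follows. The symbols $p$ (dimension), $\tau_{ij}$ (Kendall's $\tau$ between components $i$ and $j$) and $\Delta$ (the threshold) are notation assumed here for illustration and may differ from the paper's own.

$$
H_0:\ \max_{1 \le i < j \le p} |\tau_{ij}| \le \Delta
\qquad \text{versus} \qquad
H_1:\ \max_{1 \le i < j \le p} |\tau_{ij}| > \Delta .
$$

Setting $\Delta = 0$ would recover the classical null hypothesis that all pairwise associations vanish, while $\Delta > 0$ gives the composite null discussed above.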

ME · Oct 15, 2021
Multivariate Mean Comparison under Differential Privacy

Martin Dunsche, Tim Kutta, Holger Dette

The comparison of multivariate population means is a central task of statistical inference. While statistical theory provides a variety of analysis tools, these usually do not protect individuals' privacy. Awareness of this can create incentives for participants in a study to conceal their true data (especially for outliers), which might result in a distorted analysis. In this paper we address this problem by developing a hypothesis test for multivariate mean comparisons that guarantees differential privacy to users. The test statistic is based on the popular Hotelling's $t^2$-statistic, which has a natural interpretation in terms of the Mahalanobis distance. In order to control the type I error, we present a bootstrap algorithm under differential privacy that provably yields a reliable test decision. In an empirical study we demonstrate the applicability of this approach.
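
To illustrate the Mahalanobis interpretation mentioned in this abstract, here is a minimal sketch of the classical, non-private two-sample Hotelling statistic with a pooled covariance estimate. It is not the paper's differentially private test or its bootstrap: no noise is added and no privacy guarantee holds. The function name `hotelling_t2` and the use of NumPy are assumptions made for this example.

```python
import numpy as np


def hotelling_t2(x, y):
    """Classical (non-private) two-sample Hotelling statistic.

    x, y: arrays of shape (n1, d) and (n2, d), one observation per row.
    Illustrative only: this provides no differential privacy.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]

    # Difference of the two sample mean vectors.
    diff = x.mean(axis=0) - y.mean(axis=0)

    # Pooled sample covariance of the two groups.
    s_x = np.atleast_2d(np.cov(x, rowvar=False))
    s_y = np.atleast_2d(np.cov(y, rowvar=False))
    s_pooled = ((n1 - 1) * s_x + (n2 - 1) * s_y) / (n1 + n2 - 2)

    # Squared Mahalanobis distance between the sample means.
    maha_sq = diff @ np.linalg.solve(s_pooled, diff)

    # Scale by the effective sample size to obtain the statistic.
    return n1 * n2 / (n1 + n2) * maha_sq


if __name__ == "__main__":
    a = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
    b = a + np.array([1.0, 0.0])  # same shape, mean shifted in one coordinate
    print(hotelling_t2(a, b))
```

The statistic is the squared Mahalanobis distance between the two sample means, scaled by $n_1 n_2 / (n_1 + n_2)$, so it grows when the means differ by a lot relative to the pooled covariance.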