ML IT ME Dec 17, 2017

Hypothesis Testing for High-Dimensional Multinomials: A Selective Review

arXiv:1712.06120v1 18.068 citations

Originality Synthesis-oriented

AI Analysis

This is an incremental survey addressing statistical challenges for researchers analyzing high-dimensional discrete data.

The paper reviews recent methods for hypothesis testing in high-dimensional multinomials. It argues that traditional tests like the χ² test can have poor power in this setting, and that focusing on tests with asymptotically Normal limits excludes many cases in which carefully designed tests with non-Normal null distributions can still be powerful.

The statistical analysis of discrete data has been the subject of extensive research dating back to the work of Pearson. In this survey we review some recently developed methods for testing hypotheses about high-dimensional multinomials. Traditional tests like the $\chi^2$ test and the likelihood ratio test can have poor power in the high-dimensional setting. Much of the research in this area has focused on finding tests with asymptotically Normal limits and developing (stringent) conditions under which tests have Normal limits. We argue that this perspective suffers from a significant deficiency: it can exclude many high-dimensional cases in which carefully designed tests can have high power despite having non-Normal null distributions. Finally, we illustrate that taking a minimax perspective and considering refinements of this perspective can lead naturally to powerful and practical tests.
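As a rough illustration of the point about non-Normal null distributions, the sketch below computes Pearson's $\chi^2$ goodness-of-fit statistic for a multinomial and calibrates it by simulating from the null rather than relying on an asymptotic limit. This is not the paper's method; the function names `pearson_chi2` and `monte_carlo_pvalue`, the simulation count, and the seed are illustrative assumptions.

```python
import numpy as np


def pearson_chi2(counts, p0):
    """Pearson chi-squared statistic for H0: the cell probabilities equal p0."""
    counts = np.asarray(counts, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    expected = counts.sum() * p0  # expected cell counts under H0
    return float(np.sum((counts - expected) ** 2 / expected))


def monte_carlo_pvalue(counts, p0, n_sim=2000, seed=0):
    """Calibrate the statistic by simulating multinomial samples under H0.

    Assumption: simulation replaces any Normal or chi-squared approximation,
    so the null distribution of the statistic may have any shape.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    p0 = np.asarray(p0, dtype=float)
    n = int(counts.sum())
    observed = pearson_chi2(counts, p0)

    # Draw n_sim null datasets and compute the statistic for each one.
    sims = rng.multinomial(n, p0, size=n_sim)
    expected = n * p0
    null_stats = ((sims - expected) ** 2 / expected).sum(axis=1)

    # The add-one correction keeps the p-value strictly positive.
    return float((1 + np.sum(null_stats >= observed)) / (1 + n_sim))
```

For example, with `d = 1000` categories and only `n = 200` observations, `monte_carlo_pvalue(counts, np.full(d, 1 / d))` still returns a usable p-value, even though most expected cell counts are far below the levels at which the classical $\chi^2$ approximation is considered reliable.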
