EM · ML · Sep 13, 2018

Valid Simultaneous Inference in High-Dimensional Settings (with the hdm package for R)

arXiv:1809.04951v1 · 8 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for reliable statistical inference on high-dimensional data in disciplines such as economics. Its contribution is incremental: it primarily reviews and packages existing methods.

The paper reviews classical and modern methods for valid simultaneous inference in high-dimensional settings, such as those with many covariates or treatment heterogeneities. It demonstrates their application in a case study with the R package hdm, which implements joint hypothesis tests and simultaneous confidence intervals for post-selection inference based on the LASSO.

Because high-dimensional empirical applications are increasingly common in many research disciplines, valid simultaneous inference is growing in importance. For instance, high-dimensional settings can arise in economic studies from very rich data sets with many potential covariates or from the analysis of treatment heterogeneities. Evaluating potentially more complicated (non-linear) functional forms of the regression relationship also produces many potential variables for which simultaneous inferential statements may be of interest. Here we provide a review of classical and modern methods for simultaneous inference in (high-dimensional) settings and illustrate their use in a case study with the R package hdm. The package implements valid, powerful, and efficient joint hypothesis tests for a potentially large number of coefficients, as well as the construction of simultaneous confidence intervals, and therefore provides useful methods for valid post-selection inference based on the LASSO.
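To make the idea of simultaneous inference concrete, the following is a minimal sketch of two classical multiple-testing corrections, Bonferroni and Holm, applied to a vector of p-values. The abstract does not name the specific methods reviewed, so the choice of these two is an assumption made for illustration. The code is plain Python and does not reproduce the hdm package's API; the function names and the example p-values are hypothetical.

```python
# Illustrative sketch only: classical family-wise error rate corrections.
# The function names and example p-values are hypothetical and are not
# taken from the paper or from the hdm package.

def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, m * p) for p in pvalues]


def holm(pvalues):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * pvalues[i])
        # Enforce monotonicity of the adjusted p-values along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    example_pvalues = [0.01, 0.04, 0.03, 0.005]  # made-up values
    print("Bonferroni:", bonferroni(example_pvalues))
    print("Holm:      ", holm(example_pvalues))
```

Holm's adjusted p-values are never larger than Bonferroni's, which is why a step-down procedure is usually preferred when both control the family-wise error rate. Neither correction uses the dependence between test statistics.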

