MEOct 16, 2024
Data-light Uncertainty Set Merging with AdmissibilityShenghao Qin, Jianliang He, Qi Kuang et al.
This article introduces a Synthetics, Aggregation, and Test inversion (SAT) approach for merging diverse and potentially dependent uncertainty sets into a single unified set. The procedure is data-light, relying only on initial sets and their nominal levels, and it flexibly adapts to user-specified input sets with possibly varying coverage guarantees. SAT is motivated by the challenge of integrating uncertainty sets when only the initial sets and their control levels are available-for example, when merging confidence sets from distributed sites under communication constraints or combining conformal prediction sets generated by different algorithms or data splits. To address this, SAT constructs and aggregates novel synthetic test statistics, and then derive merged sets through test inversion. Our method leverages the duality between set estimation and hypothesis testing, ensuring reliable coverage in dependent scenarios. A key theoretical contribution is a rigorous analysis of SAT's properties, including its admissibility in the context of deterministic set merging. Both theoretical analyses and empirical results confirm the method's finite-sample coverage validity and desirable set sizes.
MEMar 4, 2020
Large-Scale Shrinkage Estimation under Markovian DependenceBowen Gang, Gourab Mukherjee, Wenguang Sun
We consider the problem of simultaneous estimation of a sequence of dependent parameters that are generated from a hidden Markov model. Based on observing a noise contaminated vector of observations from such a sequence model, we consider simultaneous estimation of all the parameters irrespective of their hidden states under square error loss. We study the roles of statistical shrinkage for improved estimation of these dependent parameters. Being completely agnostic on the distributional properties of the unknown underlying Hidden Markov model, we develop a novel non-parametric shrinkage algorithm. Our proposed method elegantly combines \textit{Tweedie}-based non-parametric shrinkage ideas with efficient estimation of the hidden states under Markovian dependence. Based on extensive numerical experiments, we establish superior performance our our proposed algorithm compared to non-shrinkage based state-of-the-art parametric as well as non-parametric algorithms used in hidden Markov models. We provide decision theoretic properties of our methodology and exhibit its enhanced efficacy over popular shrinkage methods built under independence. We demonstrate the application of our methodology on real-world datasets for analyzing of temporally dependent social and economic indicators such as search trends and unemployment rates as well as estimating spatially dependent Copy Number Variations.
MEFeb 28, 2020
Structure-Adaptive Sequential Testing for Online False Discovery Rate ControlBowen Gang, Wenguang Sun, Weinan Wang
Consider the online testing of a stream of hypotheses where a real--time decision must be made before the next data point arrives. The error rate is required to be controlled at {all} decision points. Conventional \emph{simultaneous testing rules} are no longer applicable due to the more stringent error constraints and absence of future data. Moreover, the online decision--making process may come to a halt when the total error budget, or alpha--wealth, is exhausted. This work develops a new class of structure--adaptive sequential testing (SAST) rules for online false discover rate (FDR) control. A key element in our proposal is a new alpha--investment algorithm that precisely characterizes the gains and losses in sequential decision making. SAST captures time varying structures of the data stream, learns the optimal threshold adaptively in an ongoing manner and optimizes the alpha-wealth allocation across different time periods. We present theory and numerical results to show that the proposed method is valid for online FDR control and achieves substantial power gain over existing online testing rules.