Learning False Discovery Rate Control via Model-Based Neural Networks

arXiv:2602.05798v1
h-index: 16
AI Analysis

This work is aimed at statisticians and researchers in fields such as genomics who find existing FDR control methods overly conservative. It is an incremental improvement over existing frameworks.

The paper tackles false discovery rate (FDR) control in high-dimensional variable selection by introducing a learning-augmented enhancement of the T-Rex Selector framework. A neural network replaces the analytical estimator to approximate the false discovery proportion (FDP) more accurately, and the resulting method detects more true variables than existing approaches in simulations and a synthetic GWAS.

Controlling the false discovery rate (FDR) in high-dimensional variable selection requires balancing rigorous error control with statistical power. Existing methods with provable guarantees are often overly conservative, creating a persistent gap between the realized false discovery proportion (FDP) and the target FDR level. We introduce a learning-augmented enhancement of the T-Rex Selector framework that narrows this gap. Our approach replaces the analytical FDP estimator with a neural network trained solely on diverse synthetic datasets, enabling a substantially tighter and more accurate approximation of the FDP. This refinement allows the procedure to operate much closer to the desired FDR level, thereby increasing discovery power while maintaining effective approximate control. Through extensive simulations and a challenging synthetic genome-wide association study (GWAS), we demonstrate that our method achieves superior detection of true variables compared to existing approaches.
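
The difference between the realized FDP and the FDR mentioned in the abstract can be illustrated with a minimal Python sketch. It uses only the standard definitions: the FDP of one selection is the number of false selections divided by max(number of selections, 1), and the FDR is the expectation of the FDP over repeated experiments. The `toy_selector` function, its probabilities, and the problem sizes are hypothetical stand-ins chosen for illustration; they are not the T-Rex Selector or the paper's neural network estimator.

```python
import random


def fdp(selected, true_support):
    """Realized false discovery proportion of one selection."""
    selected, true_support = set(selected), set(true_support)
    false_discoveries = len(selected - true_support)
    # max(..., 1) makes the FDP zero when nothing is selected.
    return false_discoveries / max(len(selected), 1)


def tpp(selected, true_support):
    """True positive proportion (the per-run quantity behind power)."""
    selected, true_support = set(selected), set(true_support)
    return len(selected & true_support) / max(len(true_support), 1)


def toy_selector(rng, p, true_support, hit_prob=0.8, false_prob=0.02):
    """Hypothetical selector: picks true variables with probability
    hit_prob and null variables with probability false_prob."""
    return [
        j for j in range(p)
        if rng.random() < (hit_prob if j in true_support else false_prob)
    ]


def monte_carlo_fdr(n_trials=2000, p=200, n_true=10, seed=0):
    """Estimate FDR = E[FDP] by averaging the FDP over repeated runs."""
    rng = random.Random(seed)
    true_support = set(range(n_true))
    fdps = [
        fdp(toy_selector(rng, p, true_support), true_support)
        for _ in range(n_trials)
    ]
    return sum(fdps) / n_trials


if __name__ == "__main__":
    target_fdr = 0.1  # assumed target level, for comparison only
    estimate = monte_carlo_fdr()
    print(f"estimated FDR: {estimate:.3f} (target {target_fdr})")
```

A single run's FDP can sit above or below the target even when the average is controlled, which is why a procedure needs an estimate of the FDP to decide how many variables to select. A conservative estimate keeps the average well below the target at the cost of fewer true discoveries.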
