Covariance-Insured Screening
This work addresses biomarker discovery in high-throughput biological data. Because it builds on existing screening methods by adding correlation information, the contribution appears incremental.
The authors address the problem of detecting weak signals in ultrahigh-dimensional data, where existing screening methods often miss such signals because they ignore correlations among predictors. They propose a covariance-insured screening method intended to identify jointly informative predictors, and they examine its validity through simulations and real cancer data studies.
Modern biotechnologies have produced vast amounts of high-throughput data in which the number of predictors far exceeds the sample size. To identify novel biomarkers and understand biological mechanisms, it is vital to detect signals that are only weakly associated with outcomes among ultrahigh-dimensional predictors. However, existing screening methods, which typically ignore correlation information, are likely to miss these weak signals. By incorporating inter-feature dependence, we propose a covariance-insured screening methodology to identify predictors that are jointly informative but only marginally weakly associated with outcomes. The validity of the method is examined via extensive simulations and real data studies for selecting potential genetic factors related to the onset of cancer.
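
To make the distinction between marginal and joint association concrete, the following toy sketch may help. It is not the authors' procedure: the three-predictor design, the coefficient values, and the function names (simulate, marginal_scores, joint_scores) are all assumptions chosen for illustration. The second predictor is constructed so that its marginal covariance with the outcome is zero in the population, even though its regression coefficient is nonzero; a score that uses the predictor covariance matrix recovers it.

```python
import numpy as np


def simulate(n=2000, rho=0.6, seed=0):
    """Hypothetical toy design: x1 and x2 are correlated, x3 is pure noise."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho, 0.0],
                    [rho, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    X = rng.multivariate_normal(np.zeros(3), cov, size=n)
    # With beta = (1, -rho, 0), Cov(x2, y) = rho * 1 + 1 * (-rho) = 0,
    # so x2 is marginally uncorrelated with y but jointly informative.
    beta = np.array([1.0, -rho, 0.0])
    y = X @ beta + 0.5 * rng.standard_normal(n)
    return X, y, beta


def marginal_scores(X, y):
    """Sample covariance of each predictor with y (correlations ignored)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return Xc.T @ yc / len(y)


def joint_scores(X, y):
    """Marginal scores adjusted by the inverse sample covariance of X.

    This is ordinary least squares, feasible here only because p << n.
    It stands in for any covariance-aware score; it is an assumption of
    this sketch, not the screening statistic from the paper.
    """
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / len(y)
    return np.linalg.solve(S, marginal_scores(X, y))


if __name__ == "__main__":
    X, y, beta = simulate()
    print("true coefficients:", beta)
    print("marginal scores:  ", np.round(marginal_scores(X, y), 3))
    print("joint scores:     ", np.round(joint_scores(X, y), 3))
```

In this sketch, a screen that ranks predictors by marginal score would place the second predictor alongside the noise variable, while the covariance-adjusted score separates the two. In the ultrahigh-dimensional setting the sample covariance is not invertible, so a direct solve like the one above is not available and some structured use of the covariance is needed instead.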