GN · LG · May 30

Annotation-Informed Block-Sparse Bayesian Modeling for cis-Expression Prediction

arXiv:2606.00483 · 64.8 · h-index: 15
Predicted impact: top 33% in GN · last 90 days · Originality: Incremental advance
AI Analysis

For researchers using genotype-based expression prediction and TWAS, this method improves prediction accuracy and biological interpretability by leveraging LD structure and functional priors.

The authors developed bsBSLMM, an extension of BSLMM that incorporates LD-block sparsity and a TSS-informed prior. Across 23,098 genes, it retained more predictable genes than the compared methods, improved held-out prediction for most genes shared with BSLMM, and selected variants more strongly enriched in regulatory regions. It also enhanced TWAS discovery for inflammatory bowel disease and bone mineral density.

Genotype-based cis-expression prediction depends on accurately modeling local regulatory architecture. We present the block-sparse Bayesian sparse linear mixed model (bsBSLMM), an extension of the Bayesian sparse linear mixed model (BSLMM) that incorporates linkage disequilibrium (LD)-block spike-and-slab sparsity and a transcription start site (TSS)-informed SNP inclusion prior. Across 23,098 genes from GEUVADIS European-ancestry lymphoblastoid cell lines, bsBSLMM retained more predictable genes than BSLMM, LASSO, BLUP, TIGAR elastic net, and TIGAR Dirichlet-process regression under matched evaluation criteria. Compared with BSLMM, bsBSLMM improved held-out prediction performance for most shared genes, with gains driven primarily by LD-block sparsity and further enhanced by the TSS-informed prior. Variants selected by bsBSLMM showed stronger enrichment in GM12878 DNase and H3K27ac regulatory regions than variants selected by BSLMM. In transcriptome-wide association study (TWAS) analysis, bsBSLMM recovered established inflammatory bowel disease signals, including IL23R, and identified additional genome-wide significant genes not detected by BSLMM. Independent validation in the Louisiana Osteoporosis Study reproduced the increased prediction yield across ancestries and recovered biologically relevant bone mineral density pathways in downstream TWAS and gene set enrichment analyses. These results demonstrate that incorporating LD-block structure and biologically informed SNP priors improves cis-expression prediction and enhances downstream TWAS discovery.
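
To make the two ingredients named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not give the functional form of the TSS-informed prior or the block-level hierarchy, so everything here is an assumption: the exponential decay with distance, the parameter names (`base`, `scale`, `block_prob`, `slab_sd`), and the two-level Bernoulli structure are hypothetical choices made only to show how a block-level spike-and-slab indicator and a distance-dependent SNP inclusion probability could combine when drawing effects from a prior.

```python
import numpy as np

def tss_inclusion_prior(pos, tss, base=0.05, scale=50_000.0):
    """Hypothetical TSS-informed prior (assumed form, not from the paper):
    inclusion probability decays exponentially with distance to the TSS."""
    dist = np.abs(np.asarray(pos, dtype=float) - tss)
    return base * np.exp(-dist / scale)

def sample_block_sparse_effects(pos, block_ids, tss,
                                block_prob=0.3, slab_sd=0.1, rng=None):
    """Draw SNP effects from an assumed two-level spike-and-slab prior.

    Level 1: each LD block is switched on with probability block_prob.
    Level 2: inside an active block, each SNP is included with its
             TSS-informed probability; included SNPs get a Gaussian slab.
    """
    rng = np.random.default_rng(rng)
    pos = np.asarray(pos)
    block_ids = np.asarray(block_ids)
    pi = tss_inclusion_prior(pos, tss)

    # One Bernoulli draw per LD block, broadcast back to the SNPs it contains.
    blocks, inverse = np.unique(block_ids, return_inverse=True)
    block_on = rng.random(blocks.size) < block_prob
    active_block = block_on[inverse]

    # SNP-level indicator: only SNPs in active blocks can be included.
    snp_on = active_block & (rng.random(pos.size) < pi)

    # Spike at zero for excluded SNPs, Gaussian slab for included ones.
    beta = np.where(snp_on, rng.normal(0.0, slab_sd, pos.size), 0.0)
    return beta, pi

# Usage with made-up positions: two LD blocks, TSS at position 1500.
beta, pi = sample_block_sparse_effects(
    pos=[1000, 2000, 60_000, 61_000], block_ids=[0, 0, 1, 1], tss=1500, rng=0
)
```

Under these assumptions, SNPs in an inactive block are always zero, and SNPs nearer the TSS have a higher prior chance of inclusion. A full model would infer these indicators from expression data rather than sample them from the prior.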

