Nonparametric Reduced-Rank Regression for Multi-SNP, Multi-Trait Association Mapping
This paper addresses the problem of identifying genetic associations for complex traits in genome-wide association studies, offering an incremental improvement over existing methods.
The paper tackles high-dimensional multi-SNP, multi-trait association mapping by developing a nonparametric Bayesian reduced-rank regression model that improves statistical power to identify genetic associations, as demonstrated in simulations and on real data from the HapMap phase 3 study.
Genome-wide association studies have proven to be essential for understanding the genetic basis of disease. However, many complex traits (personality traits, facial features, disease subtyping) are inherently high-dimensional, impeding simple approaches to association mapping. We developed a nonparametric Bayesian reduced-rank regression model for multi-SNP, multi-trait association mapping that does not require the rank of the linear subspace to be specified. We show in simulations and real data that our model shares strength over SNPs and over correlated traits, improving statistical power to identify genetic associations with an interpretable, SNP-supervised low-dimensional linear projection of the high-dimensional phenotype. On the HapMap phase 3 gene expression QTL study data, we identify pleiotropic expression QTLs that classical univariate tests are underpowered to find and that two-step approaches cannot recover. Our Python software, BERRRI, is publicly available on GitHub: https://github.com/ashlee1031/BERRRI.
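To make the idea of a SNP-supervised low-dimensional projection concrete, the following is a minimal sketch of classical, fixed-rank reduced-rank regression on simulated data. It is not the nonparametric Bayesian model described above and does not use the BERRRI API: the rank is supplied by hand here, whereas the paper's model does not require it to be specified. The function name `reduced_rank_regression`, the simulated genotypes, and all dimensions are illustrative assumptions.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Classical fixed-rank RRR (identity weighting), for illustration only.

    X: (n_samples, n_snps) genotype matrix; Y: (n_samples, n_traits) phenotypes.
    Returns A (n_snps, rank) and V (n_traits, rank); the coefficient matrix is A @ V.T.
    """
    # Ordinary least-squares coefficients, shape (n_snps, n_traits).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Top right singular vectors of the fitted values span the trait subspace
    # that is best explained by the SNPs.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V = Vt[:rank].T
    # Restrict the OLS coefficients to that subspace.
    A = B_ols @ V
    return A, V

# Simulated example (assumed sizes): 200 samples, 10 SNPs, 30 traits, true rank 2.
rng = np.random.default_rng(0)
n, p, q, r = 200, 10, 30, 2
X = rng.integers(0, 3, size=(n, p)).astype(float)  # minor-allele counts 0/1/2
X -= X.mean(axis=0)                                # center genotypes
A_true = rng.normal(size=(p, r))
V_true = rng.normal(size=(r, q))
Y = X @ A_true @ V_true + 0.1 * rng.normal(size=(n, q))

A, V = reduced_rank_regression(X, Y, rank=r)
B_hat = A @ V.T       # rank-constrained SNP-by-trait effect matrix
Z = Y @ V             # low-dimensional projection of the phenotype
```

Here `Z` plays the role of the low-dimensional linear projection of the high-dimensional phenotype, and each row of `A` gives one SNP's effect on that projection. Sharing the subspace `V` across all SNPs and traits is what lets a reduced-rank model pool information that separate univariate tests cannot.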