Sep 13, 2023

Tackling the dimensions in imaging genetics with CLUB-PLS

arXiv:2309.07352v2
h-index: 84
AI Analysis

This work addresses a gap in imaging genetics: analyses restricted to pre-defined phenotypes can miss complex, brain-wide patterns. The contribution is incremental, since it builds on existing PLS methods and adds a cluster bootstrap.

The authors tackled the challenge of linking high-dimensional genetic and brain imaging data by introducing CLUB-PLS, a Partial Least Squares-based framework. Applied to 33,000 subjects from the UK Biobank, it identified 107 genome-wide significant locus-phenotype pairs linked to 386 genes, of which 85 pairs exceeded the genome-wide suggestive threshold (P<1e-05) in classic GWAS or GWIS.

A major challenge in imaging genetics and similar fields is to link high-dimensional data in one domain, e.g., genetic data, to high-dimensional data in a second domain, e.g., brain imaging data. The standard approach in the area is mass univariate analysis across genetic factors and imaging phenotypes, which entails running one genome-wide association study (GWAS) for each pre-defined imaging measure. Although this approach has been tremendously successful, one shortcoming is that phenotypes must be pre-defined. Consequently, effects that are not confined to pre-selected regions of interest, or that reflect larger brain-wide patterns, can easily be missed. In this work we introduce a Partial Least Squares (PLS)-based framework, which we term Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in both domains as well as with large sample sizes. A key element of the framework is the use of a cluster bootstrap to provide robust statistics for single input features in both domains. We applied CLUB-PLS to investigate the genetic basis of surface area and cortical thickness in a sample of 33,000 subjects from the UK Biobank. We found 107 genome-wide significant locus-phenotype pairs that are linked to 386 different genes. The vast majority of these loci could be technically validated: using classic GWAS or Genome-Wide Inferred Statistics (GWIS), 85 of the locus-phenotype pairs exceeded the genome-wide suggestive threshold (P<1e-05).
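To make the general idea concrete, here is a minimal sketch of PLS between two feature blocks, with a bootstrap used to attach a stability statistic to each single feature. The abstract does not describe how the cluster bootstrap works, so this is not the CLUB-PLS procedure: it assumes an ordinary subject-level bootstrap and a simple "bootstrap ratio" (weight divided by its bootstrap standard deviation), a common convention in PLS analyses. The function names, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def first_pls_weights(X, Y):
    """Leading PLS weight pair: top singular vectors of the cross-covariance X^T Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (len(X) - 1)              # (p, q) cross-covariance matrix
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, 0], Vt[0]                     # weights for X features, Y features

def bootstrap_ratios(X, Y, n_boot=200, seed=0):
    """Per-feature stability: full-sample weight / bootstrap standard deviation.

    Assumption: plain resampling of subjects with replacement, NOT the
    cluster bootstrap of the paper (whose details are not in the abstract).
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    u_ref, v_ref = first_pls_weights(X, Y)
    us, vs = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample subjects
        u, v = first_pls_weights(X[idx], Y[idx])
        if u @ u_ref < 0:                     # singular vectors have arbitrary sign;
            u, v = -u, -v                     # align each replicate to the reference
        us.append(u)
        vs.append(v)
    us, vs = np.array(us), np.array(vs)
    return u_ref / us.std(axis=0, ddof=1), v_ref / vs.std(axis=0, ddof=1)

# Toy data (hypothetical): 500 subjects, 20 "genetic" and 15 "imaging" features.
# Genetic feature 0 drives imaging features 0, 1 and 2; everything else is noise.
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 20))
Y = rng.standard_normal((500, 15))
Y[:, :3] += X[:, [0]]

zx, zy = bootstrap_ratios(X, Y)
print("most stable genetic feature:", int(np.argmax(np.abs(zx))))
print("most stable imaging features:", sorted(np.argsort(-np.abs(zy))[:3].tolist()))
```

Unlike a mass univariate analysis, nothing here requires choosing an imaging phenotype in advance: the decomposition weights all imaging features jointly, and the resampling step indicates which individual features in each domain carry a stable contribution.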
