Orthogonalized Kernel Debiased Machine Learning for Multimodal Data Analysis
This method addresses multimodal data integration for neuroscience researchers, offering a novel statistical approach that balances interpretability and flexibility, though it is incremental in that it builds on existing orthogonality concepts.
The authors tackle the challenge of combining interpretability and flexibility in multimodal data analysis by proposing an orthogonalized kernel debiased machine learning approach. They establish root-$N$-consistency and asymptotic normality of the parameter estimates, and demonstrate the method's efficacy in simulations and an Alzheimer's disease neuroimaging study.
Multimodal imaging has transformed neuroscience research. While it presents unprecedented opportunities, it also imposes serious challenges. In particular, it is difficult to combine the interpretability of a simple association model with the flexibility of a highly adaptive nonlinear model. In this article, we propose an orthogonalized kernel debiased machine learning approach, built upon Neyman orthogonality and a form of decomposition orthogonality, for multimodal data analysis. We target the setting that naturally arises in almost all multimodal studies, where there is a primary modality of interest plus additional auxiliary modalities. We establish the root-$N$-consistency and asymptotic normality of the estimated primary parameter, the semi-parametric estimation efficiency, and the asymptotic validity of the confidence band of the predicted primary modality effect. Our proposal enjoys, to a good extent, both model interpretability and model flexibility. It is also considerably different from the existing statistical methods for multimodal data integration, as well as from the orthogonality-based methods for high-dimensional inference. We demonstrate the efficacy of our method through both simulations and an application to a multimodal neuroimaging study of Alzheimer's disease.
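The abstract does not spell out the estimator, so the following is only a minimal sketch of the generic idea behind debiased machine learning with a Neyman-orthogonal score, not the authors' orthogonalized kernel method. It assumes a partially linear model $y = x\beta + g(Z) + \varepsilon$ with a scalar primary variable `x` and auxiliary features `Z`, estimates the two nuisance functions $E[y \mid Z]$ and $E[x \mid Z]$ by a hand-written RBF kernel ridge regression with cross-fitting, and then regresses residuals on residuals. The function names (`rbf_kernel`, `krr_fit_predict`, `dml_plm`), the simulated data, and the tuning values (`gamma`, `ridge`, number of folds) are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def krr_fit_predict(Z_train, t_train, Z_test, ridge=1.0, gamma=0.5):
    """Fit kernel ridge regression on the training fold, predict on the test fold."""
    K = rbf_kernel(Z_train, Z_train, gamma)
    alpha = np.linalg.solve(K + ridge * np.eye(len(Z_train)), t_train)
    return rbf_kernel(Z_test, Z_train, gamma) @ alpha


def dml_plm(y, x, Z, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of beta in y = x*beta + g(Z) + eps."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    res_y = np.empty(n)
    res_x = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Nuisance fits use only the other folds (cross-fitting).
        res_y[test] = y[test] - krr_fit_predict(Z[train], y[train], Z[test])
        res_x[test] = x[test] - krr_fit_predict(Z[train], x[train], Z[test])
    # Solve the empirical orthogonal moment: mean(res_x * (res_y - res_x * beta)) = 0.
    beta = (res_x @ res_y) / (res_x @ res_x)
    # Plug-in sandwich standard error for the residual-on-residual regression.
    psi = res_x * (res_y - res_x * beta)
    se = np.sqrt(np.mean(psi ** 2) / np.mean(res_x ** 2) ** 2 / n)
    return beta, se


# Simulated example (hypothetical data): true beta is 1.5.
rng = np.random.default_rng(42)
n = 400
Z = rng.uniform(-2.0, 2.0, size=(n, 2))           # auxiliary features
x = np.sin(Z[:, 0]) + 0.5 * Z[:, 1] ** 2 + rng.normal(size=n)   # primary variable
y = 1.5 * x + np.cos(Z[:, 0]) * Z[:, 1] + rng.normal(size=n)    # outcome

beta_hat, se_hat = dml_plm(y, x, Z)
ci_low, ci_high = beta_hat - 1.96 * se_hat, beta_hat + 1.96 * se_hat
print(f"beta_hat = {beta_hat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The residual-on-residual step is what makes the moment condition insensitive, to first order, to errors in the two nuisance fits; that insensitivity is the property Neyman orthogonality refers to. The decomposition orthogonality and the confidence band for the predicted primary modality effect mentioned in the abstract are not represented in this sketch.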