Improving Multimodal Brain Encoding Model with Dynamic Subject-awareness Routing
This work targets robust brain encoding for naturalistic neuroimaging studies and offers a plug-and-play framework, though the contribution is incremental because it builds on existing multimodal fusion and routing methods.
The paper tackles naturalistic fMRI encoding with multimodal inputs and inter-subject variability by introducing AFIRE and MIND, and reports consistent improvements over baselines, enhanced cross-subject generalization, and interpretable expert patterns.
Naturalistic fMRI encoding must handle multimodal inputs, shifting fusion styles, and pronounced inter-subject variability. We introduce AFIRE (Agnostic Framework for Multimodal fMRI Response Encoding), an encoder-agnostic interface that standardizes time-aligned post-fusion tokens from varied encoders, and MIND, a plug-and-play Mixture-of-Experts decoder with subject-aware dynamic gating. The full pipeline is trained end-to-end for whole-brain prediction: AFIRE decouples the decoder from upstream fusion, while MIND combines token-dependent Top-K sparse routing with a subject prior to personalize expert usage without sacrificing generality. Experiments across multiple multimodal backbones and subjects show consistent improvements over strong baselines, enhanced cross-subject generalization, and interpretable expert patterns that correlate with content type. The framework offers a simple attachment point for new encoders and datasets, enabling robust, plug-and-improve performance for naturalistic neuroimaging studies.
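The routing idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that the subject prior is a per-subject vector of expert logits added to the token-dependent gate logits before Top-K selection, and that each expert is a plain linear map from token features to voxel responses. The names `subject_aware_topk_gate`, `moe_decode`, `w_gate`, `subject_prior`, and `expert_weights` are hypothetical and do not come from the paper.

```python
import numpy as np


def subject_aware_topk_gate(tokens, w_gate, subject_prior, k=2):
    """Return sparse gate weights of shape (T, E).

    tokens:        (T, D) time-aligned post-fusion tokens
    w_gate:        (D, E) token-dependent gating projection
    subject_prior: (E,)   per-subject expert logits (assumed additive)
    """
    # Assumption: the subject prior shifts the token logits additively.
    logits = tokens @ w_gate + subject_prior  # (T, E)

    # Indices of the k largest logits for each token.
    topk_idx = np.argpartition(-logits, k - 1, axis=-1)[:, :k]

    # Keep only the selected logits; all others become -inf.
    masked = np.full_like(logits, -np.inf)
    selected = np.take_along_axis(logits, topk_idx, axis=-1)
    np.put_along_axis(masked, topk_idx, selected, axis=-1)

    # Numerically stable softmax over the selected experts.
    masked -= masked.max(axis=-1, keepdims=True)
    weights = np.exp(masked)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights


def moe_decode(tokens, w_gate, subject_prior, expert_weights, k=2):
    """Mix linear experts into a (T, V) response prediction.

    expert_weights: (E, D, V) one linear map per expert (an assumption).
    """
    gates = subject_aware_topk_gate(tokens, w_gate, subject_prior, k)
    # Dense evaluation of every expert for clarity; a sparse
    # implementation would evaluate only the selected experts.
    expert_out = np.einsum("td,edv->tev", tokens, expert_weights)
    return np.einsum("te,tev->tv", gates, expert_out)
```

Under these assumptions, two subjects viewing the same stimulus share `w_gate` and the experts but differ in `subject_prior`, so the same token can be routed to different experts for each subject. Setting the prior to zeros recovers purely token-dependent routing.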