CVJul 12, 2023
SwiFT: Swin 4D fMRI TransformerPeter Yongho Kim, Junbeom Kwon, Sunghwan Joo et al.
Modeling spatiotemporal brain dynamics from high-dimensional data, such as functional Magnetic Resonance Imaging (fMRI), is a formidable task in neuroscience. Existing approaches for fMRI analysis utilize hand-crafted features, but the process of feature extraction risks losing essential information in fMRI scans. To address this challenge, we present SwiFT (Swin 4D fMRI Transformer), a Swin Transformer architecture that can learn brain dynamics directly from fMRI volumes in a memory and computation-efficient manner. SwiFT achieves this by implementing a 4D window multi-head self-attention mechanism and absolute positional embeddings. We evaluate SwiFT using multiple large-scale resting-state fMRI datasets, including the Human Connectome Project (HCP), Adolescent Brain Cognitive Development (ABCD), and UK Biobank (UKB) datasets, to predict sex, age, and cognitive intelligence. Our experimental outcomes reveal that SwiFT consistently outperforms recent state-of-the-art models. Furthermore, by leveraging its end-to-end learning capability, we show that contrastive loss-based self-supervised pre-training of SwiFT can enhance performance on downstream tasks. Additionally, we employ an explainable AI method to identify the brain regions associated with sex classification. To our knowledge, SwiFT is the first Swin Transformer architecture to process dimensional spatiotemporal brain functional data in an end-to-end fashion. Our work holds substantial potential in facilitating scalable learning of functional brain imaging in neuroscience research by reducing the hurdles associated with applying Transformer models to high-dimensional fMRI.
NCMar 30, 2025
Spatiotemporal Learning of Brain Dynamics from fMRI Using Frequency-Specific Multi-Band Attention for Cognitive and Psychiatric ApplicationsSangyoon Bae, Junbeom Kwon, Shinjae Yoo et al.
Understanding how the brain's complex nonlinear dynamics give rise to cognitive function remains a central challenge in neuroscience. While brain functional dynamics exhibits scale-free and multifractal properties across temporal scales, conventional neuroimaging analytics assume linearity and stationarity, failing to capture frequency-specific neural computations. Here, we introduce Multi-Band Brain Net (MBBN), the first transformer-based framework to explicitly model frequency-specific spatiotemporal brain dynamics from fMRI. MBBN integrates biologically-grounded frequency decomposition with multi-band self-attention mechanisms, enabling discovery of previously undetectable frequency-dependent network interactions. Trained on 49,673 individuals across three large-scale cohorts (UK Biobank, ABCD, ABIDE), MBBN sets a new state-of-the-art in predicting psychiatric and cognitive outcomes (depression, ADHD, ASD), showing particular strength in classification tasks with up to 52.5\% higher AUROC and provides a novel framework for predicting cognitive intelligence scores. Frequency-resolved analyses uncover disorder-specific signatures: in ADHD, high-frequency fronto-sensorimotor connectivity is attenuated and opercular somatosensory nodes emerge as dynamic hubs; in ASD, orbitofrontal-somatosensory circuits show focal high-frequency disruption together with enhanced ultra-low-frequency coupling between the temporo-parietal junction and prefrontal cortex. By integrating scale-aware neural dynamics with deep learning, MBBN delivers more accurate and interpretable biomarkers, opening avenues for precision psychiatry and developmental neuroscience.
IVDec 15, 2024
Macro2Micro: A Rapid and Precise Cross-modal Magnetic Resonance Imaging Synthesis using Multi-scale Structural Brain SimilaritySooyoung Kim, Joonwoo Kwon, Junbeom Kwon et al.
The human brain is a complex system requiring both macroscopic and microscopic components for comprehensive understanding. However, mapping nonlinear relationships between these scales remains challenging due to technical limitations and the high cost of multimodal Magnetic Resonance Imaging (MRI) acquisition. To address this, we introduce Macro2Micro, a deep learning framework that predicts brain microstructure from macrostructure using a Generative Adversarial Network (GAN). Based on the hypothesis that microscale structural information can be inferred from macroscale structures, Macro2Micro explicitly encodes multiscale brain information into distinct processing branches. To enhance artifact elimination and output quality, we propose a simple yet effective auxiliary discriminator and learning objective. Extensive experiments demonstrated that Macro2Micro faithfully translates T1-weighted MRIs into corresponding Fractional Anisotropy (FA) images, achieving a 6.8\% improvement in the Structural Similarity Index Measure (SSIM) compared to previous methods, while retaining the individual biological characteristics of the brain. With an inference time of less than 0.01 seconds per MR modality translation, Macro2Micro introduces the potential for real-time multimodal rendering in medical and research applications. The code will be made available upon acceptance.