P132 Brain Symphony: A Transformer-Driven Fusion of fMRI Time Series and Structural Connectivity
Moein Khajehnejad*1,2, Adeel Razi1,2,3
1Turner Institute for Brain & Mental Health, Monash University, Melbourne, Australia 2Monash Data Futures Institute, Monash University, Melbourne, Australia 3Wellcome Centre for Human Neuroimaging, University College London, United Kingdom
*Email: moein.khajehnejad@monash.edu
Introduction
Understanding brain function requires integrating multimodal neuroimaging data to capture both temporal dynamics and pairwise interactions. We propose a novel foundation model that fuses fMRI time series, structural connectivity, and effective connectivity graphs derived via Dynamic Causal Modeling (DCM) [1] to produce robust, interpretable region-of-interest (ROI) embeddings. Our approach enables representation learning that generalizes across datasets and supports downstream tasks such as disease classification and the detection of neural alterations induced by psychedelics. In addition, the model identifies the most influential brain regions and time intervals, aiding interpretability in neuroscience applications.
Methods
Our framework employs two self-supervised encoders. The fMRI encoder uses a Spatio-Temporal Transformer to model dynamic ROI embeddings. The connectivity encoder incorporates a Graph Transformer [2] and systematically evaluates several advanced graph-based approaches (signed Graph Neural Networks [3], Graph Attention Networks with edge-sign awareness [4], and Message Passing Neural Networks with edge-type features [5]) to determine the most effective strategy for capturing excitatory and inhibitory connections in the DCM-derived graphs. To preserve causal semantics, we compare and adapt sign-aware attention and positional encodings based on the signed Laplacian, random-walk differences, and global relational encodings, selecting the most suitable method by empirical performance. Cross-modal attention integrates the learned embeddings from both encoders, ensuring seamless fusion across modalities. The model is pretrained on the HCP dataset, using both fMRI time series and structural connectivity, and remains adaptable to other datasets incorporating different connectivity measures.
Results
We pretrained the model on 900 HCP participants and tested it on 67 held-out subjects and an independent psilocybin dataset (54 participants) [6]. Fig. 1a shows accurately reconstructed fMRI time series for a test subject. Fig. 1b presents reconstructed functional and structural connectivity maps, capturing both dynamic and anatomical relationships. Fig. 1c visualizes low-dimensional ROI embeddings before and after psilocybin administration, revealing clear shifts only in subjects with strong subjective effects (i.e., high MEQ scores), indicating the model's ability to capture neural alterations. This dataset was not part of pretraining, underscoring the model's transferability and generalizability.
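Two components of the Methods, the signed-Laplacian positional encoding and the cross-modal attention fusion, can be sketched in a few lines of numpy. This is a minimal illustrative sketch, not the authors' implementation: the function names, the symmetrization of the directed DCM graph, and the single-head attention with random projection weights are all assumptions made here for clarity.

```python
import numpy as np

def signed_laplacian_pe(A, k=4):
    """Node positional encoding from the signed graph Laplacian.

    A: (n, n) signed adjacency (positive = excitatory, negative = inhibitory).
    DCM effective connectivity is directed, so we symmetrize for eigh;
    the degree uses |A| so sign information is kept in the A term.
    Returns the k eigenvectors of L_s = D_bar - A_sym with smallest eigenvalues.
    """
    A_sym = (A + A.T) / 2.0                     # assumption: symmetrize directed graph
    D_bar = np.diag(np.abs(A_sym).sum(axis=1))  # unsigned degree matrix
    L_s = D_bar - A_sym                         # signed Laplacian (symmetric)
    _, v = np.linalg.eigh(L_s)                  # eigenvalues in ascending order
    return v[:, :k]                             # (n, k) positional features

def cross_modal_attention(X_fmri, X_conn, d=16, rng=None):
    """Single-head cross-attention: fMRI ROI embeddings (queries) attend
    to connectivity-encoder ROI embeddings (keys/values).

    X_fmri: (n, d_f) embeddings from the time-series encoder.
    X_conn: (n, d_c) embeddings from the graph encoder.
    Random projection weights are for illustration only.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    Wq = rng.standard_normal((X_fmri.shape[1], d)) / np.sqrt(X_fmri.shape[1])
    Wk = rng.standard_normal((X_conn.shape[1], d)) / np.sqrt(X_conn.shape[1])
    Wv = rng.standard_normal((X_conn.shape[1], d)) / np.sqrt(X_conn.shape[1])
    Q, K, V = X_fmri @ Wq, X_conn @ Wk, X_conn @ Wv
    scores = Q @ K.T / np.sqrt(d)               # scaled dot-product scores
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)     # row-wise softmax
    return attn @ V                             # (n, d) fused ROI embeddings
```

In practice the positional features would be concatenated to the graph encoder's input and the attention weights learned end to end; the sketch only shows the shapes and sign handling involved.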
Discussion
This scalable, interpretable framework advances the multimodal integration of fMRI with distinct connectivity representations, improving classification performance and causal insight. Future work will compare diffusion-based structural connectivity with DCM-derived effective connectivity to assess how causal representations affect robustness in noisy datasets with latent confounders.
Figure 1. Reconstruction and representation capabilities of the multimodal foundation model. (a) Reconstructed fMRI time series for a test subject, demonstrating model accuracy. (b) Reconstructed functional and structural connectivity maps, capturing dynamic and anatomical relationships. (c) Low-dimensional ROI representations before and after psilocybin, with greater shifts in high-MEQ subjects.
Acknowledgements
A.R. is affiliated with The Wellcome Centre for Human Neuroimaging, supported by core funding from Wellcome [203147/Z/16/Z]. A.R. is a CIFAR Azrieli Global Scholar in the Brain, Mind & Consciousness Program.
References
[1] https://doi.org/10.1016/S1053-8119(03)00202-7
[2] https://doi.org/10.48550/arXiv.2106.05234
[3] https://doi.org/10.1109/ICDM.2018.00113
[4] https://doi.org/10.48550/arXiv.1710.10903
[5] https://doi.org/10.48550/arXiv.1704.01212
[6] https://doi.org/10.1101/2025.03.09.642197