Puru Vaish

CV
h-index32
6papers
28citations
Novelty48%
AI Score49

6 Papers

34.3CVApr 13
Towards Brain MRI Foundation Models for the Clinic: Findings from the FOMO25 Challenge

Asbjørn Munk, Stefano Cerri, Vardan Nersesjan et al.

Clinical deployment of automated brain MRI analysis faces a fundamental challenge: clinical data is heterogeneous and noisy, and high-quality labels are prohibitively costly to obtain. Self-supervised learning (SSL) can address this by leveraging the vast amounts of unlabeled data produced in clinical workflows to train robust \textit{foundation models} that adapt out-of-domain with minimal supervision. However, the development of foundation models for brain MRI has been limited by small pretraining datasets and in-domain benchmarking focused on high-quality, research-grade data. To address this gap, we organized the FOMO25 challenge as a satellite event at MICCAI 2025. FOMO25 provided participants with a large pretraining dataset, FOMO60K, and evaluated models on data sourced directly from clinical workflows in few-shot and out-of-domain settings. Tasks covered infarct classification, meningioma segmentation, and brain age regression, and considered both models trained on FOMO60K (method track) and any data (open track). Nineteen foundation models from sixteen teams were evaluated using a standardized containerized pipeline. Results show that (a) self-supervised pretraining improves generalization on clinical data under domain shift, with the strongest models trained \textit{out-of-domain} surpassing supervised baselines trained \textit{in-domain}. (b) No single pretraining objective benefits all tasks: MAE favors segmentation, hybrid reconstruction-contrastive objectives favor classification, and (c) strong performance was achieved by small pretrained models, and improvements from scaling model size and training duration did not yield reliable benefits.

IVFeb 26
SegReg: Latent Space Regularization for Improved Medical Image Segmentation

Puru Vaish, Amin Ranem, Felix Meister et al.

Medical image segmentation models are typically optimised with voxel-wise losses that constrain predictions only in the output space. This leaves latent feature representations largely unconstrained, potentially limiting generalisation. We propose {SegReg}, a latent-space regularisation framework that operates on feature maps of U-Net models to encourage structured embeddings while remaining fully compatible with standard segmentation losses. Integrated with the nnU-Net framework, we evaluate SegReg on prostate, cardiac, and hippocampus segmentation and demonstrate consistent improvements in domain generalisation. Furthermore, we show that explicit latent regularisation improves continual learning by reducing task drift and enhancing forward transfer across sequential tasks without adding memory or any extra parameters. These results highlight latent-space regularisation as a practical approach for building more generalisable and continual-learning-ready models.

CVMar 4, 2024Code
Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification

Puru Vaish, Shunxin Wang, Nicola Strisciuglio

Computer vision models normally witness degraded performance when deployed in real-world scenarios, due to unexpected changes in inputs that were not accounted for during training. Data augmentation is commonly used to address this issue, as it aims to increase data variety and reduce the distribution gap between training and test data. However, common visual augmentations might not guarantee extensive robustness of computer vision models. In this paper, we propose Auxiliary Fourier-basis Augmentation (AFA), a complementary technique targeting augmentation in the frequency domain and filling the augmentation gap left by visual augmentations. We demonstrate the utility of augmentation via Fourier-basis additive noise in a straightforward and efficient adversarial setting. Our results show that AFA benefits the robustness of models against common corruptions, OOD generalization, and consistency of performance of models against increasing perturbations, with negligible deficit to the standard performance of models. It can be seamlessly integrated with other augmentation techniques to further boost performance. Code and models can be found at: https://github.com/nis-research/afa-augment

CVSep 17, 2025Code
Consistent View Alignment Improves Foundation Models for 3D Medical Image Segmentation

Puru Vaish, Felix Meister, Tobias Heimann et al.

Many recent approaches in representation learning implicitly assume that uncorrelated views of a data point are sufficient to learn meaningful representations for various downstream tasks. In this work, we challenge this assumption and demonstrate that meaningful structure in the latent space does not emerge naturally. Instead, it must be explicitly induced. We propose a method that aligns representations from different views of the data to align complementary information without inducing false positives. Our experiments show that our proposed self-supervised learning method, Consistent View Alignment, improves performance for downstream tasks, highlighting the critical role of structured view alignment in learning effective representations. Our method achieved first and second place in the MICCAI 2025 SSL3D challenge when using a Primus vision transformer and ResEnc convolutional neural network, respectively. The code and pretrained model weights are released at https://github.com/Tenbatsu24/LatentCampus.

IVMay 17, 2025
Joint Manifold Learning and Optimal Transport for Dynamic Imaging

Sven Dummer, Puru Vaish, Christoph Brune

Dynamic imaging is critical for understanding and visualizing dynamic biological processes in medicine and cell biology. These applications often encounter the challenge of a limited amount of time series data and time points, which hinders learning meaningful patterns. Regularization methods provide valuable prior knowledge to address this challenge, enabling the extraction of relevant information despite the scarcity of time-series data and time points. In particular, low-dimensionality assumptions on the image manifold address sample scarcity, while time progression models, such as optimal transport (OT), provide priors on image development to mitigate the lack of time points. Existing approaches using low-dimensionality assumptions disregard a temporal prior but leverage information from multiple time series. OT-prior methods, however, incorporate the temporal prior but regularize only individual time series, ignoring information from other time series of the same image modality. In this work, we investigate the effect of integrating a low-dimensionality assumption of the underlying image manifold with an OT regularizer for time-evolving images. In particular, we propose a latent model representation of the underlying image manifold and promote consistency between this representation, the time series data, and the OT prior on the time-evolving images. We discuss the advantages of enriching OT interpolations with latent models and integrating OT priors into latent models.

CVMay 15, 2025
Data-Agnostic Augmentations for Unknown Variations: Out-of-Distribution Generalisation in MRI Segmentation

Puru Vaish, Felix Meister, Tobias Heimann et al.

Medical image segmentation models are often trained on curated datasets, leading to performance degradation when deployed in real-world clinical settings due to mismatches between training and test distributions. While data augmentation techniques are widely used to address these challenges, traditional visually consistent augmentation strategies lack the robustness needed for diverse real-world scenarios. In this work, we systematically evaluate alternative augmentation strategies, focusing on MixUp and Auxiliary Fourier Augmentation. These methods mitigate the effects of multiple variations without explicitly targeting specific sources of distribution shifts. We demonstrate how these techniques significantly improve out-of-distribution generalization and robustness to imaging variations across a wide range of transformations in cardiac cine MRI and prostate MRI segmentation. We quantitatively find that these augmentation methods enhance learned feature representations by promoting separability and compactness. Additionally, we highlight how their integration into nnU-Net training pipelines provides an easy-to-implement, effective solution for enhancing the reliability of medical segmentation models in real-world applications.