CV | AI | IV | Feb 5, 2025

Tell2Reg: Establishing spatial correspondence between images by the same language prompts

Wen Yan, Qianye Yang, Shiqi Huang, Yipei Wang, Shonit Punwani, Mark Emberton, Vasilis Stavrinides, Yipeng Hu, Dean Barratt

arXiv:2502.03118v1 | 3.6 | h-index: 91 | Has Code | ISBI

Originality: Highly original

AI Analysis

Tell2Reg provides a training-free, potentially generalizable approach to image registration, which is particularly useful in medical imaging, where data curation and labelling are costly.

The paper establishes spatial correspondence between two images by applying the same language prompt to both with pre-trained multimodal models, eliminating the need for training or labeled data. It demonstrates the approach on inter-subject prostate MR image registration, where it outperforms the unsupervised methods tested and achieves performance comparable to weakly-supervised methods.

Spatial correspondence can be represented by pairs of segmented regions, such that image registration networks aim to segment corresponding regions rather than predict displacement fields or transformation parameters. In this work, we show that such a corresponding region pair can be predicted by the same language prompt on two different images using pre-trained large multimodal models based on GroundingDINO and SAM. This enables a fully automated and training-free registration algorithm, potentially generalisable to a wide range of image registration tasks. In this paper, we present experimental results on one challenging task, registering inter-subject prostate MR images, which involves highly variable intensity and morphology between patients. Tell2Reg is training-free, eliminating the costly and time-consuming data curation and labelling previously required for this registration task. The approach outperforms the unsupervised learning-based registration methods tested and performs comparably to weakly-supervised methods. Additional qualitative results suggest, for the first time, a potential correlation between language semantics and spatial correspondence, including the spatial invariance of language-prompted regions and the difference in language prompts between the obtained local and global correspondences. Code is available at https://github.com/yanwenCi/Tell2Reg.git.
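
To make the idea of correspondence as pairs of segmented regions concrete, the following minimal sketch assumes the region pairs already exist as binary masks (for example, produced by prompting both images with the same text). It then derives a coarse 2D affine transform from the region centroids by least squares. This is an illustration only, not the Tell2Reg implementation: the helper names `make_square`, `region_centroid` and `fit_affine_from_region_pairs` are hypothetical, the toy masks are synthetic, and the abstract does not state that an affine fit is used.

```python
import numpy as np


def make_square(shape, top, left, size):
    """Hypothetical helper: a synthetic binary mask containing one square region."""
    mask = np.zeros(shape, dtype=bool)
    mask[top:top + size, left:left + size] = True
    return mask


def region_centroid(mask):
    """Return the (row, col) centroid of a non-empty binary mask."""
    coords = np.argwhere(mask)
    if coords.size == 0:
        raise ValueError("mask is empty")
    return coords.mean(axis=0)


def fit_affine_from_region_pairs(pairs):
    """Fit a 2x3 affine matrix M from paired region masks.

    pairs: list of (mask_in_image_a, mask_in_image_b) tuples, where each
    tuple is assumed to mark the same anatomical region in both images.
    The fit minimises the squared error of centroid_b ~= M @ [row, col, 1].
    """
    if len(pairs) < 3:
        raise ValueError("need at least 3 region pairs for a 2D affine fit")
    src = np.array([region_centroid(a) for a, _ in pairs])
    dst = np.array([region_centroid(b) for _, b in pairs])
    # Homogeneous source coordinates: one row [row, col, 1] per region.
    src_h = np.hstack([src, np.ones((len(src), 1))])
    solution, *_ = np.linalg.lstsq(src_h, dst, rcond=None)
    return solution.T  # shape (2, 3)


if __name__ == "__main__":
    shape = (64, 64)
    # Three non-collinear regions in image A, each shifted by (+3, -2) in image B.
    pairs = [
        (make_square(shape, 5, 5, 6), make_square(shape, 8, 3, 6)),
        (make_square(shape, 5, 30, 6), make_square(shape, 8, 28, 6)),
        (make_square(shape, 30, 10, 6), make_square(shape, 33, 8, 6)),
    ]
    print(np.round(fit_affine_from_region_pairs(pairs), 3))
```

Centroids discard the shape of each region, so a sketch like this recovers only a global alignment; capturing local deformation would require using the full region masks.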
