CV, LG · Jun 1, 2025

TIME: TabPFN-Integrated Multimodal Engine for Robust Tabular-Image Learning

arXiv:2506.00813v1 · 8 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This work addresses challenges in tabular-image multimodal learning, particularly for medical applications, by leveraging a tabular foundation model, but it is incremental as it builds on existing methods like TabPFN.

The paper tackles tabular-image multimodal learning by addressing two gaps: the lack of standardized pretrained representations for tabular data, and the handling of missing values in the tabular modality. The resulting framework is reported to consistently outperform baselines across datasets with both complete and incomplete tabular inputs.

Tabular-image multimodal learning, which integrates structured tabular data with imaging data, holds great promise for a variety of tasks, especially in medical applications. Yet, two key challenges remain: (1) the lack of a standardized, pretrained representation for tabular data, as is commonly available in vision and language domains; and (2) the difficulty of handling missing values in the tabular modality, which are common in real-world medical datasets. To address these issues, we propose the TabPFN-Integrated Multimodal Engine (TIME), a novel multimodal framework that builds on the recently introduced tabular foundation model, TabPFN. TIME leverages TabPFN as a frozen tabular encoder to generate robust, strong embeddings that are naturally resilient to missing data, and combines them with image features from pretrained vision backbones. We explore a range of fusion strategies and tabular encoders, and evaluate our approach on both natural and medical datasets. Extensive experiments demonstrate that TIME consistently outperforms competitive baselines across both complete and incomplete tabular inputs, underscoring its practical value in real-world multimodal learning scenarios.
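The abstract describes combining embeddings from a frozen tabular encoder with image features from a pretrained vision backbone. As a rough illustration only, the sketch below shows concatenation fusion followed by a linear classification head. The abstract says several fusion strategies were explored but does not name them here, so concatenation is an assumption. The vectors are hand-written stand-ins for encoder outputs, and the names `concat_fusion` and `linear_head` are hypothetical, not taken from the paper or from the TabPFN library.

```python
import math
from typing import List, Sequence


def concat_fusion(tab_emb: Sequence[float], img_emb: Sequence[float]) -> List[float]:
    """Fuse two modality embeddings by concatenation: [tabular ; image]."""
    return list(tab_emb) + list(img_emb)


def linear_head(
    fused: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Apply a linear layer and a numerically stable softmax to a fused vector."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    max_logit = max(logits)  # subtract the max before exponentiating
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-in embeddings (assumption): in practice these would come from a
# frozen tabular encoder and a pretrained vision backbone respectively.
tab_emb = [0.5, -1.0, 0.25]
img_emb = [1.0, 0.0]

fused = concat_fusion(tab_emb, img_emb)

# Toy two-class head with fixed weights; a real head would be trained.
weights = [[0.1] * 5, [-0.1] * 5]
bias = [0.0, 0.0]
probs = linear_head(fused, weights, bias)
```

In this setup only the head would carry trainable parameters, since the abstract describes the tabular encoder as frozen. Whether the vision backbone is also frozen is not stated in the abstract.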

