LGAug 16, 2025

M3OOD: Automatic Selection of Multimodal OOD Detectors

arXiv:2508.11936v12 citationsh-index: 12
Originality Incremental advance
AI Analysis

This addresses the challenge of OOD robustness in multimodal machine learning systems, offering an incremental improvement by automating detector selection to handle diverse distribution shifts.

The paper tackles the problem of automatically selecting out-of-distribution (OOD) detectors for multimodal data by introducing M3OOD, a meta-learning framework that recommends suitable detectors based on historical performance, resulting in consistent outperformance over 10 baselines across 12 test scenarios with minimal computational overhead.

Out-of-distribution (OOD) robustness is a critical challenge for modern machine learning systems, particularly as they increasingly operate in multimodal settings involving inputs like video, audio, and sensor data. Currently, many OOD detection methods have been proposed, each with different designs targeting various distribution shifts. A single OOD detector may not prevail across all the scenarios; therefore, how can we automatically select an ideal OOD detection model for different distribution shifts? Due to the inherent unsupervised nature of the OOD detection task, it is difficult to predict model performance and find a universally Best model. Also, systematically comparing models on the new unseen data is costly or even impractical. To address this challenge, we introduce M3OOD, a meta-learning-based framework for OOD detector selection in multimodal settings. Meta learning offers a solution by learning from historical model behaviors, enabling rapid adaptation to new data distribution shifts with minimal supervision. Our approach combines multimodal embeddings with handcrafted meta-features that capture distributional and cross-modal characteristics to represent datasets. By leveraging historical performance across diverse multimodal benchmarks, M3OOD can recommend suitable detectors for a new data distribution shift. Experimental evaluation demonstrates that M3OOD consistently outperforms 10 competitive baselines across 12 test scenarios with minimal computational overhead.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes