CV AI LG | Jan 23, 2025

Towards Robust Multimodal Open-set Test-time Adaptation via Adaptive Entropy-aware Optimization

arXiv:2501.13924v117.411 citations | h-index: 7 | Has Code | ICLR

Originality Highly original

AI Analysis

This work addresses a challenging and practically relevant problem: real-world AI systems must adapt to dynamic, multimodal environments that contain unknown classes. It extends open-set test-time adaptation beyond existing unimodal methods.

The paper tackles the problem of multimodal open-set test-time adaptation (MM-OSTTA), where models must adapt online to unlabeled target data with unknown classes across multiple modalities, by proposing the Adaptive Entropy-aware Optimization (AEO) framework. The result is a new benchmark and strong performance in various domain shift scenarios, including long-term and continual settings.

Test-time adaptation (TTA) has demonstrated significant potential in addressing distribution shifts between training and testing data. Open-set test-time adaptation (OSTTA) aims to adapt a source pre-trained model online to an unlabeled target domain that contains unknown classes. This task becomes more challenging when multiple modalities are involved. Existing methods have primarily focused on unimodal OSTTA, often filtering out low-confidence samples without addressing the complexities of multimodal data. In this work, we present Adaptive Entropy-aware Optimization (AEO), a novel framework specifically designed to tackle Multimodal Open-set Test-time Adaptation (MM-OSTTA) for the first time. Our analysis shows that the entropy difference between known and unknown samples in the target domain strongly correlates with MM-OSTTA performance. To leverage this, we propose two key components: Unknown-aware Adaptive Entropy Optimization (UAE) and Adaptive Modality Prediction Discrepancy Optimization (AMP). These components enhance the model's ability to distinguish unknown-class samples during online adaptation by amplifying the entropy difference between known and unknown samples. To thoroughly evaluate our proposed methods in the MM-OSTTA setting, we establish a new benchmark derived from existing datasets. This benchmark includes two downstream tasks and incorporates five modalities. Extensive experiments across various domain shift situations demonstrate the efficacy and versatility of the AEO framework. Additionally, we highlight the strong performance of AEO in long-term and continual MM-OSTTA settings, both of which are challenging and highly relevant to real-world applications. Our source code is available at https://github.com/donghao51/AEO.
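The quantity the abstract refers to, the entropy difference between known and unknown samples, can be illustrated with a minimal sketch. This is not the AEO objective or the authors' code (see the linked repository for that); it only computes the Shannon entropy of softmax predictions and the gap between the mean entropies of the two groups. The function names, the toy logits, and the `is_unknown` mask are assumptions made for illustration. In a real test-time setting such labels are unavailable, so a gap like this could only be measured for analysis on a labeled evaluation split.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with max subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def prediction_entropy(logits):
    """Shannon entropy (in nats) of the softmax prediction for each sample."""
    probs = softmax(logits)
    # Small epsilon avoids log(0) for saturated predictions.
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)


def entropy_gap(logits, is_unknown):
    """Mean entropy of unknown-class samples minus mean entropy of known ones.

    `is_unknown` is a hypothetical boolean mask used only for analysis.
    """
    entropy = prediction_entropy(logits)
    return entropy[is_unknown].mean() - entropy[~is_unknown].mean()


# Toy logits for a 3-class model: two confident (known) samples and
# two near-uniform (unknown) samples. Values are made up.
logits = np.array([
    [8.0, 0.0, 0.0],
    [0.0, 7.0, 0.0],
    [0.1, 0.0, 0.1],
    [0.0, 0.0, 0.0],
])
is_unknown = np.array([False, False, True, True])

gap = entropy_gap(logits, is_unknown)
print(f"entropy gap (unknown - known): {gap:.3f}")
```

A larger positive gap means that unknown-class samples receive visibly less confident predictions than known-class samples, which is what makes an entropy-based score usable for separating the two groups.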
