LG CV Oct 17, 2023

SODA: Robust Training of Test-Time Data Adaptors

Zige Wang, Yonggang Zhang, Zhen Fang, Long Lan, Wenjing Yang, Bo Han

arXiv:2310.11093v1 3.82 citations h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses privacy-preserving adaptation of deployed models under distribution shift, though it appears incremental over existing test-time data adaptation methods.

The paper tackles the problem of adapting models to test-time distribution shifts when model parameters are inaccessible. It proposes SODA, which uses high-confidence predictions as reliable labels to optimize a data adaptor and encourages the adaptor to preserve data information for low-confidence cases. Results show that SODA significantly enhances model performance without accessing model parameters.

Adapting models deployed to test distributions can mitigate the performance degradation caused by distribution shifts. However, privacy concerns may render model parameters inaccessible. One promising approach involves utilizing zeroth-order optimization (ZOO) to train a data adaptor to adapt the test data to fit the deployed models. Nevertheless, the data adaptor trained with ZOO typically brings restricted improvements due to the potential corruption of data features caused by the data adaptor. To address this issue, we revisit ZOO in the context of test-time data adaptation. We find that the issue directly stems from the unreliable estimation of the gradients used to optimize the data adaptor, which is inherently due to the unreliable nature of the pseudo-labels assigned to the test data. Based on this observation, we propose pseudo-label-robust data adaptation (SODA) to improve the performance of data adaptation. Specifically, SODA leverages high-confidence predicted labels as reliable labels to optimize the data adaptor with ZOO for label prediction. For data with low-confidence predictions, SODA encourages the adaptor to preserve data information to mitigate data corruption. Empirical results indicate that SODA can significantly enhance the performance of deployed models in the presence of distribution shifts without requiring access to model parameters.
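The two ingredients named in the abstract, a zeroth-order gradient estimate and a confidence-based split of the pseudo-labels, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the two-point random-direction estimator, the 0.9 threshold, and the squared-error term used as a stand-in for "preserving data information" are all assumptions made for illustration; `black_box` and `adapt` are hypothetical callables supplied by the caller.

```python
import numpy as np


def zo_gradient(loss_fn, theta, rng, num_queries=20, mu=1e-2):
    """Estimate d loss / d theta using only loss queries (no backpropagation).

    Two-point random-direction estimator, one common ZOO variant (assumed here).
    """
    grad = np.zeros_like(theta)
    for _ in range(num_queries):
        u = rng.standard_normal(theta.shape)  # random probing direction
        delta = loss_fn(theta + mu * u) - loss_fn(theta - mu * u)
        grad += (delta / (2.0 * mu)) * u      # directional derivative times direction
    return grad / num_queries


def split_by_confidence(probs, threshold=0.9):
    """Return pseudo-labels and a mask marking high-confidence predictions.

    The threshold value is an illustrative assumption.
    """
    labels = probs.argmax(axis=1)
    reliable = probs.max(axis=1) >= threshold
    return labels, reliable


def soda_style_loss(theta, x, labels, reliable, black_box, adapt, alpha=1.0):
    """Loss queried by the zeroth-order optimizer.

    black_box(x) -> class probabilities; only its outputs are used.
    adapt(theta, x) -> adapted inputs (hypothetical adaptor).
    """
    x_adapted = adapt(theta, x)
    probs = black_box(x_adapted)
    loss = 0.0
    idx = np.flatnonzero(reliable)
    if idx.size > 0:
        # Cross-entropy against high-confidence pseudo-labels.
        loss += -np.mean(np.log(probs[idx, labels[idx]] + 1e-12))
    rest = np.flatnonzero(~reliable)
    if rest.size > 0:
        # Stand-in for information preservation (assumption): keep the
        # adapted low-confidence inputs close to the originals.
        loss += alpha * np.mean((x_adapted[rest] - x[rest]) ** 2)
    return loss
```

A training step under these assumptions would compute `labels, reliable = split_by_confidence(black_box(x))` once, then repeatedly update `theta -= lr * zo_gradient(lambda t: soda_style_loss(t, x, labels, reliable, black_box, adapt), theta, rng)`. Only forward queries to the deployed model are needed, which matches the setting in which model parameters are inaccessible.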

