LGAug 15, 2024Code
DATTA: Domain Diversity Aware Test-Time Adaptation for Dynamic Domain Shift Data StreamsChuyang Ye, Dongyan Wei, Zhendong Liu et al.
Test-Time Adaptation (TTA) addresses domain shifts between training and testing. However, existing methods assume a homogeneous target domain (e.g., single domain) at any given time. They fail to handle the dynamic nature of real-world data, where single-domain and multiple-domain distributions change over time. We identify that performance drops in multiple-domain scenarios are caused by batch normalization errors and gradient conflicts, which hinder adaptation. To solve these challenges, we propose Domain Diversity Adaptive Test-Time Adaptation (DATTA), the first approach to handle TTA under dynamic domain shift data streams. It is guided by a novel domain-diversity score. DATTA has three key components: a domain-diversity discriminator to recognize single- and multiple-domain patterns, domain-diversity adaptive batch normalization to combine source and test-time statistics, and domain-diversity adaptive fine-tuning to resolve gradient conflicts. Extensive experiments show that DATTA significantly outperforms state-of-the-art methods by up to 13%. Code is available at https://github.com/DYW77/DATTA.
CVNov 29, 2024
A Visual-inertial Localization Algorithm using Opportunistic Visual Beacons and Dead-Reckoning for GNSS-Denied Large-scale ApplicationsLiqiang Zhang, Ye Tian, Dongyan Wei
With the development of smart cities, the demand for continuous pedestrian navigation in large-scale urban environments has significantly increased. While global navigation satellite systems (GNSS) provide low-cost and reliable positioning services, they are often hindered in complex urban canyon environments. Thus, exploring opportunistic signals for positioning in urban areas has become a key solution. Augmented reality (AR) allows pedestrians to acquire real-time visual information. Accordingly, we propose a low-cost visual-inertial positioning solution. This method comprises a lightweight multi-scale group convolution (MSGC)-based visual place recognition (VPR) neural network, a pedestrian dead reckoning (PDR) algorithm, and a visual/inertial fusion approach based on a Kalman filter with gross error suppression. The VPR serves as a conditional observation to the Kalman filter, effectively correcting the errors accumulated through the PDR method. This enables the entire algorithm to ensure the reliability of long-term positioning in GNSS-denied areas. Extensive experimental results demonstrate that our method maintains stable positioning during large-scale movements. Compared to the lightweight MobileNetV3-based VPR method, our proposed VPR solution improves Recall@1 by at least 3\% on two public datasets while reducing the number of parameters by 63.37\%. It also achieves performance that is comparable to the VGG16-based method. The VPR-PDR algorithm improves localization accuracy by more than 40\% compared to the original PDR.
LGJun 8, 2024
Discover Your Neighbors: Advanced Stable Test-Time Adaptation in Dynamic WorldQinting Jiang, Chuyang Ye, Dongyan Wei et al.
Despite progress, deep neural networks still suffer performance declines under distribution shifts between training and test domains, leading to a substantial decrease in Quality of Experience (QoE) for multimedia applications. Existing test-time adaptation (TTA) methods are challenged by dynamic, multiple test distributions within batches. This work provides a new perspective on analyzing batch normalization techniques through class-related and class-irrelevant features, our observations reveal combining source and test batch normalization statistics robustly characterizes target distributions. However, test statistics must have high similarity. We thus propose Discover Your Neighbours (DYN), the first backward-free approach specialized for dynamic TTA. The core innovation is identifying similar samples via instance normalization statistics and clustering into groups which provides consistent class-irrelevant representations. Specifically, Our DYN consists of layer-wise instance statistics clustering (LISC) and cluster-aware batch normalization (CABN). In LISC, we perform layer-wise clustering of approximate feature samples at each BN layer by calculating the cosine similarity of instance normalization statistics across the batch. CABN then aggregates SBN and TCN statistics to collaboratively characterize the target distribution, enabling more robust representations. Experimental results validate DYN's robustness and effectiveness, demonstrating maintained performance under dynamic data stream patterns.