CV AI | Jan 2, 2025

BatStyler: Advancing Multi-category Style Generation for Source-free Domain Generalization

Xiusheng Xu, Lei Qi, Jingyang Zhou, Xin Geng

arXiv:2501.01109v1 | 10.24 citations | h-index: 5 | Has Code | IEEE Transactions on Circuits and Systems for Video Technology (Print)

Originality: Incremental advance

AI Analysis

This work addresses a practical limitation of domain generalization in computer vision: settings where source data is unavailable and the label space contains many object categories.

The paper tackles the problem of source-free domain generalization in multi-category scenarios, where existing methods perform poorly, and proposes BatStyler to improve style synthesis efficiency and diversity, achieving state-of-the-art results on multi-category datasets.

Source-Free Domain Generalization (SFDG) aims to develop a model that performs well on unseen domains without relying on any source domains. However, its implementation remains constrained by the unavailability of training data. Research on SFDG focuses on knowledge transfer from multi-modal models and on style synthesis in the joint space of multiple modalities, thus eliminating the dependency on source domain images. Existing works, however, primarily target multi-domain, few-category configurations, and their performance on multi-domain, multi-category configurations is relatively poor. In addition, the efficiency of style synthesis deteriorates in multi-category scenarios. How to efficiently synthesize sufficiently diverse data and apply it to multi-category configurations is therefore a direction with greater practical value. In this paper, we propose BatStyler, a method that improves the capability of style synthesis in multi-category scenarios. BatStyler consists of two modules: Coarse Semantic Generation and Uniform Style Generation. The Coarse Semantic Generation module extracts coarse-grained semantics to prevent the space for style diversity learning from being compressed in multi-category configurations, while the Uniform Style Generation module provides a template of styles that are uniformly distributed in space and enables parallel training. Extensive experiments demonstrate that our method achieves comparable performance on few-category datasets while surpassing state-of-the-art methods on multi-category datasets.
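The abstract does not say how the uniformly distributed style template is built. As a purely illustrative sketch, one standard way to place K unit vectors as far apart from each other as possible is a simplex equiangular tight frame, in which every pair of vectors has cosine similarity -1/(K-1). The names `simplex_etf`, `num_styles`, and `dim` below are hypothetical and are not taken from the paper or its code; the paper's actual construction may differ.

```python
import math


def simplex_etf(num_styles, dim):
    """Return num_styles unit vectors in R^dim whose pairwise cosine
    similarity is exactly -1/(num_styles - 1).

    Illustrative assumption: this is one generic way to obtain a
    uniformly spread template, not necessarily BatStyler's method.
    """
    if num_styles < 2:
        raise ValueError("need at least two styles")
    if dim < num_styles:
        raise ValueError("this simple construction needs dim >= num_styles")

    scale = math.sqrt(num_styles / (num_styles - 1))
    vectors = []
    for i in range(num_styles):
        v = [0.0] * dim  # coordinates beyond num_styles stay zero
        for j in range(num_styles):
            # Centered one-hot vector, rescaled to unit length.
            v[j] = scale * ((1.0 if i == j else 0.0) - 1.0 / num_styles)
        vectors.append(v)
    return vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Example: a fixed template of 4 style directions in an 8-dimensional space.
templates = simplex_etf(num_styles=4, dim=8)
```

Because such a template is fixed in advance, each style target is known without reference to the others, which is one plausible reason a template-based design could be trained in parallel rather than one style at a time.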
