CV, LG | Sep 12, 2023

Towards Reliable Domain Generalization: A New Dataset and Evaluations

arXiv:2309.06142v1 | h-index: 35
Originality: Synthesis-oriented
AI Analysis

This work targets machine learning researchers working on domain generalization. It is incremental: it contributes a new dataset and a critique of evaluation practice without proposing a novel method.

The authors tackle domain generalization in deep neural networks by introducing a new dataset for handwritten Chinese character recognition and evaluating existing methods on it. They find that current approaches perform poorly on this dataset and advocate dynamic evaluation protocols.

Distribution shifts are ubiquitous in the real world. However, deep neural networks (DNNs) are easily biased towards the training set, which causes severe performance degradation when they receive out-of-distribution data. The domain generalization (DG) literature has studied many methods for training models that generalize under various distribution shifts. However, the recent DomainBed and WILDS benchmarks challenged the effectiveness of these methods. To address these problems in existing research, we propose a new domain generalization task for handwritten Chinese character recognition (HCCR) to enrich the application scenarios of DG method research. We evaluate eighteen DG methods on the proposed PaHCC (Printed and Handwritten Chinese Characters) dataset and show that the performance of existing methods on this dataset is still unsatisfactory. In addition, under a designed dynamic DG setting, we reveal more properties of DG methods and argue that relying only on the leave-one-domain-out protocol is unreliable. We advocate that researchers in the DG community refer to the dynamic performance of methods for more comprehensive and reliable evaluation. Our dataset and evaluations bring new perspectives to the community for more substantial progress. We will make our dataset public when the article is published to facilitate the study of domain generalization.
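
For readers unfamiliar with the leave-one-domain-out protocol mentioned above, the following is a minimal sketch of how its splits are usually formed: each domain is held out once as the test domain while the model trains on the rest, and the per-domain accuracies are then averaged. The domain names and the helper functions `leave_one_domain_out` and `average_accuracy` are hypothetical, chosen for illustration only; they are not taken from the paper or from the PaHCC dataset.

```python
from typing import Dict, Iterator, List, Tuple


def leave_one_domain_out(domains: List[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training_domains, held_out_domain) pairs, one per domain."""
    for held_out in domains:
        # Train on every domain except the one held out for testing.
        train = [d for d in domains if d != held_out]
        yield train, held_out


def average_accuracy(per_domain_acc: Dict[str, float]) -> float:
    """Average the test accuracy over all held-out domains."""
    return sum(per_domain_acc.values()) / len(per_domain_acc)


# Hypothetical domain names, used only to show the shape of the splits.
domains = ["domain_a", "domain_b", "domain_c"]
splits = list(leave_one_domain_out(domains))

for train, test in splits:
    print(f"train on {train}, test on {test}")
```

A single averaged number of this kind is what the abstract argues is insufficient on its own; the dynamic DG setting it proposes is not specified here, so it is not sketched.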

