Gated Domain-Invariant Feature Disentanglement for Domain Generalizable Object Detection
This work addresses domain generalization in object detection, which is crucial for deploying detectors in varied real-world environments; the contribution is incremental, as it builds on existing disentanglement techniques.
The paper tackles domain generalizable object detection by proposing a disentangled representation learning method in which a channel gate module separates domain-invariant from domain-specific features along the channel dimension, and it reports state-of-the-art performance.
For Domain Generalizable Object Detection (DGOD), Disentangled Representation Learning (DRL) is highly beneficial because it explicitly disentangles Domain-Invariant Representations (DIR) from Domain-Specific Representations (DSR). Because the domain category is an attribute of the input data, it should be feasible for a network to fit a specific mapping that projects DSR into feature channels exclusive to domain-specific information, so that a much cleaner disentanglement of DIR from DSR can be achieved simply along the channel dimension. Inspired by this idea, we propose a novel DRL method for DGOD, termed Gated Domain-Invariant Feature Disentanglement (GDIFD). In GDIFD, a Channel Gate Module (CGM) learns to output channel gate signals close to either 0 or 1, which mask out the channels that exclusively carry domain-specific information useful for domain recognition. With the proposed GDIFD, the backbone in our framework can fit the desired mapping easily, which enables the channel-wise disentanglement. In experiments, we demonstrate that our approach is highly effective and achieves state-of-the-art DGOD performance.
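The channel-wise gating idea described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the CGM computes its gate signals, so the temperature-sharpened sigmoid, the function names `channel_gates` and `split_channels`, and the toy values below are all assumptions chosen only to show how near-binary gates can separate channels into an invariant part and a domain-specific part.

```python
import math


def channel_gates(logits, temperature=0.1):
    """Map per-channel logits to gate values in (0, 1).

    Assumption: a sigmoid with a small temperature stands in for the
    Channel Gate Module; a low temperature pushes gates toward 0 or 1.
    """
    gates = []
    for z in logits:
        x = z / temperature
        # Numerically stable sigmoid: never exponentiate a large positive value.
        if x >= 0:
            gates.append(1.0 / (1.0 + math.exp(-x)))
        else:
            e = math.exp(x)
            gates.append(e / (1.0 + e))
    return gates


def split_channels(features, gates):
    """Split per-channel features into gated (kept) and masked-out parts."""
    invariant = [g * f for g, f in zip(gates, features)]          # gate near 1: kept
    specific = [(1.0 - g) * f for g, f in zip(gates, features)]   # gate near 0: masked out
    return invariant, specific


# Toy example with 4 channels (hypothetical values).
logits = [2.0, -2.0, 1.5, -3.0]
features = [0.8, 1.2, -0.5, 2.0]

gates = channel_gates(logits)
invariant, specific = split_channels(features, gates)
```

In this toy setup, channels 0 and 2 receive gates near 1 and pass through as domain-invariant features, while channels 1 and 3 receive gates near 0 and are routed to the domain-specific part. The two parts always sum back to the original features, so the split is a pure partition along the channel dimension.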