Learning Attributes Equals Multi-Source Domain Generalization
This work addresses the need for cross-category generalizable attribute detectors in computer vision, which is an incremental improvement over existing methods.
The paper tackles the under-explored problem of robustly detecting visual attributes across different categories, including unseen ones, by framing it as a multi-source domain generalization task. They validated their approach with extensive experiments on four datasets and three problems, achieving competitive results.
Attributes possess appealing properties and benefit many computer vision problems, such as object recognition, learning with humans in the loop, and image retrieval. Whereas the existing work mainly pursues utilizing attributes for various computer vision problems, we contend that the most basic problem---how to accurately and robustly detect attributes from images---has been left under explored. Especially, the existing work rarely explicitly tackles the need that attribute detectors should generalize well across different categories, including those previously unseen. Noting that this is analogous to the objective of multi-source domain generalization, if we treat each category as a domain, we provide a novel perspective to attribute detection and propose to gear the techniques in multi-source domain generalization for the purpose of learning cross-category generalizable attribute detectors. We validate our understanding and approach with extensive experiments on four challenging datasets and three different problems.