Tight Lower Bounds on Worst-Case Guarantees for Zero-Shot Learning with Attributes
This provides a foundational theoretical analysis for zero-shot learning, addressing the intrinsic difficulty of labeling novel classes without training data, which is incremental in establishing rigorous bounds.
The paper tackles the problem of zero-shot learning with attributes by developing the first non-trivial lower bound on worst-case error, which is tight and computable from the class-attribute matrix, showing it can predict confusion between classes in practice.
We develop a rigorous mathematical analysis of zero-shot learning with attributes. In this setting, the goal is to label novel classes with no training data, only detectors for attributes and a description of how those attributes are correlated with the target classes, called the class-attribute matrix. We develop the first non-trivial lower bound on the worst-case error of the best map from attributes to classes for this setting, even with perfect attribute detectors. The lower bound characterizes the theoretical intrinsic difficulty of the zero-shot problem based on the available information -- the class-attribute matrix -- and the bound is practically computable from it. Our lower bound is tight, as we show that we can always find a randomized map from attributes to classes whose expected error is upper bounded by the value of the lower bound. We show that our analysis can be predictive of how standard zero-shot methods behave in practice, including which classes will likely be confused with others.