Modeling Latent Variable Uncertainty for Loss-based Learning
This work addresses uncertainty in latent variables for weakly supervised learning. The contribution is incremental: it generalizes latent SVM by modeling that uncertainty and by allowing loss functions that depend on latent variables.
The paper tackles parameter estimation in weakly supervised learning by separating two tasks: modeling latent variable uncertainty during training, and prediction during testing. It assigns each task its own distribution and ties the two together with a loss-based dissimilarity coefficient. It reports improved performance on object detection and action detection using publicly available datasets.
We consider the problem of parameter estimation using weakly supervised datasets, where a training sample consists of the input and a partially specified annotation, which we refer to as the output. The missing information in the annotation is modeled using latent variables. Previous methods overburden a single distribution with two separate tasks: (i) modeling the uncertainty in the latent variables during training; and (ii) making accurate predictions for the output and the latent variables during testing. We propose a novel framework that separates the demands of the two tasks using two distributions: (i) a conditional distribution to model the uncertainty of the latent variables for a given input-output pair; and (ii) a delta distribution to predict the output and the latent variables for a given input. During learning, we encourage agreement between the two distributions by minimizing a loss-based dissimilarity coefficient. Our approach generalizes latent SVM in two important ways: (i) it models the uncertainty over latent variables instead of relying on a pointwise estimate; and (ii) it allows the use of loss functions that depend on latent variables, which greatly increases its applicability. We demonstrate the efficacy of our approach on two challenging problems---object detection and action detection---using publicly available datasets.
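To make the idea of agreement between the two distributions concrete, the following is a minimal sketch, not the authors' implementation. It assumes the dissimilarity coefficient takes Rao's form, that is, the expected loss between the two distributions minus a weighted sum of each distribution's expected loss with itself. The abstract does not state the formula, so that form, the weight `beta`, the 0-1 loss, and the toy latent values are all assumptions made for illustration. The sketch also ignores the output and the model parameters, and works with a small discrete latent space only.

```python
from itertools import product


def diversity(p, q, loss):
    """Expected loss between a sample drawn from p and a sample drawn from q.

    p and q are dicts mapping a latent value to its probability.
    """
    return sum(
        pa * qb * loss(a, b)
        for (a, pa), (b, qb) in product(p.items(), q.items())
    )


def dissimilarity(p, q, loss, beta=0.5):
    """Assumed Rao-style dissimilarity coefficient between p and q.

    beta weights the self-diversity of p against that of q (hypothetical default).
    """
    return (
        diversity(p, q, loss)
        - beta * diversity(p, p, loss)
        - (1.0 - beta) * diversity(q, q, loss)
    )


def delta(h):
    """Delta distribution: all probability mass on the single value h."""
    return {h: 1.0}


def zero_one(a, b):
    """Toy loss over latent values: 0 if they match, 1 otherwise."""
    return 0.0 if a == b else 1.0


# Hypothetical conditional distribution over three candidate latent values,
# e.g. three possible object locations for one weakly labelled image.
p_cond = {"left": 0.7, "centre": 0.2, "right": 0.1}

# Score every possible delta prediction against the conditional distribution.
scores = {h: dissimilarity(p_cond, delta(h), zero_one) for h in p_cond}
best = min(scores, key=scores.get)
```

Under these assumptions the self-diversity of a delta distribution is zero, so minimizing the coefficient over predictions amounts to choosing the latent value with the lowest expected loss under the conditional distribution. Because the loss takes latent values as arguments, the same code accepts any loss that depends on latent variables, which is the second generalization the abstract describes.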