ML LG

Oct 10, 2019

Learning from Indirect Observations

Yivan Zhang, Nontawat Charoenphakdee, Masashi Sugiyama

arXiv:1910.04394v14.13 citations

Has Code

Originality: Incremental advance

AI Analysis

This work addresses the scarcity of labeled data in machine learning by making it possible to combine several types of weak supervision. It is incremental in that it builds on existing weakly-supervised learning paradigms.

The paper tackles the problem of learning from diverse weak supervision signals, such as noisy, complementary, and coarse-grained labels, by proposing a probabilistic framework and a method based on maximum likelihood. It demonstrates practical usefulness in two novel settings: learning from multiclass label proportions and learning from coarse-grained labels.

Weakly-supervised learning is a paradigm for alleviating the scarcity of labeled data by leveraging lower-quality but larger-scale supervision signals. While existing work mainly focuses on utilizing a single type of weak supervision, we present a probabilistic framework, learning from indirect observations, for learning from a wide range of weak supervision in real-world problems, e.g., noisy labels, complementary labels, and coarse-grained labels. We propose a general method based on the maximum likelihood principle, which has desirable theoretical properties and can be straightforwardly implemented for deep neural networks. Concretely, a discriminative model for the true target is used to model the indirect observation, which is a random variable that depends entirely on the true target, either stochastically or deterministically. Maximizing the likelihood given the indirect observations then implicitly yields an estimator of the true target. Comprehensive experiments on two novel problem settings, learning from multiclass label proportions and learning from coarse-grained labels, illustrate the practical usefulness of our method and demonstrate how to integrate various sources of weak supervision.
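The likelihood idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a classifier that outputs scores over K true classes, and it assumes the link between the true target z and the indirect observation y is given as a known table `transition[z][y]` = p(y | z). The function names (`indirect_nll`, `complementary_transition`, `coarse_transition`) are hypothetical, and the uniform complementary-label model is an assumption chosen only for illustration. The model's prediction for the observation is p(y | x) = Σ_z p(z | x) p(y | z), and the loss is the negative log of that quantity.

```python
import math


def softmax(logits):
    """Turn one row of class scores into probabilities p(z | x)."""
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def indirect_nll(logits_batch, observations, transition):
    """Mean negative log-likelihood of the indirect observations.

    transition[z][y] is the ASSUMED probability p(y | z) of seeing
    observation y when the true target is z (hypothetical input).
    """
    total = 0.0
    for logits, y in zip(logits_batch, observations):
        p_true = softmax(logits)
        # Marginalize over the unobserved true target z.
        p_obs = sum(p_z * row[y] for p_z, row in zip(p_true, transition))
        total -= math.log(max(p_obs, 1e-12))  # clamp to avoid log(0)
    return total / len(observations)


def complementary_transition(num_classes):
    """Stochastic link (assumption): y is drawn uniformly from the
    classes that are NOT the true one."""
    off = 1.0 / (num_classes - 1)
    return [[0.0 if y == z else off for y in range(num_classes)]
            for z in range(num_classes)]


def coarse_transition(fine_to_coarse, num_coarse):
    """Deterministic link: each fine class maps to one coarse class."""
    return [[1.0 if c == coarse else 0.0 for c in range(num_coarse)]
            for coarse in fine_to_coarse]


# Usage: two examples with 3 true classes, observed via complementary labels.
logits_batch = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
loss = indirect_nll(logits_batch, [1, 2], complementary_transition(3))
print(f"complementary-label loss: {loss:.4f}")
```

Because the loss depends on the classifier only through its scores, the same computation could in principle be applied to the outputs of a neural network in an automatic-differentiation framework; minimizing it trains the model for the true target even though only indirect observations are seen.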
