Strength from Weakness: Fast Learning Using Weak Supervision
This work addresses data scarcity in machine learning by leveraging abundant weak labels to improve learning efficiency. It is a theoretical advance with practical implications.
The paper studies weakly supervised learning, where few strong labels and many weak labels are available. It shows that weak labels can accelerate the learning rate for the strong task from O(1/√n) to O(1/n), where n is the number of strongly labeled data points, and it validates this result empirically across a range of tasks.
We study generalization properties of weakly supervised learning, that is, learning where only a few "strong" labels (the actual target of our prediction) are present but many more "weak" labels are available. In particular, we show that having access to weak labels can significantly accelerate the learning rate for the strong task to the fast rate of $\mathcal{O}(\nicefrac{1}{n})$, where $n$ denotes the number of strongly labeled data points. This acceleration can happen even if the strongly labeled data by itself admits only the slower $\mathcal{O}(\nicefrac{1}{\sqrt{n}})$ rate. The actual acceleration depends continuously on the number of weak labels available and on the relation between the two tasks. Our theoretical results are reflected empirically across a range of tasks and illustrate how weak labels speed up learning on the strong task.
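To make the gap between the two rates concrete, the following minimal sketch computes how many strongly labeled points would be needed to reach a target excess risk under each rate. It is an illustration only: the function name `samples_needed` and the constant `c` are hypothetical, and the bounds are assumed to take the idealized forms c/n and c/√n with the same constant, which the abstract does not specify.

```python
import math


def samples_needed(eps: float, rate: str, c: float = 1.0) -> int:
    """Smallest n for which an idealized bound drops to eps or below.

    Assumption (not from the paper): the bound is exactly c / n for the
    fast rate and c / sqrt(n) for the slow rate, with a shared constant c.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    if rate == "fast":
        # c / n <= eps  <=>  n >= c / eps
        return math.ceil(c / eps)
    if rate == "slow":
        # c / sqrt(n) <= eps  <=>  n >= (c / eps) ** 2
        return math.ceil((c / eps) ** 2)
    raise ValueError("rate must be 'fast' or 'slow'")


if __name__ == "__main__":
    # Compare the strong-label budget at a few target accuracies.
    for eps in (0.5, 0.25, 0.125):
        fast = samples_needed(eps, "fast")
        slow = samples_needed(eps, "slow")
        print(f"eps={eps}: fast rate n={fast}, slow rate n={slow}")
```

Under these assumptions the slow rate needs quadratically more strong labels than the fast rate for the same target, which is why moving between the two regimes matters most when strong labels are scarce.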