Active Learning with Weak Supervision for Gaussian Processes
This work addresses data annotation costs for machine learning practitioners, but it is incremental, as it builds on the existing BALD objective for Gaussian Processes.
The paper tackles the problem of costly data annotation in supervised learning by proposing an active learning algorithm that selects both which observation to annotate and the precision of the annotation. Assuming that low-precision annotations are cheaper to obtain, this allows the model to explore a larger part of the input space with the same budget. The authors empirically demonstrate the gains of adjusting the annotation precision in the active learning loop.
Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate those observations that are likely to give the most gain in model performance. We propose an active learning algorithm that, in addition to selecting which observation to annotate, selects the precision of the annotation that is acquired. Assuming that annotations with low precision are cheaper to obtain, this allows the model to explore a larger part of the input space, with the same annotation budget. We build our acquisition function on the previously proposed BALD objective for Gaussian Processes, and empirically demonstrate the gains of being able to adjust the annotation precision in the active learning loop.
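The idea of jointly choosing an observation and an annotation precision can be illustrated with a minimal sketch. It is not the authors' acquisition function, which the text above does not spell out. It assumes GP regression with Gaussian annotation noise, an RBF kernel with fixed hyperparameters, a small discrete set of noise levels, and a known cost per noise level. Under those assumptions, the mutual information between a noisy annotation and the latent function value at x is 0.5 * log(1 + s²(x) / σ²), where s²(x) is the posterior variance. The sketch scores each (observation, precision) pair by information per unit cost, which is an illustrative choice. All function and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """RBF kernel between two 1-D arrays of inputs (hyperparameters are assumed fixed)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def posterior_variance(x_train, noise_var_train, x_pool, variance=1.0):
    """Posterior variance of the latent function at x_pool.

    Each training annotation has its own noise variance (heteroscedastic noise).
    For GP regression this quantity does not depend on the observed labels.
    """
    K = rbf_kernel(x_train, x_train, variance=variance) + np.diag(noise_var_train)
    Ks = rbf_kernel(x_train, x_pool, variance=variance)
    v = np.linalg.solve(K, Ks)
    s2 = variance - np.sum(Ks * v, axis=0)
    return np.clip(s2, 0.0, None)  # guard against small negative values from round-off

def select_query(x_train, noise_var_train, x_pool, noise_levels, costs):
    """Pick the (pool index, noise-level index) with the highest information per cost.

    Assumption: information is the Gaussian mutual information
    0.5 * log(1 + s2 / noise) between the annotation and the latent value.
    """
    s2 = posterior_variance(x_train, noise_var_train, x_pool)
    info = 0.5 * np.log1p(s2[:, None] / noise_levels[None, :])
    score = info / costs[None, :]
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(i), int(j), score

# Hypothetical setup: one precise annotation at x = 0, two candidate inputs,
# and a choice between a precise, expensive label and a noisy, cheap one.
x_train = np.array([0.0])
noise_var_train = np.array([0.01])
x_pool = np.array([0.0, 3.0])
noise_levels = np.array([0.01, 1.0])
costs = np.array([1.0, 0.1])

i, j, score = select_query(x_train, noise_var_train, x_pool, noise_levels, costs)
print(f"query x = {x_pool[i]}, noise variance = {noise_levels[j]}")
```

In an active learning loop, the selected pair would be annotated at the chosen precision, appended to the training set along with its noise variance, and its cost subtracted from the remaining budget before the next selection.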